Software Engineer, Machine Learning Infrastructure
Tesla
Job Overview
As a Machine Learning Software Engineer on the Dojo team at Tesla, you will play a pivotal role in enhancing the capabilities of our Dojo training accelerator. This position involves close collaboration with ML researchers, compiler engineers, and hardware engineers to address the unique challenges of running AI workloads on ML training accelerators. Your expertise will be crucial in optimizing and scaling our neural network training infrastructure.
Key Responsibilities
- Collaborate with machine learning researchers and engineers to run Full Self-Driving (FSD) models on our in-house ML training accelerator.
- Profile the performance of training workloads in our cluster, identify bottlenecks within and between CPU and Dojo code execution, and optimize throughput and scalability within and across nodes to reduce convergence time.
- Coordinate with the team managing the hardware cluster to maintain high availability and job throughput for machine learning tasks.
- Integrate the training software into our continuous integration cluster to support metrics persistence across experiments, weekly/nightly neural network builds, and other unit/throughput tests.
Required Skills and Experience
- Degree in Engineering or Computer Science, or equivalent experience with evidence of exceptional ability.
- Practical experience programming in Python and/or C++.
- Experience working with training frameworks, ideally PyTorch.
- Proficiency in system-level software, particularly hardware-software interactions and resource utilization.
- Understanding of modern machine learning concepts and state-of-the-art deep learning.
- Experience in profiling and optimizing CPU-accelerator interactions (pipelining compute/transfers, etc.).
- DevOps experience, particularly in managing clusters of training nodes and filesystems for large amounts of training data.
Benefits
As a full-time Tesla employee, you are eligible for a comprehensive benefits package starting from day one, including:
- Aetna PPO and HSA plans with $0 payroll deduction options.
- Family-building, fertility, adoption, and surrogacy benefits.
- Dental and vision plans with options for $0 paycheck contribution.
- Company-paid HSA contributions, plus company-paid life, AD&D, short-term disability, and long-term disability insurance.
- 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits.
- Employee Assistance Program, sick and vacation time, and paid holidays.
- Back-up childcare and parenting support resources.
- Voluntary benefits including critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance.
- Weight Loss and Tobacco Cessation Programs, Tesla Babies program, commuter benefits, and employee discounts and perks.
Compensation
The expected annual salary for this position ranges from €104,000 to €360,000, depending on experience and level, plus cash and stock awards and benefits. The total compensation package may vary based on market location, job-related knowledge, skills, and experience.
Join us at Tesla and contribute to building a sustainable future by enhancing our machine learning infrastructure and accelerating the world's transition to sustainable energy.
Benefits Extracted with AI
- Disability insurance
- Commuter benefits
- 401(k)
- Aetna PPO and HSA plans
- Family-building benefits
- Dental and vision plans
- Healthcare and Dependent Care FSAs
- LGBTQ+ care concierge services
- Employee Stock Purchase Plans
- Life and disability insurance
- Employee Assistance Program
- Sick and Vacation time
- Back-up childcare
- Voluntary benefits
- Weight Loss and Tobacco Cessation Programs
- Tesla Babies program
- Employee discounts and perks