Senior Software Engineer - AI/ML, AWS Neuron Distributed Training
Amazon Web Services (AWS)
Job Overview
Amazon Web Services (AWS) is seeking a Senior Software Engineer to join the Machine Learning Applications (ML Apps) team, focusing on AWS Neuron for distributed training. The role involves building, delivering, and maintaining complex products that affect millions of people worldwide, and designing fault-tolerant systems that operate at massive scale in the AWS Cloud.
Responsibilities
- Lead the development of distributed training support in PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks.
- Tune ML models to ensure high performance and efficiency on AWS Trainium and Inferentia silicon and on Trn1 and Inf1 servers.
- Collaborate with chip architects, compiler engineers, and runtime engineers to build and optimize distributed training solutions.
Qualifications
Basic Qualifications
- 3+ years of non-internship professional software development experience.
- Experience in design or architecture of new and existing systems.
- Proficiency in at least one programming language.
- Deep learning industry experience.
Preferred Qualifications
- Experience with full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
- Bachelor's degree in computer science or equivalent.
- Expertise in PyTorch, JAX, TensorFlow, and distributed training libraries and frameworks.
- Experience in end-to-end model training.
About the Team
The ML Apps team at AWS Neuron works closely with a range of disciplines, including silicon engineering, hardware design and verification, software, and operations. The team is dedicated to supporting new members and promoting knowledge sharing and mentorship, and it is committed to providing a work environment that balances professional challenges with personal life.
Benefits
- Flexible working hours to support work-life balance.
- Opportunities for mentorship and career growth within the company.
- Inclusive team culture with a focus on employee well-being.