Senior Software Engineer - AI/ML, AWS Neuron Distributed Training

Amazon Web Services (AWS)

Job Overview

Amazon Web Services (AWS) is seeking a Senior Software Engineer to join the Machine Learning Applications (ML Apps) team, focusing on AWS Neuron for distributed training. The role involves building, delivering, and maintaining complex products that impact millions of people globally, and designing fault-tolerant systems that operate at massive scale in the AWS Cloud.

Responsibilities

  • Lead the development of distributed training support in PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks.
  • Tune ML models to ensure high performance and efficiency on AWS Trainium and Inferentia silicon and on Trn1 and Inf1 servers.
  • Collaborate with chip architects, compiler engineers, and runtime engineers to build and optimize distributed training solutions.

Qualifications

Basic Qualifications

  • 3+ years of non-internship professional software development experience.
  • Experience in design or architecture of new and existing systems.
  • Proficiency in programming with at least one software programming language.
  • Deep learning industry experience.

Preferred Qualifications

  • Experience with full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
  • Bachelor's degree in computer science or equivalent.
  • Expertise in PyTorch, JAX, TensorFlow, and distributed libraries and frameworks.
  • Experience in end-to-end model training.

About the Team

The ML Apps team at AWS Neuron works closely with a range of disciplines, including silicon engineering, hardware design and verification, software, and operations. The team is dedicated to supporting new members and promoting knowledge sharing and mentorship, and it is committed to a work environment that balances professional challenges with personal life.

Benefits

  • Flexible working hours to support work-life balance.
  • Opportunities for mentorship and career growth within the company.
  • Inclusive team culture with a focus on employee well-being.
