Senior Software Engineer, Machine Learning Infrastructure

Scale AI

Job Overview

Scale AI is seeking a Senior Software Engineer to join our Machine Learning Infrastructure team. This role involves building and optimizing our Training Platform, working closely with Machine Learning researchers to enhance experimentation throughput. The ideal candidate will have a strong foundation in machine learning, backend system design, and prior experience in ML Infrastructure.

Key Responsibilities

  • Develop highly available, observable, performant, and cost-effective APIs for model training.
  • Participate in the team’s on-call process to ensure service availability.
  • Manage projects end-to-end, from requirements gathering to implementation, in a collaborative environment.
  • Make informed decisions on build vs. buy tradeoffs, focusing on cost efficiency.

Required Skills and Experience

  • 4+ years of experience in building machine learning training pipelines or inference services in production.
  • Proficiency with distributed training frameworks and techniques such as DeepSpeed and FSDP.
  • Experience in building, deploying, and monitoring complex microservice architectures.
  • Strong skills in Python, Docker, Kubernetes, and Infrastructure as Code (e.g., Terraform).

Nice to Have

  • Experience with LLM inference latency optimization techniques, such as kernel fusion, quantization, and dynamic batching.
  • Familiarity with cloud technology stacks like AWS or GCP.

Compensation and Benefits

  • Base salary range: $160,000–$225,600 USD
  • Equity-based compensation, subject to Board approval
  • Comprehensive health, dental, and vision coverage
  • Retirement benefits
  • Learning and development stipend
  • Generous PTO
  • Potential additional benefits such as a commuter stipend

About Scale AI

At Scale, we are committed to accelerating the development of AI applications. Our mission is to drive the transition from traditional software to AI across industries, transforming how organizations build and deploy AI. We power the world's most advanced LLMs, generative models, and computer vision models, and are trusted by companies like OpenAI, Meta, and Microsoft.

Scale AI is an affirmative action employer and an inclusive and equal opportunity workplace. We are committed to providing reasonable accommodations to applicants with disabilities. If you need assistance, please contact us.

Join us in our mission to unlock the value of AI and transform industries worldwide.
