Senior Software Engineer, AI Inference
Deepgram
Company Overview
Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI models including speech-to-text, text-to-speech, and spoken language understanding with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.
Opportunity
We are seeking a backend engineer focused on AI inference to join the team powering Deepgram’s core speech inference APIs. You’ll implement and optimize inference code, experiment with cutting-edge technologies, and develop, maintain, and deploy the stack of services behind our blazing-fast, massive-throughput inference system. This role blends work on backend services and systems with domain specialty in neural networks and GPU programming. Our team owns the applications that serve api.deepgram.com and empowers builders of innovative speech products by focusing on a world-class combination of reliability, efficiency, and latency.
What You’ll Do
- Implement inference for novel model architectures developed by Deepgram’s trailblazing research team
- Develop, test, and deploy application code for massive-scale production services
- Debug complex system issues that include networking, scheduling, and high-performance computing interactions
- Build tooling for internal analysis and benchmarking to identify opportunities for efficiency improvements
- Experiment with optimization techniques for ML workloads on NVIDIA GPUs and ship the key wins to prod
You’ll Love This Role If You
- Think of yourself as a generalist who enjoys learning deeply in specific areas, going from debugging a customer issue one day to designing an algorithm the next
- Like sipping piña coladas and getting caught in the rain
- Enjoy taking ownership of features from early collaborations with researchers through testing in production
- Love getting nitty-gritty with profilers, hardware architectures, and inference algorithms
- Want to work within the context of a humble, collaborative team that collectively owns mission-critical production services
It’s Important to Us That You Have
- The ability to work collaboratively in a fast-paced environment and adapt to changing priorities
- Proven industry experience building and shipping production services
- Strong confidence in a lower-level language like C, C++, or Rust
- Experience slicing large projects or initiatives into smaller experiments or incremental improvements
- Expertise in an ML framework like Torch or TensorFlow
- Experience with GPU programming using tools like CUDA or libraries like cuDNN, cuBLAS, etc.
It Would Be Great If You Also Had
- Extensive professional experience with Rust and C++
- Experience optimizing ML workloads in production
- Familiarity with GPU hardware architecture and its impact on inference pipelines
Benefits
- Remote work flexibility