Senior Software Engineer, AI Inference

Deepgram

Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI models including speech-to-text, text-to-speech, and spoken language understanding with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

Opportunity

We are seeking a backend engineer focused on AI inference to join the team powering Deepgram’s core speech inference APIs. You’ll implement and optimize inference code, experiment with cutting-edge technologies, and develop, maintain, and deploy the stack of services behind our blazing-fast, massive-throughput inference system. This role blends work on backend services and systems with domain specialty in neural networks and GPU programming. Our team owns the applications that serve api.deepgram.com and empowers builders of innovative speech products by focusing on a world-class combination of reliability, efficiency, and latency.

What You’ll Do

  • Implement inference for novel model architectures developed by Deepgram’s trailblazing research team
  • Develop, test, and deploy application code for massive-scale production services
  • Debug complex system issues that include networking, scheduling, and high-performance computing interactions
  • Build tooling for internal analysis and benchmarking to identify opportunities for efficiency improvements
  • Experiment with optimization techniques for ML workloads on NVIDIA GPUs and ship the key wins to prod

You’ll Love This Role If You

  • Think of yourself as a generalist who also enjoys going deep in specific areas, moving from debugging a customer issue one day to designing an algorithm the next
  • Like sipping piña coladas and getting caught in the rain
  • Enjoy taking ownership of features from early collaborations with researchers through testing in production
  • Love getting nitty-gritty with profilers, hardware architectures, and inference algorithms
  • Want to work within the context of a humble, collaborative team that collectively owns mission-critical production services

It’s Important to Us That You Have

  • The ability to work collaboratively in a fast-paced environment and adapt to changing priorities
  • Proven industry experience building and shipping production services
  • Strong proficiency in a lower-level language like C, C++, or Rust
  • Experience slicing large projects or initiatives into smaller experiments or incremental improvements
  • Expertise in an ML framework like Torch or TensorFlow
  • Experience with GPU programming using tools like CUDA or libraries like cuDNN, cuBLAS, etc.

It Would Be Great If You Also Had

  • Extensive professional experience with Rust and C++
  • Experience optimizing ML workloads in production
  • Familiarity with GPU hardware architecture and its impact on inference pipelines

Benefits

  • Remote work flexibility
