AI Inference Engineer

Perplexity

Job Overview

We are seeking an AI Inference Engineer to join our dynamic team at Perplexity in San Francisco. The role centers on deploying machine learning models at scale for real-time inference, serving both internal and external applications.

Responsibilities

  • Develop APIs for AI inference to be used by a diverse range of customers.
  • Benchmark and address bottlenecks in our inference stack.
  • Improve the reliability and observability of our systems and respond to system outages.
  • Explore novel research and implement LLM inference optimizations.

Qualifications

  • Experience with ML systems and deep learning frameworks and model formats such as PyTorch, TensorFlow, and ONNX.
  • Familiarity with common LLM architectures and inference optimization techniques like continuous batching and quantization.
  • Optional: Understanding of GPU architectures or experience with GPU kernel programming using CUDA.

Company Growth and Opportunities

Perplexity has experienced tremendous growth, amassing 10 million monthly active users and serving over 500 million queries globally. With significant funding and a valuation over $1 billion, we offer substantial opportunities for career advancement and impact.

Compensation and Benefits

  • Salary Range: $190,000 - $240,000 annually.
  • Equity: Included as part of the total compensation package.
  • Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.

Join us at Perplexity, where your work in AI inference will help drive the future of real-time, large-scale machine learning applications.
