AI Inference Engineer

Perplexity

Job Overview

We are seeking an AI Inference Engineer to join our team at Perplexity in San Francisco. The role focuses on deploying machine learning models at large scale for real-time inference, serving both internal and external applications.

Responsibilities

  • Develop APIs for AI inference to be used by a diverse range of customers.
  • Benchmark and address bottlenecks in our inference stack.
  • Improve the reliability and observability of our systems and respond to system outages.
  • Explore novel research and implement LLM inference optimizations.

Qualifications

  • Experience with ML systems and with deep learning frameworks and model formats such as PyTorch, TensorFlow, and ONNX.
  • Familiarity with common LLM architectures and inference optimization techniques like continuous batching and quantization.
  • Optional: Understanding of GPU architectures or experience with GPU kernel programming using CUDA.

Company Growth and Opportunities

Perplexity has grown rapidly, reaching 10 million monthly active users and serving over 500 million queries globally. With significant funding and a valuation of over $1 billion, we offer substantial opportunities for career advancement and impact.

Compensation and Benefits

  • Salary Range: $190,000 - $240,000 annually.
  • Equity: Included in the total compensation package.
  • Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.

Join us at Perplexity, where your work in AI inference will help drive the future of real-time, large-scale machine learning applications.

Similar jobs

Perplexity

AI Research Engineer

Join Perplexity as an AI Research Engineer to innovate AI-powered search solutions using LLMs in San Francisco.

Perplexity

AI Research Engineer - LLM Training

Join Perplexity as an AI Research Engineer to enhance LLMs using AI, ML, and NLP in San Francisco.

Perplexity

AI Software Engineer - Full Stack

Join Perplexity as a Full Stack AI Software Engineer to develop cutting-edge AI products using Python, TypeScript, and Kubernetes.

Perplexity

Senior Machine Learning Engineer

Join Perplexity as a Senior Machine Learning Engineer in New York, focusing on AI, ML, and backend development.

Perplexity

Senior Full Stack Software Engineer

Join Perplexity as a Senior Full Stack Software Engineer to develop, scale, and debug AI applications using Next.js, TypeScript, Python, and AWS.

Perplexity

Senior Backend Software Engineer

Join Perplexity as a Senior Backend Software Engineer in New York. Lead design, implementation, and scaling of systems for web and mobile products.

Perplexity

Senior Backend Software Engineer - API

Join Perplexity as a Senior Backend Software Engineer to design and scale API systems using Python, PostgreSQL, and Kubernetes.

Perplexity

Senior Frontend Software Engineer (React, TypeScript)

Join Perplexity as a Senior Frontend Engineer to revolutionize web search with React and TypeScript.

Pulley

AI Engineer with Machine Learning and Deep Learning Expertise

Join Pulley as an AI Engineer to develop AI-driven solutions, enhance internal tools, and collaborate with cross-functional teams.

Mithrl

AI Engineer with Deep Learning and NLP Expertise

Join Mithrl as an AI Engineer to lead AI research in biology, focusing on deep learning and NLP.

CentML

Senior Software Engineer - LLM Inference

Join CentML as a Senior Software Engineer focusing on LLM Inference, leveraging AI, ML, and GPU technologies.

Resolve AI

AI Engineer with LLM Expertise

Join Resolve AI as an AI Engineer in San Francisco to build AI-powered workflows with LLM expertise.

Sentry

Machine Learning Engineer

Join Sentry as a Machine Learning Engineer to develop AI models and algorithms for smarter software solutions.

OpenAI

Research Engineer/Scientist, Perception

Join OpenAI as a Research Engineer/Scientist in Perception, enhancing AI capabilities in San Francisco. Hybrid work, relocation offered.

Poggio

Senior AI Engineer

Join Poggio as a Senior AI Engineer to innovate AI systems for enterprise sales, focusing on AI capabilities and system performance.

Deepgram

Senior Software Engineer, AI Inference

Join Deepgram as a Senior Software Engineer, AI Inference, specializing in backend development and optimization techniques for high-performance computing.

Zep AI (YC W24)

Senior AI Engineer

Join Zep AI as a Senior AI Engineer to lead LLM-based AI solutions development in a hybrid work environment.

Ema Unlimited

Machine Learning Engineer

Join Ema Unlimited as a Machine Learning Engineer in SF Bay Area, working on cutting-edge AI solutions with a focus on NLP and ML technologies.

Sourcegraph

Senior Machine Learning Engineer

Join Sourcegraph as a Senior ML Engineer to revolutionize code intelligence with AI and NLP.

LlamaIndex

Founding Applied AI Engineer

Join LlamaIndex as a Founding Applied AI Engineer to build and deploy LLM applications. Competitive salary and equity offered.

Tesla

AI Engineer Intern - Export & Inference

Join Tesla as an AI Engineer Intern focusing on Export & Inference. Work on cutting-edge AI projects in Palo Alto.

CHAI: AI Platform

Senior ML Infrastructure Engineer

Join CHAI: AI Platform as a Senior ML Infrastructure Engineer to build and scale ML systems in Palo Alto.

Ikigai

AI/ML Engineer

Join Ikigai Labs as an AI/ML Engineer in San Mateo, CA. Engage in ML optimization, tool development, and collaborative problem-solving.

Scale AI

Senior Software Engineer, Machine Learning Infrastructure

Join Scale AI as a Senior Software Engineer in Machine Learning Infrastructure, focusing on backend system design and ML Infrastructure.