AI Inference Engineer

Perplexity

Job Overview

We are seeking an AI Inference Engineer to join our team at Perplexity in San Francisco. The role covers large-scale deployment of machine learning models for real-time inference, serving both internal and external applications.

Responsibilities

  • Develop APIs for AI inference to be used by a diverse range of customers.
  • Benchmark and address bottlenecks in our inference stack.
  • Improve the reliability and observability of our systems and respond to system outages.
  • Explore novel research and implement LLM inference optimizations.

Qualifications

  • Experience with ML systems and deep learning frameworks such as PyTorch, TensorFlow, and ONNX.
  • Familiarity with common LLM architectures and inference optimization techniques like continuous batching and quantization.
  • Optional: Understanding of GPU architectures or experience with GPU kernel programming using CUDA.

Company Growth and Opportunities

Perplexity has experienced tremendous growth, amassing 10 million monthly active users and serving over 500 million queries globally. With significant funding and a valuation over $1 billion, we offer substantial opportunities for career advancement and impact.

Compensation and Benefits

  • Salary Range: $190,000 - $240,000 annually.
  • Equity: Equity is part of the total compensation package.
  • Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.

Join us at Perplexity, where your work in AI inference will help drive the future of real-time, large-scale machine learning applications.

Similar jobs

Perplexity

AI Research Engineer

Join Perplexity as an AI Research Engineer to innovate AI-powered search solutions using LLMs in San Francisco.

Perplexity

AI Research Engineer - LLM Training

Join Perplexity as an AI Research Engineer to enhance LLMs using AI, ML, and NLP in San Francisco.

Perplexity

AI Software Engineer - Full Stack

Join Perplexity as a Full Stack AI Software Engineer to develop cutting-edge AI products using Python, TypeScript, and Kubernetes.

Perplexity

Senior Machine Learning Engineer

Join Perplexity as a Senior Machine Learning Engineer in New York, focusing on AI, ML, and backend development.

Perplexity

Senior Full Stack Software Engineer

Senior Full Stack Engineer in AI, using NextJS, TypeScript, Python, AWS. Develop and scale applications, debug, and improve systems.

Perplexity

Senior Backend Software Engineer

Join Perplexity as a Senior Backend Software Engineer in New York. Lead design, implementation, and scaling of systems for web and mobile products.

Perplexity

Senior Backend Software Engineer - API

Join Perplexity as a Senior Backend Software Engineer to design and scale API systems using Python, PostgreSQL, and Kubernetes.

FoodLabs

Senior C++ Computer Vision Engineer

Join a cutting-edge AI-DeepTech startup in Berlin as a Senior C++ Computer Vision Engineer. Work on world-class on-device AI technology.

Pulley

AI Engineer with Machine Learning and Deep Learning Expertise

Join Pulley as an AI Engineer to develop AI-driven solutions, enhance internal tools, and collaborate with cross-functional teams.

Perplexity

Senior Frontend Software Engineer (React, TypeScript)

Join Perplexity as a Senior Frontend Engineer to revolutionize web search with React and TypeScript.

DeepL

Senior Backend Engineer C++

Join DeepL as a Senior Backend Engineer C++ to design and maintain scalable backend services using C++ and AI technologies.

Poggio

Senior AI Engineer

Join Poggio as a Senior AI Engineer to innovate AI systems for enterprise sales, focusing on AI capabilities and system performance.

CentML

Senior Software Engineer - LLM Inference

Join CentML as a Senior Software Engineer focusing on LLM Inference, leveraging AI, ML, and GPU technologies.

Persona

LLM Backend Developer

Join Persona as an LLM Backend Developer, work remotely, and develop AI-driven backend systems for top startups.

Resolve AI

AI Engineer with LLM Expertise

Join Resolve AI as an AI Engineer in San Francisco to build AI-powered workflows with LLM expertise.

BCG X

AI Engineer

Join BCG X as an AI Engineer in Milan, Italy. Develop AI solutions, partner with clients, and drive innovation in a dynamic environment.

Mithrl

AI Engineer with Deep Learning and NLP Expertise

Join Mithrl as an AI Engineer to lead AI research in biology, focusing on deep learning and NLP.

yourfirm GmbH

Senior Fullstack Developer for AI-Driven Mission Technologies

Seeking a Senior Fullstack Developer for AI-driven mission technologies, focusing on Java, JavaScript, Python, and C++. Remote work available.

Cere Network

Principal AI Engineer

Join Cere Network as a Principal AI Engineer to drive AI innovation in Web3. Requires 10+ years in AI/ML, NLP, and software development.

Huawei Nederland

Information Retrieval Algorithm Engineer

Join Huawei as an Information Retrieval Algorithm Engineer to develop cutting-edge AI technologies in Amsterdam.

DwellFi

AI Solutions Software Engineer

Join DwellFi as an AI Solutions Software Engineer to develop innovative AI solutions using LangChain or Llama. Remote position in Palo Alto, CA.

Sentry

Machine Learning Engineer

Join Sentry as a Machine Learning Engineer to develop AI models and algorithms for smarter software solutions.

Applied Intuition

Software Engineer - Autonomous Driving

Join Applied Intuition as a Software Engineer in Munich to tackle autonomous driving challenges with top ADAS/AV programs.

NVIDIA

Machine Learning Engineer - LLM Fine-tuning and Performance

Join NVIDIA as a Machine Learning Engineer specializing in LLM fine-tuning and performance optimization. Work with cutting-edge ML technologies.