Job Overview
We are seeking an AI Inference Engineer to join our team at Perplexity in San Francisco. The role involves deploying machine learning models at large scale for real-time inference, serving both internal and external applications.
Responsibilities
- Develop APIs for AI inference to be used by a diverse range of customers.
- Benchmark and address bottlenecks in our inference stack.
- Improve the reliability and observability of our systems and respond to system outages.
- Explore novel research and implement LLM inference optimizations.
Qualifications
- Experience with ML systems and deep learning frameworks or model formats such as PyTorch, TensorFlow, or ONNX.
- Familiarity with common LLM architectures and inference optimization techniques like continuous batching and quantization.
- Optional: Understanding of GPU architectures or experience with GPU kernel programming using CUDA.
Company Growth and Opportunities
Perplexity has experienced tremendous growth, amassing 10 million monthly active users and serving over 500 million queries globally. With significant funding and a valuation over $1 billion, we offer substantial opportunities for career advancement and impact.
Compensation and Benefits
- Salary Range: $190,000 - $240,000 annually.
- Equity: Included in the total compensation package.
- Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.
Join us at Perplexity, where your work in AI inference will help drive the future of real-time, large-scale machine learning applications.