About The Team
At OpenAI, we strongly believe in the importance of data and have seen repeatedly how large an impact a focus on data quality can have across all of our projects. The Pre-training Data Processing team brings this focus to the pre-training of our flagship GPT models, owning the pipelines that turn raw data into the high-quality, diverse, and multimodal datasets used to train our largest models. We work closely with teams focused on data acquisition, data quality, and multimodal data throughout Research. Most recently, in collaboration with these groups, we were responsible for building the dataset used to pre-train OpenAI’s newest multimodal model, GPT-4o.
In addition to building new pre-training datasets, we collaborate on data research and acquisition with teams in Pre-training and Multimodal to explore ways to get more out of data, including questions around efficiency, efficacy, and diversity. We also own and continuously improve the infrastructure used across several teams to prepare data for training models small and large.
About The Role
As a Research Engineer here, you will be responsible for building AI systems that can perform previously impossible tasks or achieve unprecedented levels of performance. We're looking for people with solid engineering skills who are comfortable working with large distributed systems and strive to write quality, well-tested code.
The most outstanding deep learning results are increasingly attained at massive scale, and these results require engineers who are comfortable working in large distributed systems. We expect engineering to play a key role in most major future advances in AI.
In This Role, You Will
- Build and own data pipelines operating on internet-scale data spanning the text, image, and audio modalities.
- Collaborate with many teams within Pre-training and across the company to incorporate our latest and greatest research into pre-training datasets.
- Research new methods for improving our datasets alongside researchers within Pre-training.
You Might Thrive In This Role If You
- Enjoy working at the cutting edge of large language model research.
- Have experience running complicated processing on very large datasets.
- Are comfortable working in a fast-paced, dynamic environment; research can evolve quite rapidly!
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.