About The Team
At OpenAI, we strongly believe in the importance of data and have repeatedly seen how large an impact a focus on data quality can have across all of our projects. The Pre-training Data Processing team brings this focus to the pre-training of our flagship GPT models, owning the pipelines that turn raw data into the high-quality, diverse, multimodal datasets used to train our largest models. We work closely with teams focused on data acquisition, data quality, and multimodal data throughout Research. Most recently, in collaboration with these groups, we were responsible for building the dataset used to pre-train OpenAI’s newest multimodal model, GPT-4o.
In addition to building new pre-training datasets, we collaborate on data research and acquisition with teams in Pre-training and Multimodal to explore ways to get more out of data, including questions around efficiency, efficacy, and diversity. We also own and continuously improve the infrastructure used across several teams to prepare data for training models small and large.
About The Role
As a Research Engineer here, you will be responsible for building AI systems that can perform previously impossible tasks or achieve unprecedented levels of performance. We're looking for people with solid engineering skills who are comfortable working with large distributed systems and strive to write quality, well-tested code.
The most outstanding deep learning results are increasingly attained at massive scale, and these results require engineers who are comfortable working in large distributed systems. We expect engineering to play a key role in most major future advances in AI.
In This Role, You Will
- Build and own data pipelines operating on internet-scale data spanning the text, image, and audio modalities.
- Collaborate with many teams within Pre-training and across the company to incorporate our latest and greatest research into pre-training datasets.
- Research new methods for improving our datasets alongside researchers within Pre-training.
You Might Thrive In This Role If You
- Enjoy working at the cutting edge of large language model research.
- Have experience running complicated processing on very large datasets.
- Are comfortable working in a fast-paced, dynamic environment; research can evolve quite rapidly!
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.