Join Our Team as an MLOps Engagement Engineer
Nebius AI is seeking an experienced MLOps Engagement Engineer to join our dynamic team. This role is pivotal in designing, implementing, and maintaining large-scale distributed machine learning (ML) training and inference workflows. As an MLOps Engagement Engineer, you will work closely with our Solutions Architect and support team, providing hands-on expertise to our largest customers and internal teams.
Key Responsibilities
- Design and Implement Distributed ML Workflows: Develop and maintain scalable, efficient, and reliable ML training pipelines on Kubernetes (K8s) and Slurm, leveraging containerization (e.g., Docker) and orchestration.
- Optimize ML Training Performance: Collaborate with data scientists and engineers to enhance ML model training and inference performance.
- Develop Solutions Library: Design, deploy, and manage K8s and Slurm clusters for large-scale ML training, utilizing our ready-to-deploy solutions.
- Integrate with ML Frameworks: Ensure seamless execution of distributed ML training workloads by integrating K8s and Slurm with popular ML frameworks like TensorFlow, PyTorch, or MXNet.
- Monitor and Troubleshoot: Develop monitoring and logging tools to track distributed training performance, identify bottlenecks, and troubleshoot issues.
- Develop Automation Tools: Create automation scripts and tools to streamline ML training workflows, leveraging technologies like Ansible, Terraform, or Python.
- Stay Updated with Industry Trends: Participate in industry conferences, meetups, and online forums to stay abreast of the latest developments in MLOps, K8s, Slurm, and ML.
Required Qualifications
- 3+ years of experience in MLOps, DevOps, or a related field.
- Strong experience with Kubernetes and containerization (e.g., Docker).
- Experience with Slurm or other workload managers for distributed computing.
- Proficiency in Python, with experience in ML frameworks like TensorFlow, PyTorch, or MXNet.
- Strong understanding of distributed computing concepts, including parallel processing and job scheduling.
- Experience with automation tools like Ansible or Terraform, or with automation scripting in Python.
- Excellent problem-solving skills with the ability to troubleshoot complex issues.
- Strong communication and collaboration skills, with experience working with cross-functional teams.
Preferred Qualifications
- Experience with cloud providers like AWS, GCP, or Azure.
- Knowledge of ML model serving and deployment.
- Familiarity with CI/CD pipelines and tools like Jenkins, GitLab CI/CD, or CircleCI.
- Experience with monitoring and logging tools like Prometheus, Grafana, or ELK Stack.
Why Join Us?
Nebius AI is a leading AI cloud platform with one of the largest GPU capacities in Europe. We offer a unique opportunity to work with cutting-edge technology and a team of highly skilled engineers. If you are passionate about AI and ML and eager to tackle new challenges, we invite you to join our team.
Work Environment
This is a hybrid position: you can work both on-site in our Amsterdam office and remotely. We are committed to flexibility that supports work-life balance and professional growth.
Apply today to become a part of our innovative team and contribute to the future of AI and ML at Nebius AI.
Benefits
- Flexible work environment
- Opportunities for professional development
- Access to cutting-edge technology