Senior Machine Learning Engineer - Scaling and Performance Optimization

InstaDeep

About the Team

Our team plays a pivotal role in enhancing the capabilities and efficiency of our advanced AI systems. We design solutions that enable our machine learning models to scale seamlessly and perform optimally in real-world applications and large-scale research. Collaborating across InstaDeep, we directly impact projects in diverse fields including Life Sciences, Logistics, Chip Design, and Quantum ML.

The Role

We seek a highly skilled Machine Learning Engineer with a passion for tackling the challenges of large-scale ML development. You'll play a vital role in making our ambitious AI solutions a practical reality. If you thrive on system-level analysis, find joy in squeezing every ounce of performance from hardware, and love diving deep into algorithm optimization, this is the position for you.

Responsibilities

  • Scaling Expertise: Design and implement strategies to efficiently scale machine learning models across diverse hardware platforms (GPU/TPU).
  • Performance Optimization: Analyze and profile ML systems under heavy load, pinpoint bottlenecks, and implement targeted optimizations.
  • Distributed Systems Architecture: Create robust distributed training and inference solutions for maximum computational efficiency.
  • Algorithmic Optimization: Research and understand the latest deep learning literature to implement and optimize state-of-the-art algorithms and architectures, ensuring compute efficiency and performance.
  • Low-Level Mastery: Write high-quality Python, C/C++, XLA, Pallas, Triton, and/or CUDA code to achieve performance breakthroughs.

Required Skills

  • Understanding of Linux systems, performance analysis tools, and hardware optimization techniques.
  • Experience with distributed training frameworks (Ray, Dask, PyTorch Lightning, etc.).
  • Expertise with Python and/or C++.
  • Development experience with machine learning frameworks (JAX, TensorFlow, PyTorch, etc.).
  • Passion for profiling, identifying bottlenecks, and delivering efficient solutions.

Highly Desirable

  • Track record of successfully scaling ML models.
  • Experience writing custom CUDA kernels or XLA operations.
  • Understanding of GPU/TPU architectures and their implications for efficient ML systems.
  • Solid grasp of the fundamentals of modern deep learning.
  • Active interest in ML trends and a desire to push boundaries.

Example Projects

  • Profile algorithm traces, identifying opportunities for custom XLA operations and CUDA kernel development.
  • Implement and apply SOTA architectures (Mamba, Griffin, Hyena) to research and applied projects.
  • Adapt algorithms for large-scale distributed architectures across HPC clusters.
  • Employ memory-efficient techniques within models for increased parameter counts and longer context lengths.

What We Offer

  • Real-World Impact: Directly contribute to the performance and reach of our AI solutions.
  • Cutting-Edge Challenges: Tackle complex problems at the forefront of machine learning and large-scale system design.
  • Growth-Oriented Environment: Expand your expertise in a team of talented engineers dedicated to advancing ML scalability.
