Senior Software Engineer - Distributed Systems and HPC

Nebius AI

About the Role

We are seeking a Senior Software Engineer to join the TractoAI team at Nebius. TractoAI is a platform built on the open-source YTsaurus technology, which has a decade-long track record within BigTech companies. The platform is designed to be highly scalable, resilient, and multitenant, capable of handling up to tens of thousands of servers, exabytes of data, millions of CPU cores, and tens of thousands of GPUs. TractoAI is a new experimental direction within Nebius AI that aims to provide an end-to-end platform for big data and AI workloads, including data preparation, distributed training, offline model inference, and more.

Responsibilities

  • Improve the YTsaurus core for the needs of our TractoAI platform.
  • Work on the performance and reliability of a large-scale distributed system.
  • Design new features and microservices for the system.
  • Investigate performance issues of user workloads.
  • Collaborate with our SRE team to ensure system stability.

Representative Projects

  • Support S3 on a par with physical disks as storage for TractoAI.
  • Implement a fair strategy for distributing disk throughput, so that IO resources are shared between concurrent computations of different users on the cluster.
  • Make modern hardware like high-end GPUs or InfiniBand connectors work inside user job VMs and collect statistics about their utilization.
  • Improve the TractoAI scheduler so that it uses metadata to split input data more uniformly across the jobs of a MapReduce operation.
  • Improve the TractoAI IO engines to squeeze the maximum out of HDDs and NVMe SSDs.

What We Offer

  • A dynamic and collaborative work environment.
  • Opportunity to work on cutting-edge technology.
  • Flexible working conditions.
  • Competitive salary and benefits.

Requirements

  • 5+ years of experience as a software engineer.
  • Experience with C++.
  • Experience with concurrency.
  • Results-oriented and ready to take a feature from idea to client adoption.
  • Ready to dig into unknown areas, including HPC or GPU computing.
  • Ready to occasionally write code in Go/Python.

Preferred Qualifications

  • Distributed systems design.
  • Understanding of how databases work.
  • OS internals knowledge.
  • Performance engineering experience.
  • Experience maintaining large-scale stateful systems.

Join us if you’re up for the challenge and as excited about AI and ML as we are!
