Senior Software Engineer - Distributed Systems and HPC

Nebius AI

About the Role

We are seeking a Senior Software Engineer to join the TractoAI team at Nebius. TractoAI is a platform built on the robust, open-source YTsaurus technology, which has a proven decade-long track record within Big Tech companies. The platform is designed to be highly scalable, resilient, and multitenant, and can handle tens of thousands of servers, exabytes of data, millions of CPU cores, and tens of thousands of GPUs. TractoAI is a new experimental direction within Nebius AI that aims to provide an end-to-end platform for big data and AI workloads, including data preparation, distributed training, and offline model inference.

Responsibilities

  • Improve the YTsaurus core for the needs of our TractoAI platform.
  • Work on performance and reliability of a large-scale distributed system.
  • Design new features and microservices for the system.
  • Investigate performance issues of user workloads.
  • Collaborate with our SRE team to ensure system stability.

Representative Projects

  • Support S3 on par with physical disks as a storage backend for TractoAI.
  • Implement a fair disk-throughput distribution strategy that shares I/O resources among the concurrent computations of different users on the cluster.
  • Make modern hardware such as high-end GPUs or InfiniBand connectors work inside user job VMs, and collect statistics about their utilization.
  • Improve the TractoAI scheduler so that input data is split more uniformly across the jobs of a MapReduce operation, using various metadata.
  • Improve the TractoAI I/O engines to get the maximum out of HDDs and NVMe SSDs.

What We Offer

  • A dynamic and collaborative work environment.
  • Opportunity to work on cutting-edge technology.
  • Flexible working conditions.
  • Competitive salary and benefits.

Requirements

  • 5+ years of experience as a software engineer.
  • Experience with C++.
  • Experience with concurrency.
  • Results-oriented and ready to take a feature from idea to client adoption.
  • Ready to dig into unknown areas, including HPC or GPU computing.
  • Ready to occasionally write code in Go or Python.

Preferred Qualifications

  • Distributed systems design.
  • Understanding of how databases work.
  • OS internals knowledge.
  • Performance engineering experience.
  • Experience maintaining large-scale stateful systems.

Join us if you’re up for the challenge and as excited about AI and ML as we are!
