Data Engineering Intern

Blockhouse

Job Description: Data Engineering Intern

Blockhouse is focused on real-time machine learning and data engineering, building scalable infrastructure for high-frequency ML models that redefine how organizations extract actionable insights from data. Our systems drive the future of real-time analytics, leveraging cutting-edge technology to deploy machine learning pipelines with sub-second response times. If you’re passionate about building the future of MLOps and want to work with a world-class team, this is your opportunity.

Role Description:

We are looking for an exceptional Data Engineering Intern to join our team and help architect the data systems of the future. In this role, you will build and scale real-time data pipelines and analytics infrastructure, powering high-frequency machine learning models. This is not a typical internship – you will be working on mission-critical projects that process millions of data points per second, collaborating closely with machine learning scientists and MLOps engineers.

Your work will directly influence the performance of trading models and real-time decision-making engines. You’ll work with cutting-edge technologies for event-driven streaming and OLAP analytics, delivering insights at scale and speed.

Key Responsibilities:

  • Real-Time Data Pipelines: Design, develop, and optimize real-time data pipelines that feed high-frequency machine learning models. Ensure seamless data ingestion, transformation, and storage for analytics and machine learning at scale.
  • Advanced Data Integration: Collaborate with MLOps engineers and machine learning teams to ensure real-time data flows between systems, enabling models to continuously learn from and react to new data streams.
  • Performance Optimization: Optimize the performance and reliability of data architectures using technologies like ClickHouse for high-throughput OLAP querying and Redpanda for low-latency event streaming.
  • Real-Time Monitoring & Diagnostics: Implement robust monitoring and diagnostic tools to track the health and performance of data pipelines, ensuring real-time models are supplied with accurate, up-to-date data.
  • Cloud Infrastructure: Build and manage scalable cloud infrastructure to support data pipelines in production, leveraging AWS or GCP services to ensure fault-tolerant, cost-efficient deployments.
  • Collaborate with Elite Teams: Engage with top-tier engineers, data scientists, and quantitative researchers to build scalable solutions that bridge the gap between data engineering and machine learning.

What You’ll Need:

  • 1+ Years of Data Engineering Experience: Hands-on experience building and scaling data pipelines, especially in high-throughput, low-latency environments.
  • Mastery of Real-Time Data Systems: Expertise in real-time data streaming and processing, with strong hands-on experience using technologies like Redpanda (or Kafka) and ClickHouse (or similar OLAP databases).
  • Proficiency in Data Engineering Tools: Strong command of Python, SQL, and other tools commonly used in data engineering. Experience with frameworks such as Apache Spark, Airflow, or similar is a plus.
  • Cloud Expertise: Proven experience with cloud platforms such as AWS or GCP, including services like S3, Lambda, EKS, or other tools for building scalable data infrastructure.
  • Data Architecture & Integration: Experience architecting systems that handle both streaming and batch processing, integrating real-time pipelines with machine learning workflows.
  • Monitoring at Scale: Familiarity with monitoring and alerting tools such as Prometheus, Grafana, or CloudWatch to ensure seamless operation of real-time data systems.

Ideal Candidate Profile:

  • Passion for Real-Time Systems: A deep interest in building data systems that operate in real time, optimizing for performance, latency, and throughput.
  • Experience with High-Frequency Systems: Familiarity with the challenges and complexities of handling large-scale, high-frequency data.
  • Self-Motivated & Results-Driven: You thrive in a fast-paced environment, are self-driven, and can work independently on complex tasks.
  • Collaborative Mindset: A team player with excellent communication skills, who can work effectively across teams to drive innovation and problem-solving.

Why You Should Join Us:

  • Innovative Environment: Be part of a team that is pushing the boundaries of real-time data engineering, solving complex challenges in financial technology and beyond.
  • Expert Team: Work alongside some of the brightest minds in data engineering, machine learning, and quantitative research.
  • Professional Growth: Blockhouse fosters a culture of continuous learning and development, ensuring you gain hands-on experience with cutting-edge technologies and best practices.
  • Cutting-Edge Projects: You’ll work on transformative projects that directly impact the future of trade execution, real-time analytics, and financial technology.
  • Compensation & Perks: Equity-only compensation. NYC-based employees enjoy daily free lunch and weekly company bonding events.

How to Apply:

If you are passionate about real-time data systems and eager to apply your skills to solve complex engineering challenges, join us at Blockhouse. Together, we will redefine the future of data engineering and real-time analytics.

