Join Onehouse as a Data Platform Engineer
Onehouse is a mission-driven company dedicated to freeing data from data platform lock-in. We deliver the industry’s most interoperable data lakehouse through a cloud-native managed service built on Apache Hudi. Onehouse enables organizations to ingest data at scale with minute-level freshness, centrally store it, and make it available to any downstream query engine and use case, from traditional analytics to real-time AI/ML.
The Community You Will Join
When you join Onehouse, you're joining a team of passionate professionals tackling the deeply technical challenges of building a two-sided engineering product. Our engineering team serves as the bridge between the worlds of open source and enterprise: contributing directly to and growing Apache Hudi while concurrently defining a new industry category, the transactional data lake. The Data Infrastructure team is the heartbeat of all of this. We live and breathe databases, building cornerstone infrastructure by working under Hudi's hood to solve incredibly complex optimization and systems problems.
A Typical Day
- Be the thought leader on all things data engineering within the company: schemas, frameworks, and data models.
- Implement new sources and connectors to seamlessly ingest data streams.
- Build scalable job management on Kubernetes to ingest, store, manage, and optimize petabytes of data on cloud storage.
- Optimize Spark or Flink applications to run flexibly in batch or streaming mode based on user needs, balancing latency against throughput.
- Tune clusters for resource efficiency and reliability to keep costs low while still meeting SLAs.
What You Bring to the Table
- 3+ years of experience in building and operating data pipelines in Apache Spark or Apache Flink.
- 2+ years of experience with workflow orchestration tools such as Apache Airflow or Dagster.
- Proficient in Java, with experience using Maven, Gradle, and other build and packaging tools.
- Adept at writing efficient SQL queries and troubleshooting query plans.
- Experience managing large-scale data on cloud storage.
- Strong problem-solving skills and an eye for detail; able to debug failed jobs and queries in minutes.
- Operational excellence in monitoring, deploying, and testing job workflows.
- Open-minded, collaborative self-starter who moves fast.
Nice to Haves
- Hands-on experience with Kubernetes and its related toolchain in cloud environments.
- Experience operating and optimizing terabyte-scale data pipelines.
- Deep understanding of Spark, Flink, Presto, Hive, and Parquet internals.
- Hands-on experience with open source projects such as Hadoop, Hive, Delta Lake, Hudi, NiFi, Drill, Pulsar, Druid, and Pinot.
- Operational experience with stream processing pipelines built on Apache Flink or Kafka Streams.
How We'll Take Care Of You
- Competitive Compensation: The estimated base salary range for this role is $150,000 - $220,000.
- Equity Compensation: Our success is your success, with eligibility to participate in our company equity plan.
- Health & Well-being: We'll invest in your physical and mental well-being with up to 90% health coverage (50% for spouses/dependents) including comprehensive medical, dental & vision benefits.
- Financial Future: We'll invest in your financial well-being by making this role eligible to contribute to our company 401(k) or Roth 401(k) retirement plan.
- Location: We are a remote-friendly company (internationally distributed across N. America + India), though some roles will be subject to in-person requirements in alignment with the needs of the business.
- Generous Time Off: Unlimited PTO (mandatory 1 week/year minimum), uncapped sick days, and 11 paid company holidays.
- Company Camaraderie: Annual company offsites and quarterly team onsites at our Sunnyvale HQ.
- Food & Meal Allowance: Weekly lunch stipend, in-office snacks/drinks.
- Equipment: We'll provide you with the equipment you need to be successful and a one-time $500 stipend for your initial desk setup.
- Child Bonding: 8 weeks off for parents (birthing, non-birthing, adoptive, foster, child placement, new guardianship) - fully paid so you can focus your energy on your newest addition.
House Values
- One Team: Optimize for the company, your team, and yourself, in that order. We may fight long and hard in the trenches, but we take care of our co-workers with empathy. We give more than we take to build the one house that everyone dreams of being part of.
- Tough & Persevering: We are building our company in a very large, fast-growing, but highly competitive space. Life will get tough sometimes. We take hardships in stride, stay positive, focus all our energy on the path forward, and develop a champion's mindset to overcome the odds. Always day one!
- Keep Making It Better Always: Rome was not built in a day. If we can get 1% better each day for one year, we'll end up thirty-seven times better (1.01^365 ≈ 37.8). This means being organized, communicating promptly, taking even small tasks seriously, tracking all small ideas, and paying it forward.
- Think Big, Act Fast: We have tremendous scope for innovation, but we will still be judged by impact over time. Big, bold ideas still need to be weighed against priorities, broken down, and set in rapid motion: measure, refine, repeat. Great execution is what separates promising companies from proven unicorns.
- Be Customer Obsessed: Everyone has the responsibility to drive towards the best experience for the customer, whether they are an OSS user or a paid customer. If something is broken, own it, say something, do something; never ignore it. Be the change that you want to see in the company.
Pay Range Transparency
Onehouse is committed to fair and equitable compensation practices. Our job titles may span more than one career level. The pay range for this role is listed above and represents the base salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages depend on several factors that are unique to each candidate, including but not limited to: job-related skills, depth of transferable experience, relevant certifications and training, business needs, market demands, and specific work location. Based on these factors, Onehouse utilizes the full width of the range; the base pay range is subject to change and may be modified in the future. The total compensation package for this position will also include eligibility for equity options and the benefits listed above.