Software Engineer II, Data Engineering

GitHub

Job Overview

Join GitHub as a Software Engineer II on the Copilot Metrics team, where you'll be at the forefront of data engineering. This role is pivotal in designing, developing, and maintaining efficient and reliable data pipelines. You'll collaborate with stakeholders across the company to gather business requirements, build data models, and ensure data quality and accessibility. Your expertise in Python, SQL, Airflow, and Spark will be crucial in optimizing our data infrastructure and enabling data-driven decision-making.

Responsibilities

Data Pipeline Development

  • Design, build, and maintain scalable data pipelines using Python, SQL, Airflow, and Spark.

Business Requirements Gathering

  • Collaborate with stakeholders to understand and translate business requirements into technical specifications.

Data Modeling

  • Develop and implement data models that support analytics and reporting needs, ensuring alignment with business goals.

Data Quality and Governance

  • Ensure data accuracy, consistency, and reliability by implementing robust data validation and quality checks.

Stakeholder Collaboration

  • Work with cross-functional teams, including data analysts, data scientists, and business leaders, to deliver high-quality data solutions.

Performance Optimization

  • Continuously monitor and optimize data pipelines for performance, scalability, and cost-efficiency.

Monitoring and Observability

  • Build and implement monitoring and observability metrics to ensure data quality and detect anomalies in data pipelines.

Documentation and Communication

  • Maintain clear and comprehensive documentation of data processes and effectively communicate technical concepts to non-technical stakeholders.

Qualifications

Required

  • 2+ years of experience in Software Engineering, Computer Science, or a related technical discipline.
  • Proven experience writing and maintaining production software in languages such as C, C++, C#, Java, JavaScript, Go, Ruby, Rust, or Python.
  • 2+ years of experience in data engineering or analytics engineering roles.
  • Strong proficiency in Python, SQL, Airflow, and Spark.
  • Extensive expertise in building and maintaining robust data pipelines and ETL processes.

Preferred

  • Familiarity with Go and Ruby.
  • Experience with cloud platforms such as AWS, GCP, or Azure.
  • Familiarity with data warehousing solutions (e.g., Snowflake, Redshift, BigQuery).
  • Knowledge of data governance and data security best practices.
  • Excellent verbal and written communication skills.
  • Proven ability to work effectively in a collaborative, cross-functional environment.

Compensation

The base salary range for this job is USD $75,000.00 to USD $198,900.00 per year. Additional benefits include annual bonuses and stock options, with opportunities for sales incentives based on revenue or utilization.

About GitHub

GitHub is the world’s leading AI-powered developer platform with 100 million developers and counting. We’re also home to the biggest open-source community on earth. At GitHub, our goal is to create the space you need to do your best work. We’re remote-first and offer competitive pay, generous learning and growth opportunities, and excellent benefits to support you, wherever you are.

Join us, and let’s change the world, together.

Benefits

  • Remote work
  • Competitive pay
  • Learning and growth opportunities
  • Annual bonus
  • Stock options
  • Diverse and inclusive environment
