Site Reliability Engineer (SRE) - Hasura Cloud

Job Overview

Hasura is seeking a skilled Site Reliability Engineer (SRE) to join our team and keep Hasura Cloud systems running smoothly. This role is crucial for maintaining system reliability and shipping updates without downtime. You will work remotely from India on a schedule aligned with US hours, with the option to work from our Bangalore office if you prefer.

Key Responsibilities

Infrastructure Development: Build and maintain infrastructure using Terraform, Kubernetes, VMs, and bare metal instances.
System Design: Design core infrastructure components that allow Hasura Cloud to scale and handle thousands of concurrent requests.
Cloud Expansion: Expand Hasura Cloud's capabilities to support multiple cloud providers.
Deployment Process Improvement: Enhance deployment processes to ensure reliability and minimize disruptions.
Incident Response: Participate in a PagerDuty rotation to address availability incidents and support service engineers with customer issues.
Proactive Issue Resolution: Use development time to address systemic issues and prevent future incidents.
Monitoring and Alerts: Design intelligent monitoring systems that provide meaningful alerts based on symptoms rather than causes.
Documentation and Automation: Document actions to create repeatable processes and automate tasks to improve efficiency.
Production Debugging: Troubleshoot production issues across various services and stack levels.
Infrastructure Growth Planning: Strategize the growth of Hasura Cloud's infrastructure.

Requirements

Experience: 4+ years in a similar role, with a strong understanding of system behaviors, edge cases, and failure modes.
Technical Skills: Proficiency in Linux, Unix shell scripting, Terraform, and programming languages such as Go and Python.
Collaboration: Ability to work asynchronously with a globally distributed team and document processes thoroughly.
Automation: Passion for building automation and tooling to streamline repetitive tasks.
Cloud and Monitoring Tools: Experience with cloud providers (AWS, GCP, Azure) and monitoring tools (Honeycomb, Datadog, Prometheus, Grafana).

Nice to Have

Familiarity with Hasura and its GraphQL APIs.
Strong SQL skills, particularly with PostgreSQL.
Experience in database management and scaling.

Working at Hasura

At Hasura, we empower developers to build modern applications quickly. Our team is dedicated to enhancing the developer experience and making our tools as user-friendly as possible. We offer a flexible work environment, allowing for remote or in-person collaboration at our offices in San Francisco and Bangalore.

Perks

Remote & Hybrid Work Environment: Flexibility to work remotely or from our office spaces.
Self-care Fridays: The second Friday of every month is a day off for personal rejuvenation.
Equipment and Learning Allowance: Budgets for necessary tools and learning opportunities.
Donation Matching: Annual fund to match donations to global organizations.
Flexible Timings & PTO: Freedom to set work schedules and generous paid time off options.

Application Process

We encourage you to apply even if you don't meet all the requirements. We value diverse perspectives, and during the interview we are happy to discuss any questions you may have about our culture and work processes.

Join us at Hasura and contribute to building a robust developer ecosystem with cutting-edge technology.
