Site Reliability Engineer, FlashArray

Position Overview

In today's cloud-centric world, the reliability of cloud platforms underpins everything, and keeping these systems healthy is crucial. At the frontier of cloud technology, Site Reliability Engineering (SRE) works to bolster the availability of our cloud infrastructure and services. As we pivot to a cloud-first strategy, Pure Storage seeks Site Reliability Engineers who can play a leading role in our cloud-focused transformation across the broader engineering organization. This team will be passionate about ensuring impeccable uptime, seamless scalability, observability, and unmatched availability.

This position is based at our office in Prague, Czech Republic. It is a software development position in a team distributed across the US (California) and Europe (Prague).

Responsibilities

Become part of our nascent SRE team across the US and Europe
Take responsibility for uptime and reliability of our core services and infrastructure, including proactive monitoring and incident response/resolution
Maintain a 24x7 production environment with a high level of service availability
Manage operational issues, drive root cause analysis and resolution of production issues
Explore and implement new cloud and high availability (HA) technologies and tools
Partner with development teams in defining and implementing improvements in services architecture
Implement automation and orchestration of manual processes required to operate and deploy cloud services
Set up and improve service health monitoring, observability, metrics collection and reporting, and alerting
Interface with engineering to establish a support structure, with runbooks to ensure uptime and customer success

Qualifications

8+ years of experience as a Software Engineer, SRE or DevOps engineer supporting globally distributed SaaS services
Experience with one or more of the following: Java, Python, Go, Perl and/or Ruby
Proven ability to design, develop and operate commercially successful cloud services with high availability and well-defined SLAs
Experience with IaC, automation & configuration management using tools such as Terraform, Ansible, Puppet, Chef, CloudFormation or ARM templates
Experience with virtualization, containers and management systems such as Kubernetes
Experience setting up monitoring of production services using ELK or something similar
Practical experience setting up support processes using tools such as PagerDuty
Deep understanding of the software delivery process and what it takes to “go live”
In-depth knowledge of a public cloud platform such as AWS, Azure or GCP is a must
Experience with Unix/Linux operating system internals and administration, or networking
BS or higher in Computer Science, Computer Engineering or related field and equivalent practical experience

Benefits

Flexible time off
Wellness resources
Company-sponsored events
