Observability Lead

IBM

Introduction

IBM Technology Zone is the one-stop shop for IBMers and business partners to build, show, and share solutions built on IBM technologies to facilitate opportunity progression and customer adoption.

This is a new role in TechZone that provides leadership in designing, implementing, and managing the organization's overall observability framework. This includes setting up monitoring systems, logging tools, and tracing solutions to provide comprehensive visibility into system performance, health, and usage patterns. The Observability Lead also collaborates with various teams to integrate observability into their development workflows, ensuring that issues can be detected and resolved quickly, and contributes to the continuous improvement of the observability practice by staying current on the latest trends, technologies, and best practices.

Your Role and Responsibilities

  • Develop and implement comprehensive observability solutions, including monitoring, logging, tracing, and alerting systems.
  • Work closely with developers, DevOps, and SRE teams to understand their requirements and ensure the observability tools meet their needs.
  • Continuously improve the observability infrastructure to ensure system reliability, performance, and scalability.
  • Collaborate in building and maintaining TechZone automation that deploys and provisions environments at scale.
  • Participate in incident response efforts, providing valuable insights through observability data to diagnose and resolve issues quickly.
  • Analyze observability data to identify trends, bottlenecks, and potential issues. Generate reports and dashboards for stakeholders.
  • Evaluate and integrate new observability tools and technologies to enhance our monitoring capabilities.
  • Develop and promote best practices for observability within the organization, including documentation and training for engineering teams.

Required Technical and Professional Expertise

  • Experience: Proven experience in observability, monitoring, or related fields, with a strong understanding of modern observability practices.
  • Technical Skills: Proficiency with observability tools such as Prometheus, Grafana, the ELK stack, Jaeger, or similar. Strong scripting and automation skills (e.g., Python, Bash).
  • System Knowledge: In-depth knowledge of distributed systems, microservices architecture, and cloud platforms (e.g., AWS, Azure, GCP).
  • Problem-Solving: Excellent analytical and problem-solving skills, with the ability to diagnose complex issues using observability data.
  • Communication: Strong communication skills, with the ability to collaborate effectively with cross-functional teams and present findings to stakeholders.
  • Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.

Preferred Technical and Professional Expertise

  • Preferred consideration will be given to candidates with a development background and demonstrated programming experience.
  • Preference will also be given to candidates with cloud architecture experience.

Benefits

  • Pension plan
