Introduction
IBM Technology Zone (TechZone) is the one-stop shop for IBMers and business partners to build, show, and share solutions built on IBM technologies, helping to advance sales opportunities and customer adoption.
The Observability Lead is a new role in TechZone that leads the design, implementation, and management of the organization's overall observability framework. This includes setting up monitoring systems, logging tools, and tracing solutions that provide comprehensive visibility into system performance, health, and usage patterns. The Observability Lead also collaborates with other teams to integrate observability into their development workflows, so that issues can be detected and resolved quickly, and contributes to the continuous improvement of the observability practice by staying current with the latest trends, technologies, and best practices.
Your Role and Responsibilities
- Develop and implement comprehensive observability solutions, including monitoring, logging, tracing, and alerting systems.
- Work closely with developers, DevOps, and SRE teams to understand their requirements and ensure the observability tools meet their needs.
- Continuously improve the observability infrastructure to ensure system reliability, performance, and scalability.
- Collaborate in building and maintaining TechZone automation that deploys and provisions environments at scale.
- Participate in incident response efforts, providing valuable insights through observability data to diagnose and resolve issues quickly.
- Analyze observability data to identify trends, bottlenecks, and potential issues. Generate reports and dashboards for stakeholders.
- Evaluate and integrate new observability tools and technologies to enhance our monitoring capabilities.
- Develop and promote best practices for observability within the organization, including documentation and training for engineering teams.
Required Technical and Professional Expertise
- Experience: Proven experience in observability, monitoring, or related fields, with a strong understanding of modern observability practices.
- Technical Skills: Proficiency with observability tools such as Prometheus, Grafana, ELK stack, Jaeger, or similar. Strong scripting and automation skills (e.g., Python, Bash).
- System Knowledge: In-depth knowledge of distributed systems, microservices architecture, and cloud platforms (e.g., AWS, Azure, GCP).
- Problem-Solving: Excellent analytical and problem-solving skills, with the ability to diagnose complex issues using observability data.
- Communication: Strong communication skills, with the ability to collaborate effectively with cross-functional teams and present findings to stakeholders.
- Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Preferred Technical and Professional Expertise
- Preference will be given to candidates with a development background and demonstrated programming experience.
- Preference will be given to candidates with cloud architecture experience.
Benefits
- Pension plan