Introduction
IBM Technology Zone (TechZone) is the one-stop shop for IBMers and business partners to build, show, and share solutions built on IBM technologies, helping to advance sales opportunities and drive customer adoption.
The Observability Lead is a new role in TechZone that provides leadership in designing, implementing, and managing the organization's overall observability framework. This includes setting up monitoring systems, logging tools, and tracing solutions to provide comprehensive visibility into system performance, health, and usage patterns. The Observability Lead also collaborates with various teams to integrate observability into their development workflows, ensuring that issues can be detected and resolved quickly, and contributes to the continuous improvement of the observability practice by staying up to date on the latest trends, technologies, and best practices.
Your Role and Responsibilities
- Develop and implement comprehensive observability solutions, including monitoring, logging, tracing, and alerting systems.
- Work closely with developers, DevOps, and SRE teams to understand their requirements and ensure the observability tools meet their needs.
- Continuously improve the observability infrastructure to ensure system reliability, performance, and scalability.
- Collaborate in building and maintaining TechZone automation that deploys and provisions environments at scale.
- Participate in incident response efforts, providing valuable insights through observability data to diagnose and resolve issues quickly.
- Analyze observability data to identify trends, bottlenecks, and potential issues. Generate reports and dashboards for stakeholders.
- Evaluate and integrate new observability tools and technologies to enhance our monitoring capabilities.
- Develop and promote best practices for observability within the organization, including documentation and training for engineering teams.
Required Technical and Professional Expertise
- Experience: Proven experience in observability, monitoring, or related fields, with a strong understanding of modern observability practices.
- Technical Skills: Proficiency with observability tools such as Prometheus, Grafana, ELK stack, Jaeger, or similar. Strong scripting and automation skills (e.g., Python, Bash).
- System Knowledge: In-depth knowledge of distributed systems, microservices architecture, and cloud platforms (e.g., AWS, Azure, GCP).
- Problem-Solving: Excellent analytical and problem-solving skills, with the ability to diagnose complex issues using observability data.
- Communication: Strong communication skills, with the ability to collaborate effectively with cross-functional teams and present findings to stakeholders.
- Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Preferred Technical and Professional Expertise
- Preferred consideration will be given to candidates with a development background and demonstrated programming experience.
- Preference will be given to candidates with cloud architecture experience.
Benefits
- Pension plan