Introduction
IBM Technology Zone (TechZone) is the one-stop shop where IBMers and business partners build, show, and share solutions built on IBM technologies to facilitate opportunity progression and customer adoption.
The Observability Lead is a new role in TechZone that provides leadership in designing, implementing, and managing the organization's overall observability framework. This includes setting up monitoring systems, logging tools, and tracing solutions that provide comprehensive visibility into system performance, health, and usage patterns. The Observability Lead also collaborates with teams across the organization to integrate observability into their development workflows, so that issues can be detected and resolved quickly, and contributes to the continuous improvement of the observability practice by staying current on the latest trends, technologies, and best practices.
Your Role and Responsibilities
- Develop and implement comprehensive observability solutions, including monitoring, logging, tracing, and alerting systems.
- Work closely with developers, DevOps, and SRE teams to understand their requirements and ensure the observability tools meet their needs.
- Continuously improve the observability infrastructure to ensure system reliability, performance, and scalability.
- Collaborate in building and maintaining TechZone automation that deploys and provisions environments at scale.
- Participate in incident response efforts, providing valuable insights through observability data to diagnose and resolve issues quickly.
- Analyze observability data to identify trends, bottlenecks, and potential issues. Generate reports and dashboards for stakeholders.
- Evaluate and integrate new observability tools and technologies to enhance our monitoring capabilities.
- Develop and promote best practices for observability within the organization, including documentation and training for engineering teams.
Required Technical and Professional Expertise
- Experience: Proven experience in observability, monitoring, or related fields, with a strong understanding of modern observability practices.
- Technical Skills: Proficiency with observability tools such as Prometheus, Grafana, ELK stack, Jaeger, or similar. Strong scripting and automation skills (e.g., Python, Bash).
- System Knowledge: In-depth knowledge of distributed systems, microservices architecture, and cloud platforms (e.g., AWS, Azure, GCP).
- Problem-Solving: Excellent analytical and problem-solving skills, with the ability to diagnose complex issues using observability data.
- Communication: Strong communication skills, with the ability to collaborate effectively with cross-functional teams and present findings to stakeholders.
- Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Preferred Technical and Professional Expertise
- Preference will be given to candidates with a development background and demonstrated programming experience.
- Preference will be given to candidates with cloud architecture experience.
Benefits
- Pension plan