Cloud Monitoring SRE Manager

2 weeks ago


Seattle, Washington, United States Apple Full time

At Apple, we're looking for a passionate and dedicated Site Reliability Engineering Manager to lead a team focused on providing our customers with the highest quality Apple Services experience.

Our services have to scale globally, stay highly available, and "just work." If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you.

The Cloud Monitoring SRE organization is specifically tasked with enabling other teams to better understand their infrastructure and services, providing extraordinary observability capabilities.

As a Site Reliability Engineering manager for the Cloud Monitoring Team at Apple, you will build and mentor a team that improves the reliability and performance of the software systems providing access to the services and infrastructure that run Apple.

Our monitoring, alerting, and visualization platform analyzes billions of metrics per minute and forms the central nervous system of Apple's architecture.

Operating at our scale, across multiple geographically dispersed data centers and serving hundreds of millions of users, presents unique challenges.

As a Site Reliability Engineering manager, you will be leading a team responsible for providing the platform for mission-critical observability services to maintain constant uptime, scale seamlessly, and allow for new applications and services to thrive.

The successful candidate will be highly self-motivated with a passion for excellence, quality, and detail.

The SRE Manager will not only support operations but also collaborate with the developers and architects within the team on design and implementation work that improves stability, security, and scalability.

Responsibilities:

  • Lead SRE teams responsible for reliability and performance of cloud-based monitoring services
  • Lead and grow the engineers on your team
  • Lead staging and production environments with the goal of maximizing availability
  • Promote observability of systems for monitoring, alerting, and metrics reporting
  • Advocate best practices of reliability engineering

Requirements:

  • 5+ years of experience handling services in a large-scale environment
  • Desire to build, grow, and mentor a team to meet both their career goals and the organization's goals
  • Experience with hiring and leading engineers
  • Experience with Cloud Computing technologies (particularly Kubernetes)
  • Experience and confidence around incident response and incident management
  • Experience with the Prometheus ecosystem
  • Practical experience with Python and Bash scripting
  • Theoretical knowledge of Go, Java, and/or Scala
  • Strong drive to automate manual operations and to improve them through iteration
  • Strong sense of ownership and integrity demonstrated through clear communication and collaboration

  • Cloud Monitoring SRE

    2 weeks ago


    Seattle, Washington, United States Apple Full time

    Cloud Monitoring SRE. At Apple, we're looking for a skilled Cloud Monitoring SRE to join our team. As a Cloud Monitoring SRE, you will be responsible for designing and building the next generation of cloud and systems monitoring infrastructure, focusing on automation, availability, performance, and efficiency at scale. You will work closely with our engineering...


  • Seattle, Washington, United States Apple Full time

    Job Title: Cloud Monitoring SRE Manager. At Apple, we're looking for a highly skilled Cloud Monitoring SRE Manager to join our team. As a key member of our Cloud Monitoring organization, you will be responsible for leading a team of engineers to design, build, and operate our cloud-based monitoring services. About the Role: We're seeking a seasoned Site...


  • Seattle, Washington, United States Apple Full time

    Job Title: Cloud Monitoring SRE Manager. At Apple, we're looking for a highly skilled Cloud Monitoring SRE Manager to join our team. As a key member of our Cloud Monitoring organization, you will be responsible for leading a team of engineers to design, build, and operate our cloud-based monitoring services. Key Responsibilities: Lead a team of engineers to...


  • Seattle, Washington, United States Apple Full time

    About the Role: We are seeking a highly skilled and experienced Site Reliability Engineering Manager to lead our Cloud Monitoring team at Apple. As a key member of our Apple Services Engineering organization, you will be responsible for designing, building, and operating the monitoring and observability platform that enables our customers to have a seamless...

  • Cloud Monitoring SRE

    2 weeks ago


    Seattle, Washington, United States Apple Full time

    Job Description: At Apple, we're looking for a skilled Cloud Monitoring SRE to join our team. As a Cloud Monitoring SRE, you will be responsible for designing and building the next generation of cloud and systems monitoring infrastructure, focusing on automation, availability, performance, and efficiency at scale. Key Responsibilities: Design and build cloud...


  • Seattle, Washington, United States Apple Full time

    Role Summary: At Apple, we're committed to delivering exceptional services that revolutionize entire industries. As a Cloud Monitoring SRE Manager, you'll play a critical role in ensuring the reliability and performance of our cloud-based monitoring services. Key Responsibilities: Lead SRE teams responsible for the reliability and performance of cloud-based...


  • Seattle, Washington, United States Apple Full time

    About the Role: Apple is seeking a highly skilled Site Reliability Engineering Manager to lead our Cloud Monitoring team. As a key member of our Apple Services Engineering organization, you will be responsible for designing, building, and operating the monitoring and observability platform that fuels our services, including iCloud, iTunes, Siri, and Maps. Key...


  • Seattle, Washington, United States Apple Full time

    About the Role: We are seeking a highly skilled Cloud Monitoring SRE Manager to lead our team in providing exceptional observability capabilities for our customers. As a key member of our Cloud Services Engineering team, you will be responsible for designing, engineering, and running systems and infrastructure that will help millions of customers. Key...


  • Seattle, Washington, United States Apple Full time

    Role Overview: Apple is seeking a highly skilled Cloud Monitoring SRE Manager to lead a team responsible for ensuring the reliability and performance of our cloud-based monitoring services. As a key member of our Service Engineering team, you will be responsible for designing, implementing, and maintaining the systems and infrastructure that support our...

  • Cloud Monitoring SRE

    3 weeks ago


    Seattle, Washington, United States Apple Full time

    Cloud Monitoring SRE - Automation Expert. At Apple, we're looking for a talented Cloud Monitoring SRE - Automation Expert to join our team. As a key member of our Cloud Monitoring team, you'll be responsible for designing and building the next generation of cloud and systems monitoring infrastructure. You'll work closely with our engineering teams to automate...

  • Cloud Monitoring SRE

    3 weeks ago


    Seattle, Washington, United States Apple Full time

    About the Role: We are seeking a highly skilled Cloud Monitoring SRE to join our team at Apple. As a Cloud Monitoring SRE, you will be responsible for designing, building, and operating the monitoring infrastructure that provides visibility into the services and infrastructure that run Apple. Key Responsibilities: Design and build the next generation of cloud and...

  • Cloud Monitoring SRE

    3 weeks ago


    Seattle, Washington, United States Apple Full time

    Job Description: Apple Services Engineering infrastructure is BIG. Operating at our scale, across multiple geographically dispersed data centers and servicing hundreds of millions of users presents unique challenges. As a Site Reliability Engineer on the Cloud Monitoring Team at Apple, you will be working to improve the reliability and performance of the...


  • Seattle, Washington, United States Snapx Full time

    SRE / Sr. DevOps Engineer. Location: Seattle, WA. Duration: 12 months. Job Summary: We are seeking an experienced SRE / Sr. DevOps Engineer to join our team at Snapx. As a key member of our engineering team, you will be responsible for designing, implementing, and operating large-scale systems on Azure Cloud. Your expertise in cloud computing, DevOps, and SRE will...


  • Seattle, Washington, United States People Tech Group Inc Full time

    Job Title: SRE / Sr. DevOps Engineer. Location: Remote. Duration: 12+ months. Job Summary: We are seeking an experienced Senior Engineer to join our team as a SRE / Sr. DevOps Engineer. The ideal candidate will have a strong background in cloud infrastructure, Linux systems, and DevOps practices. Key Responsibilities: Design and implement scalable cloud...

  • SRE DevOps Engineer

    3 weeks ago


    Seattle, Washington, United States Adobe Full time

    About the Role: We are seeking an experienced SRE DevOps Engineer to join our Identity Resilience team at Adobe. As a key member of our team, you will be responsible for building and evolving the next generation of Identity Services for Adobe's cloud platform. Key Responsibilities: Design and implement performance and availability optimizations across all layers...

  • SRE DevOps Engineer

    3 weeks ago


    Seattle, Washington, United States Adobe Full time

    About Adobe: At Adobe, we're passionate about empowering people to create and deliver exceptional digital experiences. Our company is built on a foundation of innovation, creativity, and a commitment to making a positive impact on the world. We're looking for talented individuals to join our team and help us shape the future of digital experiences. The...


  • Seattle, Washington, United States CloudBC Labs Full time

    Job Title: Senior SRE - Data DevOps Specialist. CloudBC Labs is seeking a highly skilled Senior SRE - Data DevOps Specialist to join our team. As a key member of our Cloud Infrastructure team, you will be responsible for ensuring the health and reliability of our production systems. Key Responsibilities: Develop and maintain monitoring dashboards to ensure...

  • SRE/DevOps Engineer

    3 weeks ago


    Seattle, Washington, United States Capgemini Full time

    Job Title: SRE/DevOps Engineer. Capgemini is seeking an experienced SRE/DevOps Engineer to join our team. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining scalable and highly available systems in the cloud. Key Responsibilities: Design and implement scalable and highly available systems in the...

  • Cloud Advocate

    2 weeks ago


    Seattle, Washington, United States Datadog Full time

    About the Role: We are seeking a highly skilled Developer Advocate to join our team at Datadog. As a key member of our engineering team, you will play a critical role in shaping the future of cloud observability and monitoring. Key Responsibilities: Develop and deliver technical content, including blog posts, conference talks, and demos, to educate developers on...

  • Cloud Advocate

    3 weeks ago


    Seattle, Washington, United States Datadog Full time

    We're seeking a seasoned Cloud Advocate to join our team at Datadog. As a key member of our Cloud Alliances team, you'll play a crucial role in shaping the future of cloud observability and monitoring. Your primary responsibility will be to educate developers on the benefits of cloud computing, leveraging your expertise in GCP services to create compelling...