Senior Site Reliability Engineering Manager

2 weeks ago


Redmond, Washington, United States Microsoft Corporation Full time
Overview

Are you passionate about harnessing the power of cloud infrastructure to drive innovation and efficiency? Do you thrive in complex problem-solving environments where no two days are ever the same? Microsoft's Azure Storage team is seeking a seasoned Site Reliability Engineering Manager to lead our efforts in optimizing fleet availability and health at scale.

Responsibilities
  • Develop, test, and implement changes to optimize code and improve scalability, leveraging end-to-end technical expertise and telemetry analysis to identify patterns and opportunities for improvement.
  • Investigate hardware and system issues impacting available capacity and customer experience, working closely with cross-functional teams to resolve complex problems.
  • Drive sprint planning, Scrum stand-ups, code/design reviews, and regular cross-team meetings to ensure seamless collaboration and knowledge sharing.
  • Respond to incidents during regular on-call rotations, sharing details through post-mortem reports and regular review meetings to drive continuous improvement.
About the Role

This is an exceptional opportunity to join a high-performing team at the forefront of cloud infrastructure innovation. As a Senior Site Reliability Engineering Manager, you will lead a team of engineers in designing, developing, and improving automation to increase uptime. You will work closely with cross-functional teams to drive business outcomes and make a significant impact on reducing costs and improving customer satisfaction.

What We Offer

At Microsoft, we empower every person and organization on the planet to achieve more. As a member of our team, you will be part of a culture that values respect, integrity, and accountability. You will have the opportunity to work on massive distributed systems, deepen your knowledge and experience, and make a meaningful impact on the business.



  • Redmond, Washington, United States Microsoft Corporation Full time

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineering Manager to lead our team in delivering high-quality, scalable, and reliable cloud services. As a key member of our Office 365 Enterprise Cloud team, you will be responsible for building and developing a team of Software Engineers focused on Site Reliability, providing deep...


  • Redmond, Washington, United States Microsoft Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineering Manager to lead our team of engineers in delivering high-quality, scalable, and reliable cloud services. As a key member of our Cloud Infrastructure team, you will be responsible for designing, implementing, and operating our cloud infrastructure to meet the needs of our...


  • Redmond, Washington, United States Microsoft Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineering Manager to lead our team in delivering high-quality, scalable, and reliable cloud services. As a key member of our engineering organization, you will be responsible for building and managing a team of software engineers focused on site reliability, ensuring the availability,...


  • Redmond, Washington, United States Microsoft Full time

    Job Title: Senior Site Reliability Engineer. Microsoft is seeking a highly skilled Senior Site Reliability Engineer to join our Cloud+Artificial Intelligence (C+AI) Silver SQL Team. As a key member of this team, you will play a critical role in deploying and operating the Azure SQL family of services within Azure Government clouds. Key Responsibilities: Design...


  • Redmond, Washington, United States Microsoft Full time

    About the Role: We are seeking a highly skilled and experienced Senior Site Reliability Engineering Manager to join our team at Microsoft. As a key member of our engineering organization, you will be responsible for providing technical leadership to a team of highly passionate and skilled engineers. Key Responsibilities: Recruit, onboard, and grow a team of...


  • Redmond, Washington, United States Microsoft Corporation Full time

    Job Title: Senior Site Reliability Engineer. Microsoft Corporation is seeking a highly skilled Senior Site Reliability Engineer to join our Core Silver Team. As a key member of our team, you will be responsible for deploying and operating a Secure Work Area, including the infrastructure for collaboration within an airgapped environment. About the Role: In this...


  • Redmond, Washington, United States Microsoft Corporation Full time

    Job Title: Senior Site Reliability Engineer. Microsoft is seeking a highly skilled Senior Site Reliability Engineer to join our Cloud+Artificial Intelligence (C+AI) Silver SQL Team. As a key member of this team, you will be responsible for deploying and operating the Azure SQL family of services within Azure Government clouds. Responsibilities: Design and...


  • Redmond, Washington, United States Microsoft Corporation Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Core Silver Team at Microsoft Corporation. As a key member of our team, you will be responsible for deploying and operating a Secure Work Area, including the infrastructure for collaboration within an airgapped environment. Responsibilities: Design and implement...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer. SpaceX is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Responsibilities: Develop automation to deploy and manage compute resources both on-premises and in the...


  • Redmond, Washington, United States Microsoft Corporation Full time

    Transforming the Future of Cloud Services: At Microsoft Corporation, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help shape the future of our cloud services. As a Site Reliability Engineer, you'll play a critical role in designing and implementing scenarios for our customers, ensuring the reliability,...


  • Redmond, Washington, United States Microsoft Corporation Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Microsoft Corporation. As a Site Reliability Engineer, you will play a critical role in designing and implementing scenarios for our customers, ensuring the reliability and scalability of our cloud services. Responsibilities: Collaborate with cross-functional teams to...


  • Redmond, Washington, United States Microsoft Full time

    Job Title: Senior Active Directory Site Reliability Engineer. At Microsoft, we're looking for a highly skilled Senior Active Directory Site Reliability Engineer to join our team. As a key member of our Identity team, you will play a critical role in ensuring the availability, latency, performance, and security of our Identity systems. Responsibilities: Design...


  • Redmond, Washington, United States Microsoft Corporation Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Microsoft Corporation. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and performance of our cloud services. Key Responsibilities: Design, develop, and deliver software engineering solutions to serve and protect O365...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer (Starshield). Join SpaceX, a pioneering company in space exploration, as a Site Reliability Engineer (Starshield) in Redmond, WA. This role involves working on top-secret clearance projects, leveraging Starlink technology and launch capability to support national security efforts. About the Role: Develop automation to deploy...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer. SpaceX is a pioneering company that aims to make humanity a multi-planetary species. We are seeking a highly skilled Site Reliability Engineer to join our Starshield team, which leverages our Starlink technology and launch capability to support national security efforts. About the Role: We are looking for a talented engineer...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer. SpaceX is a pioneering company that aims to make humanity a multi-planetary species. We are seeking a highly skilled Site Reliability Engineer to join our Starshield team, which leverages our Starlink technology and launch capability to support national security efforts. Job Summary: We are looking for a talented engineer who...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer. Join SpaceX, a pioneering company in space exploration and development, as a Site Reliability Engineer for our Starshield program. As a key member of our team, you will play a crucial role in ensuring the reliability and efficiency of our satellite systems. Responsibilities: Develop automation to deploy and manage compute...


  • Redmond, Washington, United States Microsoft Corporation Full time

    Job Description: Microsoft is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability, scalability, and performance of our cloud services. Key Responsibilities: Design, develop, and deliver software engineering solutions to serve and protect O365...


  • Redmond, Washington, United States SpaceX Full time

    Job Title: Site Reliability Engineer. SpaceX is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our systems and infrastructure. Responsibilities: Develop automation to deploy and manage compute resources both on-premises and in the...