Senior Lead Incident Manager – Site Reliability Engineer
4 days ago
The Azure Senior Incident Manager - Site Reliability Engineer is responsible for driving the resolution of complex, multi-service outages across Azure's global infrastructure in our Air Gap Clouds. This role provides operational leadership during high-severity incidents, ensuring timely mitigation, clear stakeholder communication, and adherence to compliance and privacy standards. The position requires technical breadth, demonstrated leadership under pressure, and the ability to coordinate across engineering, operations, and customer-facing teams.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities:
- Command & Control: Act as the primary incident commander for major Azure outages, ensuring forward progress and clarity throughout the incident lifecycle.
- Incident Leadership: Lead cross-functional teams (engineering, support, operations) to restore services quickly and minimize customer impact.
- Provide timely, accurate updates to executives, internal stakeholders, and customer-facing teams.
- Process Governance: Ensure adherence to incident management protocols, including legal, privacy, and compliance requirements.
- Continuous Improvement: Conduct Post-Incident Reviews (PIRs), identify systemic issues, and drive platform improvements.
- Tooling & Automation: Leverage and enhance incident management tools such as Outage Hub and IcM for real-time visibility and coordination.
- Mentorship: Guide and coach other incident managers and engineers on best practices for incident response.
- Rhythm of Business: Ensure our Executive Leaders receive regular updates, critical signals, and progress reports on cloud-wide initiatives.
- Embody our culture and values.
Required/Minimum Qualifications:
- Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
Other Requirements:
Security Clearance Requirements: Candidates must be able to meet Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:
- The successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph. Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination.
- Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports U.S. federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.
Preferred/Additional Qualifications:
- Leadership: Proven ability to lead global, distributed teams during high-pressure situations.
- Innovation: Track record of implementing automation and process improvements in incident management.
Site Reliability Engineering IC3 - The typical base pay range for this role across the U.S. is USD $100,600 - $199,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $131,400 - $215,400 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications for the role until October 27, 2025.
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
-
Senior Site Reliability Engineer
4 days ago
Redmond, Washington, United States Microsoft Full time $119,800 - $234,700
Overview: Do you want to be at the heart of cloud computing? The Compute team is at the core of Azure and is growing incredibly fast. We build and manage fault tolerant distributed systems on top of commodity datacenter hardware, to deliver an infrastructure for hosting customer applications. The platform is at the core of Azure that provides millions of...
-
Site Reliability Engineer
7 days ago
Redmond, Washington, United States Microsoft Full time
Overview: Join the Microsoft Specialized Cloud (MSC) team - the next generation of platform and experiences enabling Microsoft and Azure, the fastest-growing cloud platform in the world which makes billions of dollars in revenue. As a Site Reliability Engineer on the MSC team, you will work alongside multiple teams within Azure, Office, and throughout...
-
Site Reliability Engineer II
2 days ago
Redmond, Washington, United States Microsoft Full time $100,600 - $199,000
Overview: Microsoft is a company where passionate innovators come to collaborate, envision what can be and take their careers further. This is a world of more possibilities, more innovation, more openness, and the sky is the limit thinking in a cloud-enabled world. Microsoft's Azure Data engineering team is leading the transformation of analytics in the world...
-
Site Reliability Engineer II
7 days ago
Redmond, Washington, United States Microsoft Full time $100,600 - $199,000
Overview: The Cloud & AI organization accelerates Microsoft's mission and bold ambitions to ensure that our company and industry are securing digital technology platforms, devices, and clouds in our customers' heterogeneous environments, as well as ensuring the security of our own internal estate. Our culture is centered on embracing a growth mindset, a theme...
-
Sr. Site Reliability Engineer
2 weeks ago
Redmond, Washington, United States SpaceX Full time $160,000 - $220,000
SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.
SR. SITE RELIABILITY ENGINEER (STARSHIELD) - TOP SECRET CLEARANCE
Starshield...
-
Site Reliability Engineer II
4 days ago
Redmond, Washington, United States Microsoft Full time
Overview: Microsoft has an exciting opportunity for a Site Reliability Engineer II in the Cloud+AI Azure Data Team. Microsoft's Azure Data engineering team is leading the transformation of analytics in the world of data with products like databases, data integration, big data analytics, messaging & real-time analytics, and business intelligence. The products in...
-
Senior Facilities Manager, Reliability
2 weeks ago
Redmond, Washington, United States Evotec Full time
Job Title: Senior Facilities Manager, Reliability
Shifts: Monday to Friday, 8am till 5pm
About Us: this is who we are
At Just Evotec Biologics, we believe that curiosity is the spark that drives innovation and success. As a forward-thinking team, we thrive on challenging the status quo, learning from each other, and pushing the boundaries of what's possible....
-
Redmond, Washington, United States Providence Health & Services Full time
Description: Providence caregivers are not simply valued – they're invaluable. Join our team at Enterprise Information Services and thrive in our culture of patient-focused, whole-person care built on understanding, commitment, and mutual respect. Your voice matters here, because we know that to inspire and retain the best people, we must empower them. As a...
-
Redmond, Washington, United States Providence Full time
Description: Providence caregivers are not simply valued – they're invaluable. Join our team at Enterprise Information Services and thrive in our culture of patient-focused, whole-person care built on understanding, commitment, and mutual respect. Your voice matters here, because we know that to inspire and retain the best people, we must empower them. As a...
-
Senior Cloud Network Engineer
2 weeks ago
Redmond, Washington, United States Microsoft Full time $119,800 - $234,700 per year
Senior Cloud Network Engineer (Optical)
Redmond, Washington, United States
Date posted: Oct 16, 2025
Job number: 1897717
Work site: 3 days / week in-office
Travel: 0-25%
Role type: Individual Contributor
Profession: Software Engineering
Discipline: Cloud Network Engineering
Employment type: Full-Time
Overview: Microsoft Digital (MSD) builds and manages the critical products and services...