Site Reliability Engineer
1 hour ago
Overview
Join the Microsoft Specialized Cloud (MSC) team, the next generation of platform and experiences enabling Microsoft and Azure, the fastest-growing cloud platform in the world, which makes billions of dollars in revenue. As a Site Reliability Engineer on the MSC team, you will work alongside multiple teams within Azure, Office, and throughout Microsoft to manage and automate a large-scale Commerce platform and to provide world-class analytics to customers about their consumption and billing details. You will help set the stage for enterprises all around the world to interact with Azure and Office seamlessly and effectively, ensuring customers and partners can provision and transact from the Azure Marketplace. We are looking for engineers who bring fresh ideas from all backgrounds, leveraging invaluable experience and perspectives to do and achieve more.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
- Drive reliable, scalable, and high-performance solutions across the Commerce stack, ensuring robust service design.
- Collaborate with cross-functional teams to deliver customer-focused outcomes and broad technical impact.
- Support Commerce + Ecosystems in dedicated clouds by mitigating issues with sustainable, forward-thinking solutions.
- Leverage telemetry data to identify patterns and trends that enable continuous operational improvement through automation.
- Develop scripts and automation to resolve complex problems and prevent recurrence of service issues.
- Partner with internal customers and engineering teams to conduct root cause analyses, uphold security and compliance, and share end-to-end service expertise.
- Identify and implement automation opportunities to monitor and manage distributed systems effectively.
- Embody our culture and values.
Qualifications
Required Qualifications:
- Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
- OR equivalent experience.
Other Qualifications / Requirements:
Security Clearance Requirements: Candidates must be able to meet Microsoft, customer, and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:
- The successful candidate must have an ACTIVE U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph. The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination.
- Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
- Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports U.S. federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.
Preferred Qualifications:
Technical Expertise:
- Strong understanding of distributed systems, cloud architecture, and interactions between technology layers.
- Hands-on experience with Azure services, scripting languages (PowerShell, Python), and automation frameworks.
Operational Excellence:
- Ability to leverage telemetry and monitoring tools to identify patterns, trends, and drive proactive improvements.
- Skilled in incident management, root cause analysis, and implementing durable fixes.
Automation & Reliability:
- Demonstrated experience developing scripts and tools to automate repetitive tasks and improve service reliability.
- Familiarity with CI/CD pipelines and deployment automation.
Problem-Solving & Collaboration:
- Strong analytical skills to troubleshoot complex issues across large-scale systems.
- Proven ability to partner with engineering teams and internal customers to deliver sustainable solutions.
Security & Compliance:
- Knowledge of security best practices and compliance standards in cloud environments.
Growth Mindset:
- Passion for learning new technologies and driving continuous operational improvement.
Site Reliability Engineering IC3 - The typical base pay range for this role across the U.S. is USD $100,600 - $199,000 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $131,400 - $215,400 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process.