Site Reliability Engineer II

2 months ago


Redmond, Washington, United States | Microsoft Corporation | Full time
Job Description

Microsoft is seeking a highly skilled Site Reliability Engineer II to join our Silver Infrastructure and Sovereign Operations team. This critical role involves defining operations for new, existing, and emerging environments, ensuring the reliability and efficiency of our cloud infrastructure.

Key Responsibilities
  • Define and develop standardized, repeatable, and scalable solutions that ensure high-quality, efficient operations.
  • Drive the design, optimization, efficiency, and reliability of service management, collaborating with software engineers, project management, and operations teams.
  • Collect, classify, and analyze data on a range of metrics, and use that analysis to refine products and make informed engineering decisions.
  • Integrate instrumentation to gather telemetry on system behavior, and sustain feedback loops so that telemetry informs subsequent designs.
  • Use debugging tools, logs, telemetry, and other methods to verify assumptions, writing code proactively before issues occur and reactively as they arise.
  • Conduct retrospective debugging of solutions to identify root causes of problems, write and review postmortems, and share insights with the team.
  • Build, enhance, reuse, and contribute to software developer tools and processes, and identify new ones, that help other programs and applications create, debug, and maintain product code.
  • Act as a Designated Responsible Individual (DRI) and guide other engineers by developing and following the playbook, working on call to monitor systems, products, and services for degradation, downtime, or interruptions.
Requirements
  • 4+ years of technical experience in software engineering, network engineering, or systems administration.
  • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years of technical experience in software engineering, network engineering, or systems administration.
  • OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years of technical experience in software engineering, network engineering, or systems administration.
  • Ability to meet Microsoft, customer, and/or government security screening requirements, including an active U.S. Government Top Secret Security Clearance.
Preferred Qualifications
  • 3+ years of experience with PowerShell, C#, or C++.
  • Experience working on large-scale distributed services with on-call responsibilities.
  • Ability to build and influence broadly towards common goals and priorities.
  • Ownership for end-to-end project lifecycle with solid project management and communication skills.
  • Experience applying SRE principles in a large production environment.

Microsoft is an equal opportunity employer. We welcome applications from diverse candidates and are committed to creating an inclusive work environment. If you need assistance and/or a reasonable accommodation due to a disability during the application process, please contact us at [insert contact information].