Staff Site Reliability Engineer

5 days ago


Boston, United States Zscaler Full time

Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your vision and passion to our team of cloud architects, software engineers, security experts, and more who are enabling organizations worldwide to harness speed and agility with a cloud-first strategy.

NOTE: U.S. citizenship is required for this position due to the nature of the customers assigned to this role.

We're looking for an experienced Staff Site Reliability Engineer - Incident Response to join our Shared Platform Engineer team. Reporting to the Director Cloud Operations and Incident Management, you will:

  • Lead and advocate for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
  • Provide expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
  • Promote a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
  • Develop and implement scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
  • Collaborate with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.
What We're Looking for (Minimum Qualifications)
  • 5+ years of experience as a Site Reliability Engineer, with relevant experience in an Operations or Engineering environment.
  • Hands-on experience troubleshooting Linux-based systems.
  • Networking knowledge, with the ability to troubleshoot TCP/IP, SSL/TLS, DNSSEC, IPsec, and BGP issues.
  • Coding experience (preferably Python) building tools, scripting, or automation.
  • Bachelor's degree in Computer Science, a related technical field involving computer systems engineering, or equivalent practical experience.
What Will Make You Stand Out (Preferred Qualifications)
  • Experience supporting High/Moderate FedRAMP environments.
  • Understanding of observability practices and tools such as Grafana, Datadog, and Splunk.
  • Experience leading major incidents in large-scale, high-uptime environments.

This role offers a remote work option.

  • Boston, Massachusetts, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that provides a secure platform for enterprises to connect users, devices, and applications. As a Staff Site Reliability Engineer - Federal, you will play a critical role in ensuring the security and reliability of our cloud infrastructure. Key Responsibilities: Oversee operational tasks for FedRAMP cloud...


  • Boston, United States Zscaler Full time

    We're looking for an experienced Staff Site Reliability Engineer (Federal) to join our ZPA team, reporting to the Senior Manager SRE. This role requires Secret Security Clearance that you must maintain throughout employment. An Information Assurance Technician Level 2 Certification is also required, but you can obtain that within your first few weeks of...


  • Boston, Massachusetts, United States WEX Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Platform Reliability organization. As a key member of our team, you will be responsible for ensuring the reliability and performance of our internal systems and services. As a Site Reliability Engineer, you will work closely with our development teams to design and implement...


  • Boston, Massachusetts, United States StartUs GmbH Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Spotify. As a Site Reliability Engineer, you will be responsible for designing and implementing scalable and reliable systems to support our production infrastructure. Key Responsibilities: Design and document systems, including writing and...


  • Boston, Massachusetts, United States AXON-Networks Full time

    AXON Networks is a leading provider of AI-driven, analytics-based orchestration platforms and next-gen high-speed routers that leverage the latest Wi-Fi technologies. Our innovative solutions empower ISPs to manage and troubleshoot their networks in real-time, delivering an exceptional customer experience. As a trusted strategic partner, AXON Networks helps...


  • Boston, Massachusetts, United States Zscaler Full time

    About Zscaler: Zscaler is a leading provider of cloud-based security solutions, serving thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, our mission is to make the cloud a safe and secure place for businesses to operate. As the operator of the world's largest security cloud, Zscaler accelerates digital...


  • Boston, Massachusetts, United States Klaviyo Full time

    About Klaviyo: Klaviyo is a leading provider of email marketing and customer data platforms. We empower creators to own their destiny by making first-party data accessible and actionable like never before. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring...


  • Boston, Massachusetts, United States FareHarbor Full time

    About FareHarbor: FareHarbor is a leading provider of innovative solutions for the experiences industry. Our mission is to empower our clients to deliver exceptional experiences to their customers. The Role: We are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, building,...


  • Boston, Massachusetts, United States Klaviyo Full time

    About the Role: We're seeking an experienced Site Reliability Engineering Manager to join our team at Klaviyo. As a key member of our engineering organization, you will be responsible for leading a team of Site Reliability Engineers and driving the development of secure, scalable, and reliable systems. Key Responsibilities: Manage a team of 4-6 Site Reliability...


  • Boston, Massachusetts, United States Klaviyo Full time

    About the Role: We're seeking a seasoned Site Reliability Engineering Manager to lead our team in Boston and remotely. As a key member of our engineering organization, you'll be responsible for managing a team of 4-6 Site Reliability Engineers and driving the development of secure software architecture and development. Key Responsibilities: Manage a team of Site...


  • Boston, Massachusetts, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that serves thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, Zscaler's mission is to make the cloud a safe place to do business and provide a seamless experience for enterprise users. As the operator of the world's largest security cloud, Zscaler...


  • Boston, United States Space Executive Full time

    My client, a series C Hyper-growth Cyber Security scale-up collaborating with leading brands across the Automotive industry is seeking a Senior Site Reliability Engineer to join their team. This is a permanent role and will be remote. In terms of salary, this will be market leading and additional benefits will include equity and a bonus scheme. This is a...


  • Boston, Massachusetts, United States Global InfoTek Full time

    Job Title: Principal Site Reliability Engineer. We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Global InfoTek, Inc. The ideal candidate will have a strong background in cloud infrastructure, DevOps, and reliability engineering. Key Responsibilities: Design and implement scalable cloud infrastructure solutions. Develop and...


  • Boston, Massachusetts, United States Klaviyo Full time

    Klaviyo is committed to empowering creators to own their destiny by making first-party data accessible and actionable like never before. To achieve this goal, we need a talented Site Reliability Engineering Manager to join our team. The Site Reliability Engineering Manager will be responsible for leading a team of Site Reliability Engineers in Klaviyo's...


  • Boston, Massachusetts, United States Klaviyo Full time

    About the Role: We're seeking a skilled Site Reliability Engineer to join our team at Klaviyo. As a Site Reliability Engineer, you will be responsible for ensuring the availability and scalability of our systems, as well as collaborating with product teams to deliver high-quality software. Key Responsibilities: Design and develop systems and processes to enable...


  • Boston, Massachusetts, United States Klaviyo Full time

    At Klaviyo, we value the unique backgrounds, experiences, and perspectives each team member brings to our workplace every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond traditional job requirements. Want to learn more about life at Klaviyo? Visit our website to see how we empower creators to...


  • Boston, Massachusetts, United States Klaviyo Full time

    Unlock Your Potential as a Senior Site Reliability Engineer at Klaviyo. We're on a mission to empower creators to own their destiny, and we need talented individuals like you to help us achieve it. As a Senior Site Reliability Engineer at Klaviyo, you'll play a critical role in ensuring the reliability, scalability, and security of our platform. Key...