Staff Site Reliability Engineer

2 weeks ago


Boston, United States Zscaler Full time

Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your vision and passion to our team of cloud architects, software engineers, security experts, and more who are enabling organizations worldwide to harness speed and agility with a cloud-first strategy.

NOTE: U.S. citizenship is required for this position due to the nature of the customers assigned to this role.

We're looking for an experienced Staff Site Reliability Engineer-Technical Duty Officer to join our Shared Platform Engineer team. Reporting to the Director Cloud Operations and Incident Management, you'll be responsible for:

  • Leading and advocating for the transformation to a world-leading SRE organization, promoting SRE principles within the Engineering Department.
  • Providing expert leadership during critical outages, coordinating multiple teams to ensure streamlined decision-making and quick resolution.
  • Promoting a customer-focused approach by addressing and mitigating global customer environment issues, and fostering a culture of continuous learning and technical excellence within the SRE team.
  • Developing and implementing scalable process frameworks and observability strategies to ensure rapid problem diagnosis, response, and service reliability.
  • Collaborating with product teams to thoroughly analyze failures and integrate insights to improve service reliability, scalability, and operational efficiency.

What We're Looking for (Minimum Qualifications)

  • 5+ years of experience as a Site Reliability Engineer, with relevant experience in an Operations or Engineering environment.
  • Hands-on experience troubleshooting Linux-based systems.
  • Networking knowledge and the ability to troubleshoot TCP/IP, SSL/TLS, DNSSEC, IPsec, and BGP issues.
  • Coding experience (preferably Python) building tools, scripting, or automation.
  • Bachelor's degree in Computer Science, a related technical field involving computer systems engineering, or equivalent practical experience.

What Will Make You Stand Out (Preferred Qualifications)

  • Experience supporting High/Moderate FedRAMP environments.
  • Understanding of observability practices and tools (Grafana, DataDog, Splunk, etc.).
  • Experience leading major incidents in large-scale, high-uptime environments.

This role offers remote work options, and the Eastern Time Zone is highly preferred.

At Zscaler, we believe that diversity drives innovation, productivity, and success. We are looking for individuals from all backgrounds and identities to join our team and contribute to our mission to make doing business seamless and secure. We are guided by these principles as we create a representative and impactful team, and a culture where everyone belongs. For more information on our commitments to DEIB, visit the Corporate Responsibility page of our website.

Our Benefits program is one of the most important ways we support our employees. Zscaler proudly offers comprehensive and inclusive benefits to meet the diverse needs of our employees and their families throughout their life stages, including:

  • Various health plans
  • Time off plans for vacation and sick time
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks, and more



  • Boston, United States Zscaler Full time

    Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your...


  • Boston, United States WEX Full time

    We are seeking a motivated and detail-oriented Entry-Level Site Reliability Engineer (SRE) to join our growing team. As an SRE, you will work closely with experienced engineers to ensure the reliability, performance, and scalability of our systems. This is an excellent opportunity for someone looking to build a career in site reliability engineering, with...


  • Boston, Massachusetts, United States Chewy Full time

    Our Opportunity: We are looking for a Director, Site Reliability Engineer at our facility in Boston, Massachusetts to establish and manage incident response protocols for SREs, including on-call schedules and post-incident reviews, to minimize downtime and improve system performance. What You'll Do: Develop and execute a comprehensive SRE strategy that aligns...


  • Boston, United States Chewy Full time

    Our Opportunity: We are looking for a Director, Site Reliability Engineer at our facility in Boston, Massachusetts to establish and manage incident response protocols for SREs, including on-call schedules and post-incident reviews, to minimize downtime and improve system performance. What You'll Do: Develop and execute a comprehensive SRE strategy that...


  • Boston, United States Chewy Full time

    Our Opportunity: We are looking for a Director, Site Reliability Engineer at our facility in Boston, Massachusetts to establish and manage incident response protocols for SREs, including on-call schedules and post-incident reviews, to minimize downtime and improve system performance. What You’ll Do: Develop and execute a comprehensive SRE strategy...


  • Boston, United States Fidelity Investments Full time

    Job Description: Position Description: ***Applicants are permitted to work remotely from an at-home work site anywhere in the United States.*** Deploys and supports highly-distributed, multi-tiered systems at scale within Cloud environments -- Amazon Web Services (AWS). Develops and improves distributed, highly-concurrent, and service-based software systems...

  • Reliability Engineer

    3 weeks ago


    Boston, United States DPS Group Global Full time

    DPS is looking for a Reliability Engineer to support a client in Boston, MA. Responsibilities: The GxP Reliability Engineer will provide reliability engineering support for all facilities, utilities systems and equipment including analytical instrumentation, R&D lab support equipment and GMP systems for the manufacture of small molecule, large molecule and...

  • Reliability Engineer

    2 weeks ago


    Boston, United States DPS Group Global Full time

    DPS is looking for a Reliability Engineer to support a client in Boston, MA. Responsibilities: The GxP Reliability Engineer will provide reliability engineering support for all facilities, utilities systems and equipment including analytical instrumentation, R&D lab support equipment and GMP systems for the manufacture of small molecule, large molecule and...


  • Boston, United States Klaviyo Full time

    At Klaviyo, we value the unique backgrounds, experiences and perspectives each Klaviyo (we call ourselves Klaviyos) brings to our workplace each and every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond the traditional job requirements. If you’re a close but not exact match with the...

  • Reliability Engineer

    2 months ago


    Boston, United States Cirkul Inc Full time

    What is this role? Cirkul Inc. is growing its Quality Engineering team and is looking for a Reliability Engineer in our Boston location. The candidate should have a good understanding of methods such as Reliability Modeling, Prediction, apportionment FME, critical analysis, maintainability analysis, and demonstration. What does an average day look like?...

  • Reliability Engineer

    3 months ago


    Boston, United States Cirkul Inc Full time

    What is Cirkul? Cirkul is a rapidly growing beverage technology company on a mission to make a healthier world by helping people enjoy drinking more water. The team at Cirkul developed an innovative beverage delivery system that makes drinking more water delicious, fun, and personalized. The technology reduces the shipping weight of bottled beverages by...


  • Boston, United States StartUs GmbH Full time

    WHAT YOU’LL DO: As a member of a small cross functional squad, you’ll own a particular infrastructure challenge at Spotify. Design and document systems, including writing and reviewing code, to automate away problems within your squad’s domain. Undertake measured, methodical troubleshooting of complicated systems under pressure. Partake in an on-call...

  • Reliability Engineer

    3 months ago


    Boston, United States Beacon Engineering Resources Full time

    The Reliability Engineer provides Design for Reliability, Maintainability and Supportability guidance to the new product introduction teams. Tasks include: developing Failure Modes Effects Analysis; calculating reliability predictions/modelling; developing probabilistic models for cost, reliability, logistics, etc.; creating lifecycle support strategies...


  • Boston, United States WHOOP Full time

    At WHOOP, we're on a mission to unlock human performance. WHOOP empowers users to perform at a higher level through a deeper understanding of their bodies and daily lives. As a Senior Reliability Quality Engineer, you will work directly with WHOOP Hardware Engineering, Product Manager and other teams and vendors to create product reliability requirements...


  • Boston, United States Whoop, Inc Full time

    At WHOOP, we're on a mission to unlock human performance. WHOOP empowers users to perform at a higher level through a deeper understanding of their bodies and daily lives. As a Senior Reliability Quality Engineer, you will work directly with WHOOP Hardware Engineering, Product Manager and other teams and vendors to create product reliability requirements...


  • Boston, United States Primary Talent Partners, Inc. Full time

    Primary Talent Partners has a 10 month contract opening for a Reliability Engineer II to join a multinational medical device company for an onsite position in Boston, MA. Assignment Type: W2 with Primary Talent Partners - we cannot provide sponsorship and/or accept C2C, H1B, STEM OPT, or 1099 candidates for this assignment. Responsibilities: Develops,...


  • Boston, United States voltalabs.io Full time

    ABOUT US: Volta Labs is building a suite of genomics applications for our first-of-its-kind digital fluidics sample prep automation platform. Our technology will remove the need for laborious sample setup, provide a technology-agnostic suite, and shorten sample processing time from hours to seconds. We believe our tech is the next step in unlocking the...