Senior Site Reliability Engineer 4 Atlanta

2 weeks ago


San Francisco, United States Pager Full time

PagerDuty empowers teams of all kinds to do the critical work that moves business forward through the PagerDuty Operations Cloud.

PagerDuty is seeking a Senior Site Reliability Engineer to join our SRE-Platform team. In this role you will be a key contributor to building, maintaining and scaling the Kubernetes platform that powers PagerDuty. We build solutions that accelerate developer productivity, improve reliability and help PagerDuty scale for today and tomorrow. If you’re passionate about platform engineering, developer experience and all things Kubernetes, we’d love to hear from you.

Key Responsibilities

  • You help maintain the overall health of the platform including triaging and troubleshooting production issues, monitoring system capacity, and working with other technical teams to ensure adherence to compliance and security best practices.
  • You partner with Engineering stakeholders to design and deliver a reliable, scalable, secure, and performant platform.
  • You continuously strive to improve the developer experience: Full lifecycle support (creation, development, deployment, retirement), observability, flexible connectivity, and monitoring.
  • You share your expertise with the entire Engineering organization.
  • You participate in a 24/7 on-call rotation. And yes, we use PagerDuty to manage our on-call schedules.

Basic Qualifications

  • 5+ years of experience in Platform Engineering, Site Reliability Engineering or DevOps roles.
  • Experience managing multiple Kubernetes clusters in a production environment.
  • Experience working on cloud-native infrastructure (e.g. AWS, GCP, Azure).
  • Experience deploying web applications on Kubernetes (Helm, ArgoCD).
  • Experience with infrastructure as code (e.g. Terraform or CloudFormation).
  • Knowledge of a dynamic language (e.g. Ruby or Python).

Preferred Qualifications

  • Experience with monitoring, observability and logging platforms (e.g. DataDog, New Relic, SumoLogic, Splunk).
  • Knowledge of configuration management systems (e.g. Ansible, Chef, Puppet).
  • Experience in automating releases, continuous integration/delivery systems and relevant tools (e.g. Jenkins, CircleCI, Travis CI, Buildkite).

PagerDuty is committed to creating a diverse environment and is an equal opportunity employer. PagerDuty does not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, parental status, veteran status, or disability status.


  • San Francisco, United States Autodesk Full time

    Senior Site Reliability Engineer Apply Location: San Francisco, CA, USA Time Type: Full time Posted On: Posted 3 Days Ago Job Requisition ID: 24WD81384 Position Overview At Autodesk, we're not just a world leader in 3D design, engineering, and entertainment software; we're a hub of innovation committed to solving complex design and real-world problems. Our...


  • San Francisco, California, United States RevenueCat Full time

    About RevenueCat: We are a leading provider of mobile subscription infrastructure, handling over $3 billion in in-app purchases annually across thousands of apps. Our mission is to build a standard for mobile subscription infrastructure, and we're looking for a Senior Site Reliability Engineer to help us achieve this goal. About the Role: We're seeking a highly...


  • San Francisco, United States Fieldguide.ai Full time

    [Full Time] Senior Site Reliability Engineer at Fieldguide (United States) | BEAMSTART Jobs Senior Site Reliability Engineer Fieldguide United States Date Posted: 31 Oct, 2022 Work Location: San Francisco, United States Salary Offered: Not Specified Job Type: Full Time Experience Required: 3+ years Remote Work: Yes Stock Options: No Vacancies: 1...


  • San Francisco, California, United States Outdefine Full time

    About the Job: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Outdefine. As a key member of our Infrastructure team, you will be responsible for ensuring the reliability and scalability of our blockchain-based systems. Key Responsibilities: Run internal Chainlink and Blockchain nodes to ensure seamless connectivity and data...


  • San Francisco, California, United States Centene Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Centene. As a key member of our technology organization, you will play a critical role in ensuring the reliability, performance, and security of our platform infrastructure. Key Responsibilities: Lead Projects and Initiatives: Help lead projects focused on...


  • San Francisco, California, United States Circle Full time

    About Circle: Circle is a leading financial technology company that is revolutionizing the way value is transferred globally. Our innovative infrastructure enables businesses, institutions, and developers to harness the power of blockchain technology and capitalize on the emerging internet of money. Job Summary: We are seeking a highly skilled Senior Site...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS provider and a prominent leader in the Salesforce DevSecOps platform tailored for regulated sectors such as finance, insurance, and healthcare. Our solutions empower developers to streamline their daily operations, enhancing productivity and accelerating release cycles while adhering to...


  • San Francisco, United States RevenueCat Full time

    About us: RevenueCat makes building, analyzing, and growing mobile subscriptions easy. We launched as part of Y Combinator's summer 2018 batch and today are handling more than $3B of in-app purchases annually across thousands of apps. We are a mission driven, remote-first company that is building the standard for mobile subscription infrastructure. Top apps...


  • San Francisco, United States Doppler Full time

    [Full Time] Senior Site Reliability Engineer at Doppler (United States) Senior Site Reliability Engineer Doppler United States Date Posted: 31 Oct, 2022 Work Location: San Francisco, United States Salary Offered: Not Specified Job Type: Full Time Experience Required: 6+ years Remote Work: Yes Stock Options: No Vacancies: 1 available ABOUT DOPPLER Doppler's...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS company recognized as the premier provider of Salesforce DevSecOps solutions tailored for regulated sectors such as finance, insurance, and healthcare. Our offerings empower developers to streamline their daily operations, enhancing productivity and accelerating release cycles while adhering...


  • San Francisco, United States PicnicHealth Full time

    [Full Time] Site Reliability Engineer at PicnicHealth (United States) Site Reliability Engineer PicnicHealth United States Date Posted: 10 Aug, 2023 Work Location: San Francisco, United States Salary Offered: $160 — $190 yearly Job Type: Full Time Experience Required: 6+ years Remote Work: Yes Stock Options: No Vacancies: 1 available Healthcare needs good...


  • San Francisco, United States AutoRABIT Holding, Inc. Full time

    About AutoRABIT: AutoRABIT is a hyper-growth SaaS software company and the leading provider of Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare. AutoRABIT solutions enable developers to automate their daily tasks to be more productive and increase the release velocity for their development team,...


  • San Francisco, California, United States Operant AI Full time

    Job Overview: Senior Site Reliability Engineer. As the inaugural SRE within our organization, we are looking for an individual to establish Operant's SRE strategy and operations aimed at ensuring the resilience and security of our platforms and services. If you are enthusiastic about the prospect of being an early engineer at a startup ready to revolutionize...


  • San Francisco, United States GRNET S.A. Full time

    About GRNET: GRNET - National Infrastructures for Research and Technology, is an entity of the Greek Government, operating under the Ministry of Digital Governance. It provides advanced network and cloud computing services to academic and research institutions, educational entities at all levels, as well as to public, broader public, and private sector...


  • San Francisco, United States Tampa Gardens Senior Living Full time

    Team Culture Great things happen when people can bring their authentic selves to work. We empower all of our employees to share their perspectives, passions, and experiences because collectively we make a better, stronger team. Our team members collaborate closely with peers and cross-functional stakeholders throughout the business, our clients on the...


  • San Jose, United States Zscaler Full time

    Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your...


  • San Francisco, California, United States Crusoe Full time

    About This Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Crusoe Energy Systems. As a Site Reliability Engineer, you will play a pivotal role in ensuring the reliability and performance of our infrastructure. Key Responsibilities: Collaborate with the SRE team to detect, analyze, and prevent issues to maintain high Service...

