Sr. Site Reliability Engineer

1 month ago


San Francisco, CA, United States | hims & hers | Full time

About the Role:

We are seeking a Site Reliability Engineer to help build a reliable web experience for our users. We believe that moving fast is our competitive advantage and that it enables us to better serve our users. We also know that the faster we move, the more likely we are to break things.

You Will:
  • Design and implement SRE practices ensuring availability, scalability and observability of production systems with a strong focus on excellent customer experience
  • Actively identify opportunities to improve the availability and performance of the system by applying what is learned from monitoring and observation
  • Use automation extensively to design, configure, manage, and monitor systems in support of our product development teams
  • Manage Infrastructure through automation (Infrastructure as Code)
  • Manage incidents and emergency response, track outages, ensure data integrity and engineer releases to promote safe, efficient and rapid deployments
  • Handle emergency response, either while on call or by reacting to symptoms surfaced by monitoring, and escalate when needed
  • Improve the codebase by resolving logic issues, deprecating unused code, etc.
  • Implement monitoring, logging, alerting and SLO reporting
  • Identify Service Level Indicators (SLIs) that align the team on meeting availability and performance objectives
  • Run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent recurrence
You Have:
  • 8+ years as a software engineer, shipping production code
  • 5+ years of experience as a Site Reliability Engineer
  • Experience with service-oriented architectures and microservices at scale
  • Strong proficiency with relational databases (PostgreSQL, MySQL, SQL Server, etc.)
  • Strong proficiency in SQL scripting
  • Proficiency developing in one or more languages such as Java, Kotlin, Python, and/or others
  • Ability to use containers and orchestration frameworks (Kubernetes, Docker, container registries, etc.)
  • Proficiency in Git or other VCS
  • Experience configuring, customizing, and extending monitoring tools (Datadog, Prometheus, New Relic, etc.)
  • Excellent debugging and troubleshooting skills
  • Strong technical competency, with a data-driven analytical approach towards solving complex challenges
  • A systematic problem-solving approach, coupled with strong and effective communication skills and a sense of drive
  • Nice-to-have: Experience with Terraform or other IaC tools such as Chef, Puppet or Ansible
Our Benefits (there are more but here are some highlights):
  • Competitive salary & equity compensation for full-time roles
  • Unlimited PTO, company holidays, and quarterly mental health days
  • Comprehensive health benefits including medical, dental & vision, and parental leave
  • Employee Stock Purchase Program (ESPP)
  • Employee discounts on hims & hers and Apostrophe online products
  • 401(k) benefits with employer matching contribution
  • Offsite team retreats



