Site Reliability Engineer

3 weeks ago


San Mateo, United States Zoox Full time

Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service, from designing fault-tolerant, easy-to-maintain systems through deployment, operation, and continual improvement. Zoox is a robotics company, and our ethos of automation extends throughout the infrastructure components we build. Be prepared to work with systems handling large volumes of data and data-processing pipelines performing compute-intensive tasks on CPUs and GPUs.

Qualifications

    • Experience in supporting production service infrastructure and utilizing configuration management tools like Ansible, Terraform, or Salt
    • Proficiency with microservice architecture and tooling around Kubernetes
    • Ability to extract and report useful performance or service metrics using ELK, Prometheus, and Grafana
    • Linux, no matter the flavor
    • Familiarity with Python or C/C++
    • Bachelor's degree in engineering, mathematics, or a related field and 2+ years of relevant experience
Bonus Qualifications
    • AWS architecture and operational experience with a range of services such as OS, RDS, ECS, and EKS
    • Deploying and managing Kafka / MSK as a service
    • Establishing and supporting CI / CD best practices
    • Experience handling large data sets
    • Master's degree in engineering, mathematics, or a related field


Compensation

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is $160,000 to $256,000. A sign-on bonus may be offered as part of the compensation package. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

Accommodations

If you need an accommodation to participate in the application or interview process, please reach out to accommodations@zoox.com or your assigned recruiter.

A Final Note:

You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

  • San Francisco, United States Vertisystem Full time

    Duration: 6 months contract Pay rate: $90/hr on W2 Job Summary: It is an exciting time to be part of the organization’s CICD and Cloud Site Reliability Engineering (SRE) team. SREs operate right at the intersection of Software Engineering and Infrastructure Engineering. The SRE team strives to make the organization highly reliable, scalable, operable and...


  • San Francisco, United States Apollo Solutions Full time

    Principal Site Reliability Engineer Apollo Solutions have partnered with a groundbreaking Fintech start-up backed by top tier venture capital. They are looking to significantly disrupt how we view, store and invest our personal finance and have already made significant waves in the industry. The Principal Site Reliability Engineer will be working closely...


  • San Diego, United States TalentBurst Full time

    SENIOR SITE RELIABILITY ENGINEER Location: San Diego, CA 92127 - 100% onsite (San Diego site preferred, open to other sites located in San Francisco 94107, San Mateo 94404, Los Angeles 90045 or Aliso Viejo 92656) Duration: 6 months **W2 Acceptable It is an exciting time to be part of Continuous Integration/Continuous Deployment (CI/CD) and Cloud Site...


  • San Diego, United States ACL Digital Full time

    W2 Contract/ Local candidates only Job Title: Site Reliability Engineer Location: San Diego, CA (Open to other locations in California) Job Description: It is an exciting time to be part of SIE’s CICD and Cloud Site Reliability Engineering (SRE) team. SREs operate right at the intersection of Software Engineering and Infrastructure Engineering. The SRE...


  • San Diego, United States ACL Digital Full time

    W2 Contract/ Local candidates only Job Title: Site Reliability Engineer Location: San Diego, CA (Open to other locations in California) Job Description: It is an exciting time to be part of SIE’s CICD and Cloud Site Reliability Engineering (SRE) team. SREs operate right at the intersection of Software Engineering and Infrastructure Engineering. The SRE team...


  • San Diego, United States ACL Digital Full time

    W2 Contract/ Local candidates only Job Title: Site Reliability Engineer Location: San Diego, CA (Open to other locations in California) Is this the role you are looking for? If so, read on for more details, and make sure to apply today. Job Description: It is an exciting time to be part of SIE’s CICD and Cloud Site Reliability Engineering (SRE) team. SREs...


  • San Jose, United States Myriad Consulting Inc Full time

    This role is also open to junior (3+ yoe) candidates and SRE leads (7+ yoe). The Site Reliability Engineering (SRE) team combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. In our team, you'll have the opportunity to manage the complex challenges of scale, while using expertise in coding,...


  • San Mateo, United States Geico - Government Employees Insurance Company Full time

    Have strong technical expertise and leadership: you are able to lead from the trenches and have proven knowledge in your field. Be able to drive infrastructure as code, show proficiency in field-appropriate programming languages, and lead by example.


  • San Diego, United States Talent Software Services Full time

    Site Reliability Engineer - Senior (NE) Job Summary: Talent Software Services is in search of a Site Reliability Engineer - Senior (NE) for a contract position in San Diego, CA. The opportunity will be one year with a strong chance for a long-term extension. Po...


  • San Francisco, United States Resource Informatics Group Full time

    Job Title: Site Reliability Engineer Work Location: San Francisco, CA (Hybrid after showing successful engagement) Duration: 18+ months Most important skills: 10 years of Oracle database administration experience in large production environments; hands-on database skills, especially around database and system troubleshooting and administration; GoldenGate setup,...


  • San Francisco, United States Cypress HCM Full time

    Job Description: Site Reliability Engineer (Grafana) Responsibilities: Collaborate with Service Owners and Observability Leaders to develop a strategy for monitoring the technology stack using Grafana. Initiate data ingestion by deploying Telegraf and exporters (if necessary), utilizing discovery to feed data into Grafana Mimir. Establish initial...


  • San Diego, United States PEAK Technical Staffing USA Full time

    Hiring Senior Site Reliability Engineer; primary responsibilities will include contributing to the implementation and delivery of the end-to-end automation platform, to support continuous integration and continuous delivery (CI/CD), with a focus on developer self-service capabilities. NOTE: Must have build out experience with Kubernetes. This position...


  • San Diego, United States Talent Software Services Full time

    Site Reliability Engineer - Senior (NE) Job Summary: Talent Software Services is in search of a Site Reliability Engineer - Senior (NE) for a contract position in San Diego, CA. The opportunity will be one year with a strong chance for a long-term extension. Position Summary: As a member of the CICD and Cloud Reliability team you'll work at the heart of...


  • San Francisco, California, United States Observable Full time

    Observable is seeking a full-time infrastructure and site reliability engineer to help improve, administer, and grow Observable systems as we scale to meet our customers' needs. What you will do: Perform site reliability and ops work for Observable production and staging environments (manage servers, tweak WAF rules, optimize SQL queries, and more). Design and...


  • San Francisco, United States hims & hers Full time

    About the Role: We are seeking a Site Reliability Engineer to help build a reliable web experience for our users. We believe that moving fast is our competitive advantage, and enables us to better serve our users. We also know that the faster we move, the more likely we are to break things. You Will: Design and implement SRE practices ensuring availability,...

