Site Reliability Engineer

3 weeks ago


Foster City, United States Zoox Full time
Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service, from designing systems that are fault-tolerant and easy to maintain through deployment, operation, and continual improvement. Zoox is a robotics company, and our ethos of automation extends throughout the infrastructure components we build. Be prepared to work with systems handling large volumes of data and data-processing pipelines performing compute-intensive tasks on CPUs and GPUs.
Qualifications
  • Experience in supporting production service infrastructure and utilizing configuration management tools like Ansible, Terraform, or Salt
  • Proficiency with microservice architecture and tooling around Kubernetes
  • Ability to extract and report useful performance or service metrics using ELK, Prometheus, and Grafana
  • Linux, no matter the flavor
  • Familiarity with Python or C/C++
  • Bachelor's degree in engineering, mathematics, or a related field and 2+ years of relevant experience
Bonus Qualifications
  • AWS architecture and operational experience with a range of technologies such as OS, RDS, ECS, and EKS
  • Deploying and managing Kafka / MSK as a service
  • Establishing and supporting CI / CD best practices
  • Experience handling large data sets
  • Master's degree in engineering, mathematics, or a related field
Compensation
There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. The salary range for this position is $160,000 to $256,000. A sign-on bonus may be offered as part of the compensation package. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.
Zoox also offers a comprehensive package of benefits including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.
Accommodations
If you need an accommodation to participate in the application or interview process, please reach out to accommodations@zoox.com or your assigned recruiter.
A Final Note: You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.

  • Foster City, United States Bayone Full time

    As a Site Reliability Engineer, you will: Keep a large production service up and running, including host OS upgrades, Docker image upgrades, and SSL certificate upgrades. Define and refine metrics to track service health and performance. Automate software releases and service failovers. Requirements: Bachelor's degree in Engineering, Mathematics or...


  • Foster City, United States Zoox Full time

    Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service from designing systems that are easy to maintain and fault-tolerant through...


  • Foster City, United States Zoox Full time

    Zoox is looking for an experienced leader to lead our Site Reliability Engineering team. Infrastructure is key in building, validating, and running our autonomous driving software, and the team you’ll be running supports it all. In this highly impactful role, you will closely work with partners in many teams including the driving AI teams, safety...


  • Foster City, United States Zoox Full time

    Foster City, CA • Full-time. Staff/Senior Staff Site Reliability Engineer. Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service from...


  • Gorilla Logic

    1 week ago


    Foster City, United States Sovereign Realty & Management Full time

    Gorilla Logic is looking for a Mid Site Reliability Engineer (SRE) responsible for automation, instrumentation, and stability of our client's platforms to achieve health and performance with a focus on cloud services and API management, including AWS, Azure, and GCP. This role involves ensuring the optimal performance and security of API management platforms...


  • Jersey City, United States Iris Software Inc. Full time

    IRIS's direct end client, a financial services company, is urgently looking to hire a Site Reliability Engineer - Jersey City, NJ (Hybrid). This is a contract opportunity. Site Reliability Engineer. Jersey City, NJ 07302 (Hybrid). Nature of Contract – Contract Role. To be effective in the position, an SRE must have strong AWS, Terraform and GitHub skills...


  • Reliability Engineer

    4 weeks ago


    Foster City, United States Zoox Full time

    At Zoox we have set the goal to provide our customers with the highest level of safety and a best-in-class experience while using our fully autonomous vehicles. You will work with a team of world-class engineers with diverse backgrounds such as robotics, control, and vehicle engineering, to deliver the vehicle performance using virtual tools and...


  • Jersey City, United States Syntricate Technologies Full time

    Job Title: Site Reliability Engineer (AWS) (SRE). Location: Jersey City, NJ (3 days WFO, 2 days WFH). Duration: 6+ months. Position Responsibilities: Site Reliability Engineer (AWS) (SRE). Work Location: Jersey City, New Jersey. Only nearby candidates will be considered (3 days WFO, 2 days WFH). 1 Zoom / tech interview and 1 onsite interview with...


  • Oklahoma City, United States BJ's Wholesale Club Full time

    Lead Site Reliability Engineer. Location: BJ's Club Support Center, Marlborough, MA #5997. Time type: Full time. Posted 2 days ago. Job requisition ID: R147855. Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight...


  • Jersey City, United States Trigyn Technologies Full time

    Immediate long-term contract-to-hire opportunity for Sr. Site Reliability Support Engineer with direct client in Jersey City. Trigyn's financial services client has an immediate need for a Site Reliability Engineer in Jersey City. This is a long-term contract assignment that could...


  • Arizona City, United States Diverse Lynx Full time

    Role: Site Reliability Engineer. Location: Remote. Job Description: 4-6 years of experience required. Production support experience required. Cloud Platforms: Proficiency in working with cloud platforms such as GCP, Azure, or AWS. Experience with monitoring tools such as Splunk. Experience with Dynatrace. Understanding of Grafana and New Relic monitoring tools, Alert...