Senior Site Reliability Engineer

7 days ago


Foster City, California, United States · Zoox · Full time

Zoox is seeking a site reliability engineer to ensure the uptime of services critical to the development of autonomous vehicles.

This role involves designing systems that are easy to maintain and fault-tolerant, as well as deploying, operating, and continually improving services.

The ideal candidate will have experience with configuration management and infrastructure-as-code tools such as Ansible, Terraform, or Salt, and proficiency with microservice architecture and Kubernetes tooling.

Additionally, the candidate should be able to extract and report useful performance or service metrics using ELK, Prometheus, and Grafana.

Linux experience is required, as well as familiarity with Python or C/C++.

A bachelor's degree in engineering, mathematics, or a related field, along with 2+ years of relevant experience, is required for this position.

Bonus qualifications include AWS architecture and operational experience across a range of technologies, experience deploying and managing Kafka/MSK as a service, experience establishing and supporting CI/CD best practices, and experience handling large data sets.

A master's degree in engineering, mathematics, or a related field is a plus.

Compensation for this position includes a salary range of $160,000 to $256,000, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights.

Benefits include paid time off, unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Zoox:

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market.

Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments.

We're looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.


