Site Reliability Engineer

2 months ago


Culver City, California, United States ICON Consultants, LP Full time
About the Role

We are seeking a highly skilled Site Reliability Engineer to join our team at ICON Consultants, LP. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud infrastructure.

Key Responsibilities
  • Data Monitoring and Alerting: Design and implement data monitoring and alerting systems to ensure timely detection and response to issues.
  • Process Documentation: Develop and maintain documentation of team processes and policies, including methods of engagement and Service Level Objectives (SLOs).
  • System Optimization: Analyze and design solutions to remove bottlenecks and improve edge service performance at the system level.
  • Monitoring and Alerting: Implement monitoring and alerting systems to improve issue detection and response.
  • On-Call Rotations: Participate in on-call rotations, resolving or escalating incoming events.
  • Environment Maintenance: Maintain and operate a Linux and Kubernetes environment.

Requirements
  • Unix/Linux Experience: 3+ years of experience working with Unix/Linux systems from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.
  • Python Scripting: 2+ years of experience coding Python scripts for platform operations.
  • Networking Experience: Experience with networking technologies such as TCP/IP, BGP, and DNS in a carrier-grade environment.
  • Cloud Experience: Experience developing and operating one or more systems such as OpenStack, Kubernetes, Nginx, IPVS, the ELK stack, or Hadoop.
  • Education: Bachelor's degree or higher in Computer Science or a related field, with at least 2 years of related work experience.

About ICON Consultants, LP

ICON Consultants, LP is a leading consulting firm that provides expert services to clients across various industries. We are committed to delivering high-quality solutions that meet the evolving needs of our clients.