Site Reliability Engineer

1 month ago


Palo Alto, California, United States X (formerly Twitter) Full time

About the Role

We're seeking a highly skilled Site Reliability Engineer to join our Command Center Team at X (formerly Twitter). As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our services, which are used by millions of users worldwide.

Key Responsibilities
  • Triage and troubleshoot complex issues at massive scale, ensuring high availability and reliability of our services.
  • Develop and implement strategies to continuously improve system performance, reliability, and resiliency.
  • Create and maintain software for load testing, failure detection, traffic management, and data analysis using Python, Go, Scala, JavaScript, and Superset.
  • Lead incident management efforts, providing clear and effective communication and collaborating with engineering teams to mitigate impact and ensure timely resolution.
  • Refine service-level objectives (SLOs) across the stack, ensuring we stay within our error budgets and meet or exceed user expectations.
  • Implement high-fidelity metrics to measure and improve the user experience across our services.
  • Use distributed tracing to understand and manage service dependencies, facilitating debugging and improving latencies.

Requirements
  • Bachelor's degree or above in Computer Science, Engineering, or related field.
  • 2+ years of experience in large-scale software development with a focus on site reliability engineering.
  • Deep understanding of computer science fundamentals, including data structures, algorithms, and concurrency principles.
  • Expertise with observability and monitoring, incident management, load testing, microservice architecture, and design patterns.
  • Proficiency in one or more object-oriented programming languages (e.g., Scala, Java, or C++). Additional knowledge of Python or Go is a significant asset.
  • Strong knowledge of Unix/Linux system administration at scale.

About X

X is a global digital public square, committed to protecting freedom of speech and building the future of unlimited interactivity. We're a lean, high-impact team operating in a reverse startup mode, with a focus on enhancing the reliability and performance of our diverse service areas.

We're looking for talented individuals who share our passion for building a better internet and are excited about the opportunity to join our team.