Reliability Engineer

6 days ago


Hawthorne, California, United States SpaceX Full time
About the Role

At SpaceX, we're pushing the boundaries of what's possible in space exploration and development. As a Site Reliability Engineer - Data, you'll play a critical role in ensuring the reliability and scalability of our data systems, enabling us to accelerate launch vehicle production and flight, as well as support the growth of our Starlink network.

Key Responsibilities
  • Design and implement sharded and geo-redundant distributed systems in multiple data centers
  • Advance existing deployment, monitoring, and alerting infrastructure to support a multi-region environment
  • Manage petabyte-scale bare metal compute clusters
  • Collaborate with engineers across programs to create highly operable, scalable, and maintainable products
  • Engage throughout the software development lifecycle of services, from inception to design, deployment, operation, and iterative refinement
  • Identify performance bottlenecks and apply performance improvement techniques
Requirements
  • Bachelor's degree in computer science, engineering, math, or a scientific discipline and 5 years of software development experience; or, in lieu of a degree, 7+ years of professional experience building software in a site reliability or DevOps role
  • Experience with Linux operating systems
Preferred Skills and Experience
  • 5+ years of rigorous experience with site reliability or DevOps
  • Experience with Kubernetes and Istio for on-premises deployment
  • Experience with in-stream data processing and analytics using open source platforms such as Apache Kafka, Spark, HBase, HDFS, Flink
  • Experience troubleshooting hardware and network-layer issues
  • Programming experience in Python, C#, Java, Scala, or similar languages
  • Good understanding of version control, testing, continuous integration, build, deployment, and monitoring
Additional Requirements
  • Willingness to work extended hours and weekends when needed


  • Hawthorne, California, United States SpaceX Full time

    About SpaceX: SpaceX was established with the vision that a future where humanity explores the cosmos is far more thrilling than one where we remain confined to Earth. Currently, SpaceX is advancing the technologies necessary to make this vision a reality, with the ultimate aim of facilitating human life on Mars. Position Overview: The Build Reliability team at...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality. Our ultimate goal is to enable human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, multi-disciplinary group focused on ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality, with the ultimate aim of enabling human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, cross-functional group focused on ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos. Our mission is to develop the technologies that will make this a reality, ultimately aiming to enable human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, cross-functional group dedicated to ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars. APPLICATION RELIABILITY ENGINEER: The application software division serves as the backbone of SpaceX, crafting essential applications that facilitate the...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, you'll play a critical role in ensuring the reliability and scalability of our systems. Key Responsibilities: Design, develop, and test automation tools to deploy and manage applications on-premises and in the cloud. Deploy and manage core...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are committed to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars. APPLICATION RELIABILITY ENGINEER: The application software division serves as the backbone of SpaceX, crafting essential applications that enhance launch...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are committed to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars. APPLICATION RELIABILITY ENGINEER: The application software division serves as the backbone of SpaceX, crafting essential applications that facilitate the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos, and we are actively working to turn that vision into reality. As a Production Reliability Engineer, you will be an integral part of the Build Reliability team, dedicated to ensuring the dependability of flight components and production systems. Your role will provide you...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, you'll play a critical role in designing, developing, and testing key aspects of our in-house solution for analysis, simulation, and prototyping of software in support of all SpaceX flight systems. Key Responsibilities: Automation and...


  • Hawthorne, California, United States SpaceX Full time

    Job Summary: We are seeking a highly skilled Starship Flight Reliability Engineer to join our team at SpaceX. As a key member of our Starship program, you will be responsible for ensuring the safety and reliability of our spacecraft systems. Key Responsibilities: Perform quantitative and qualitative reliability analyses for complex systems. Review company-wide...


  • Hawthorne, California, United States SpaceX Full time

    Job Title: Hardware Reliability Engineer. Job Summary: SpaceX is seeking a highly skilled Hardware Reliability Engineer to join our Starshield team. As a key member of our team, you will be responsible for ensuring the reliability and quality of our satellite systems. Key Responsibilities: Support on-call shifts and respond to satellite and gateway...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: SpaceX is a pioneering space exploration company that aims to make humanity a multi-planetary species. As a Site Reliability Engineer, Data, you will play a crucial role in ensuring the reliability and scalability of our mission-critical applications. Key Responsibilities: Design and implement sharded and geo-redundant distributed systems to...


  • Hawthorne, California, United States SpaceX Full time

    SENIOR AEROSPACE RELIABILITY ENGINEER (DRAGON): SpaceX is on the lookout for a Senior Aerospace Reliability Engineer to become a vital part of our forward-thinking team. In this position, you will play a crucial role in influencing the Dragon spacecraft that facilitate missions to the International Space Station. Your duties will encompass evaluating...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, Data, you'll play a critical role in ensuring the reliability and scalability of our mission-critical applications. Key Responsibilities: Design and implement sharded and geo-redundant distributed systems in multiple data centers. Advance...


  • Hawthorne, California, United States SpaceX Full time

    Job Summary: SpaceX is seeking a highly skilled and experienced Senior Flight Reliability Engineer to join our team. As a key member of our Dragon Vehicle Reliability team, you will play a critical role in ensuring the safe and successful operation of our spacecraft. Key Responsibilities: Assess pre-flight risks and develop mitigation strategies in collaboration...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos, and we are dedicated to making that vision a reality. As a Senior Vehicle Reliability Engineer, you will play a crucial role in ensuring the safety and performance of our Dragon spacecraft, which are vital for transporting science, supplies, and crew to the International...


  • Hawthorne, California, United States SpaceX Full time

    SpaceX was established with the vision that a future where humanity explores the cosmos is significantly more thrilling than one where we remain confined to Earth. Currently, SpaceX is at the forefront of developing technologies that will make this vision a reality, with the ultimate aim of facilitating human life on Mars. SENIOR FLIGHT RELIABILITY ENGINEER...