Application Reliability Engineer

1 week ago


Hawthorne, California, United States SpaceX Full time

At SpaceX, we believe in a future where humanity explores the cosmos, and we are committed to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars.

APPLICATION RELIABILITY ENGINEER

The application software division serves as the backbone of SpaceX, crafting essential applications that facilitate the production and flight of launch vehicles, as well as supporting the expansion of Starlink into a global, high-speed Internet service.

We are seeking a seasoned Application Reliability Engineer to manage and enhance our bespoke, mission-critical software solutions for engineering, testing, and launch operations. These solutions are integral to the software that powers rockets, spacecraft, satellites, and more. Each Falcon 9 launch, Dragon capsule docking with the ISS, or Starlink satellite connection is supported by the software you will help build and maintain.

SpaceX is dedicated to ensuring that our vehicle software is developed swiftly and accurately, rigorously tested, and continuously improved. This approach allows us to innovate technologies that were once considered science fiction. You will play a crucial role in ensuring that software delivery at SpaceX keeps pace with our engineering initiatives, driving us toward our goal of making humanity multi-planetary.

While aerospace experience is not a prerequisite for success in this role, we value intelligent, motivated, and collaborative engineers who are passionate about problem-solving and making a significant impact on an inspiring mission. We seek engineers who treat their colleagues with fairness, respect, and support. You will take full ownership of complex challenges, collaborating with a team of enthusiastic engineers to design and implement solutions that propel SpaceX toward its objectives at an accelerated pace. The success of our missions relies heavily on the software produced by you and your team.

KEY RESPONSIBILITIES:

  • Deploy, upgrade, operate, maintain, and scale our suite of mission-critical products and services.
  • Manage our underlying infrastructure as code and utilize modern observability tools to provide a comprehensive view of application health.
  • Collaborate closely with software engineers to develop highly operable and maintainable products.
  • Engage in and enhance the entire software development lifecycle of services—from inception and design to deployment, operation, and refinement.
  • Implement sustainable incident response practices and conduct blameless postmortems.
  • Provide end-user support to vehicle software engineers for our products.
  • Participate in the team's on-call rotation as needed.
  • Identify performance bottlenecks and implement performance improvement strategies.

BASIC QUALIFICATIONS:

  • Bachelor's degree in computer science, information systems, or a related engineering discipline; OR 2+ years of professional experience in site reliability or DevOps in lieu of a degree.
  • Proficiency with Linux operating systems.

PREFERRED SKILLS AND EXPERIENCE:

  • 5+ years of experience in DevOps, site reliability engineering, or system administration.
  • 3+ years of experience with Python and Python-based development frameworks.
  • Familiarity with source code and version control tools such as Git or Subversion.
  • Experience with infrastructure as code (IaC) tools for managing server fleets automatically.
  • Knowledge of build systems (Make, Bazel/Pants/Buck, Gradle, etc.) and package management tools (pip, npm, etc.).
  • Experience with container and virtualization technologies (VirtualBox, KVM, Docker, Kubernetes, vSphere, EC2, GCE).
  • Familiarity with automation frameworks such as Terraform, Ansible, or Puppet.
  • Understanding of TCP/IP networking.
  • Experience with databases and data modeling.
  • Familiarity with workflow and issue management tools like JIRA.
  • Ability to work with mission-critical and sensitive systems, demonstrating a sense of urgency appropriate to the responsibilities.
  • Strong communication skills to interact with customers, peers, and management in both formal and informal settings.

ADDITIONAL REQUIREMENTS:

  • Willingness to work extended hours and weekends as required.

COMPENSATION AND BENEFITS:

Pay Range:
Application Reliability Engineer/Level I: $120,000 - $145,000 per year
Application Reliability Engineer/Level II: $140,000 - $170,000 per year

Your actual level and base salary will be determined on a case-by-case basis and may vary based on job-related knowledge, skills, education, and experience. Base salary is just one component of your total rewards package at SpaceX, which may also include long-term incentives, potential discretionary bonuses, and comprehensive benefits.

ITAR REQUIREMENTS:

  • To comply with U.S. Government export regulations, applicants must be U.S. citizens, lawful permanent residents, refugees, or asylees, or be eligible to obtain the required authorizations from the U.S. Department of State.

SpaceX is an Equal Opportunity Employer; employment is based on merit, competence, and qualifications, without regard to race, color, religion, gender, national origin, veteran status, disability status, age, sexual orientation, gender identity, marital status, or any other legally protected status.



  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are committed to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars. APPLICATION RELIABILITY ENGINEER: The application software division serves as the backbone of SpaceX, crafting essential applications that enhance launch...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality. Our mission is to enable human life on Mars. APPLICATION RELIABILITY ENGINEER: The application software division serves as the backbone of SpaceX, crafting essential applications that facilitate the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality. Our ultimate goal is to enable human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, multi-disciplinary group focused on ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we believe in a future where humanity explores the cosmos, and we are dedicated to developing the technologies that will make this vision a reality, with the ultimate aim of enabling human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, cross-functional group focused on ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, you'll play a critical role in ensuring the reliability and scalability of our systems. Key Responsibilities: Design, develop, and test automation tools to deploy and manage applications on-premises and in the cloud. Deploy and manage core...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos. Our mission is to develop the technologies that will make this a reality, ultimately aiming to enable human life on Mars. ELECTRICAL RELIABILITY ENGINEER (FALCON): The Build Reliability team at SpaceX is a diverse, cross-functional group dedicated to ensuring the...


  • Hawthorne, California, United States SpaceX Full time

    About SpaceX: SpaceX was established with the vision that a future where humanity explores the cosmos is far more thrilling than one where we remain confined to Earth. Currently, SpaceX is advancing the technologies necessary to make this vision a reality, with the ultimate aim of facilitating human life on Mars. Position Overview: The Build Reliability team at...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, you'll play a critical role in designing, developing, and testing key aspects of our in-house solution for analysis, simulation, and prototyping of software in support of all SpaceX flight systems. Key Responsibilities: Automation and...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos, and we are actively working to turn that vision into reality. As a Production Reliability Engineer, you will be an integral part of the Build Reliability team, dedicated to ensuring the dependability of flight components and production systems. Your role will provide you...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: SpaceX is a pioneering space exploration company that aims to make humanity a multi-planetary species. As a Site Reliability Engineer, Data, you will play a crucial role in ensuring the reliability and scalability of our mission-critical applications. Key Responsibilities: Design and implement sharded and geo-redundant distributed systems to...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of space exploration and development. As a Site Reliability Engineer, Data, you'll play a critical role in ensuring the reliability and scalability of our mission-critical applications. Key Responsibilities: Design and implement sharded and geo-redundant distributed systems in multiple data centers. Advance...


  • Hawthorne, California, United States SpaceX Full time

    SENIOR AEROSPACE RELIABILITY ENGINEER (DRAGON): SpaceX is on the lookout for a Senior Aerospace Reliability Engineer to become a vital part of our forward-thinking team. In this position, you will play a crucial role in influencing the Dragon spacecraft that facilitate missions to the International Space Station. Your duties will encompass evaluating...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Data team at SpaceX. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our data systems. Key Responsibilities: Design and implement scalable and reliable data systems to support our mission-critical...


  • Hawthorne, California, United States SpaceX Full time

    Job Summary: We are seeking a highly skilled Starship Flight Reliability Engineer to join our team at SpaceX. As a key member of our Starship program, you will be responsible for ensuring the safety and reliability of our spacecraft systems. Key Responsibilities: Perform quantitative and qualitative reliability analyses for complex systems. Review company-wide...


  • Hawthorne, California, United States SpaceX Full time

    Job Summary: SpaceX is seeking a highly skilled and experienced Senior Flight Reliability Engineer to join our team. As a key member of our Dragon Vehicle Reliability team, you will play a critical role in ensuring the safe and successful operation of our spacecraft. Key Responsibilities: Assess pre-flight risks and develop mitigation strategies in collaboration...


  • Hawthorne, California, United States SpaceX Full time

    At SpaceX, we are driven by the vision of a future where humanity explores the cosmos, and we are dedicated to making that vision a reality. As a Senior Vehicle Reliability Engineer, you will play a crucial role in ensuring the safety and performance of our Dragon spacecraft, which are vital for transporting science, supplies, and crew to the International...


  • Hawthorne, California, United States SpaceX Full time

    Job Description. Job Title: Hardware Reliability Engineer. Job Summary: SpaceX is seeking a highly skilled Hardware Reliability Engineer to join our Starshield team. As a key member of our team, you will be responsible for ensuring the reliability and quality of our satellite systems. Key Responsibilities: Support on-call shifts and respond to satellite and gateway...


  • Hawthorne, California, United States SpaceX Full time

    About the Role: At SpaceX, we're pushing the boundaries of what's possible in space exploration and development. As a Site Reliability Engineer - Data, you'll play a critical role in ensuring the reliability and scalability of our data systems, enabling us to accelerate launch vehicle production and flight, as well as support the growth of our Starlink...