Production Systems Engineer

4 weeks ago


Menlo Park, California, United States META Full time
Production Systems Engineer, Fleet AI Systems

Meta is seeking a highly skilled Production Systems Engineer to join our Release to Production (RTP) team. As a key member of our team, you will be responsible for the Hardware Lifecycle of all Meta servers, including pre-production hands-on system and hardware debugging and stress testing, enabling production-ready system monitoring, automated provisioning, and automated remediation of issues.

Key Responsibilities:
  • Interface with external vendors and internal hardware, mechanical, power, thermal, manufacturing, and software engineers to understand system architecture and develop and execute test suites for various architectures.
  • Proactively create experiments and tooling to detect and diagnose hardware/firmware/software health issues.
  • Develop a test framework for large-scale test automation inside the fleet, both during product development and after mass production.
  • Implement remediations across the software and hardware stack according to plan, while keeping a thorough procedural record and data log.
  • Develop and publish updates on resolutions and communicate findings internally.
  • Troubleshoot, diagnose, and root cause system failures and isolate the components/failure scenarios while working with internal and external stakeholders.
  • Develop visibility through data visualization and implement systemic solutions to hardware health issues.
  • Drive necessary discussion with external and internal teams on test specification and methodologies to improve test quality continuously.
  • Contribute to Meta's 2030 Net Zero targets by evaluating sustainability and carbon footprint of new hardware design and infrastructure design. Partner with Net Zero teams to implement strategies across infrastructure for reuse, recycling, energy-aware computing, and quality practices for deployed and decommissioned hardware.
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, or a relevant technical field, or equivalent practical experience.
  • 4+ years of experience in hardware system support, with knowledge of server architecture and components.
  • Experience with Energy Aware Computing and/or Sustainable Infrastructure Design.
  • Experience with Linux and scripting. Experience in changing system configurations and measuring change impact.
  • Experience working in a matrix organization and engineering for different server system and data center products.
Preferred Qualifications:
  • 4+ years of experience in Production support at scale.
  • 4+ years of experience in full system technologies and the full system lifecycle.
  • Experience supporting AI/HPC systems and/or related components at scale. Experience with hyperscale post-production environments and solutions.

Meta is committed to providing reasonable support in our recruiting processes for candidates with disabilities, long-term conditions, mental health conditions, or sincerely held religious beliefs, or who are neurodivergent or require pregnancy-related support. If you need support, please reach out to .

Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity, or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.


