Hardware Systems Engineer, RAS

4 weeks ago


Austin, Texas, United States META Full time

Job Summary:

META is seeking a Hardware Systems Engineer to join our Release to Production (RTP) team working on new NPI hardware.

Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services.

The RTP team is responsible for the end-to-end hardware lifecycle of all META servers, including prototyping experimental hardware, hands-on pre-production system and hardware debugging and stress testing, and enabling production-ready system monitoring, automated provisioning, and automated remediation of issues.

The RTP team also helps explore, develop, and productize high-performance software and hardware technologies for AI at datacenter scale.

Hardware Systems Engineers in RTP work closely with HW/SW co-design teams, hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams to enable new systems that will be deployed in our production data centers.

We also work across service and hardware architectures for new AI systems, build prototypes to demonstrate the value, enable go/no-go decisions and optimize these systems in production.

Key Responsibilities:

Interface with external vendors and internal hardware, mechanical, power, thermal, manufacturing, and software engineers to understand system architecture, and guide and develop hardware fault management for various server products.

Leverage a deep understanding of RAS (reliability, availability, serviceability) to improve error reporting and error handling mechanisms for better operational quality and cost efficiency.

Champion engineering and operational excellence, establishing metrics and processes for regular assessment and improvement.

Develop visibility through data visualization and implement systemic solutions to hardware health issues.

Proactively create experiments and tooling to detect and diagnose hardware/firmware/software health issues.

Troubleshoot, diagnose, and root-cause system failures, isolating the affected components and failure scenarios while working with internal and external stakeholders.

Drive discussions with external and internal teams on test specifications and methodologies to continuously improve test quality.

Requirements:

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.

5+ years of work experience in one or more domains such as: ASIC development (Silicon design or bringup or characterization), compute (ARM, x86), AI-ML hardware/software (GPUs, TPUs).

Knowledge of the architecture and components of one of the following products: server, PC, or laptop.

Development or debug experience in one or more of the following areas: hardware fault management, error reporting, or error handling on hardware products.

Preferred Qualifications:

7+ years of experience with a subset of the following AI systems areas: accelerators (GPU/ASIC), kernel development, performance optimization (e.g., NVIDIA, AMD, Intel, or other accelerators), computer architecture, HPC communication libraries (e.g., NCCL, MPI), performance enablement, tracing, profiling, and debugging.

Experience with architecture of disaggregated systems at scale.

Understanding of hardware development process and how to scope out test plans accordingly.

Experience troubleshooting system-level problems that span multiple components and cross hardware/firmware/software boundaries.

Compensation:

$132,000/year to $191,000/year + bonus + equity + benefits

Industry:

Internet

Equal Opportunity:

META is proud to be an Equal Employment Opportunity and Affirmative Action employer.

We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law.

META participates in the E-Verify program in certain locations, as required by law.

Please note that META may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

META is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process.

If you need any assistance or accommodations due to a disability, please let us know at accommodations-


