Hardware Systems Engineer
23 hours ago
Meta is seeking a skilled Hardware Systems Engineer to join our Release to Production (RTP) team working on new NPI hardware.
Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services.
The RTP team is responsible for the end-to-end Hardware Lifecycle of all Meta servers, including prototyping of experimental HW, pre-production hands-on system and hardware debugging and stress testing, enabling production-ready system monitoring, automated provisioning and automated remediation of issues.
Hardware Systems Engineers in RTP work closely with HW/SW co-design teams, hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams to enable new systems that will be deployed in our production data centers.
We also work across service and hardware architectures for new AI systems, build prototypes to demonstrate their value, enable go/no-go decisions, and optimize these systems in production.
Responsibilities
- Interface with external vendors and internal hardware, mechanical, power, thermal, manufacturing and software engineers to understand system architecture to guide and develop Hardware Fault Management for various server products.
- Leverage a deep understanding of RAS (reliability, availability, serviceability) to improve error reporting and error handling mechanisms for better operational quality and cost efficiency.
- Champion engineering and operational excellence, establishing metrics and process for regular assessment and improvement.
- Develop visibility through data visualization and implement systemic solutions to hardware health issues.
- Proactively create experiments and tooling to detect and diagnose hardware/firmware/software health issues.
- Troubleshoot, diagnose, and root-cause system failures, isolating the failing components and failure scenarios while working with internal and external stakeholders.
- Drive necessary discussion with external and internal teams on test specification and methodologies to improve test quality continuously.
Qualifications
- Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
- 5+ years of work experience in one or more domains such as: ASIC development (silicon design, bring-up, or characterization), compute (ARM, x86), AI/ML hardware/software (GPUs, TPUs).
- Knowledge of the architecture and components of one of the following products: server, PC, or laptop.
- Development or debug experience in one or more of the following areas: hardware fault management, error reporting, error handling on hardware products.
- 7+ years of experience with a subset of the following AI systems areas: accelerators (GPU/ASIC), kernel development, performance optimization (e.g., on NVIDIA, AMD, Intel, or other accelerators), computer architecture, HPC communication libraries (e.g., NCCL, MPI), performance enablement, tracing, profiling and debugging.
- Experience with architecture of disaggregated systems at scale.
- Understanding of hardware development process and how to scope out test plans accordingly.
- Experience troubleshooting problems at the system level, spanning multiple components as well as hardware/firmware/software boundaries.
Meta is committed to providing reasonable support (called accommodations) in our recruiting processes for candidates with disabilities, long term conditions, mental health conditions or sincerely held religious beliefs, or who are neurodivergent or require pregnancy-related support.
If you need support, please reach out to Meta's Disability Accommodations team.
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable.
In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.