Reliability Engineer
2 months ago
About Etched
Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep chain-of-thought reasoning.
Reliability Engineer
We are seeking a skilled and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer at Etched, you will play a critical role in ensuring that all components and systems meet our rigorous reliability standards, essential for our datacenter applications. This position requires a deep understanding of reliability engineering principles, as well as experience working with suppliers, ODMs, and JDMs.
Representative Projects:
- Lead the development, implementation, and management of reliability standards for all suppliers working with Etched. Ensure that all components and systems meet or exceed the required reliability benchmarks.
- Review and verify reliability reports from suppliers, ensuring accuracy and adherence to Etched's standards. Provide guidance and feedback to suppliers to ensure continuous improvement in reliability performance.
- Collaborate with cross-functional teams to review and recommend component selection criteria based on reliability performance. Ensure that all selected components are capable of meeting the long-term reliability requirements of our datacenter applications.
- Evaluate and approve reliability test plans proposed by external vendors. Ensure that the test methodologies and conditions are sufficient to validate long-term reliability under expected operating conditions.
- Conduct in-depth analysis of reliability data provided by suppliers and vendors. Identify trends, potential issues, and areas for improvement to enhance overall reliability.
- Work closely with ODMs (Original Design Manufacturers) and JDMs (Joint Design Manufacturers) to ensure that all products meet Etched's quality and reliability standards. Provide technical guidance and support to maintain maximum operational uptime and long-term reliability.
- Review and establish reliability metrics and standards for silicon components, ensuring they meet the stringent requirements for long-term reliability in datacenter environments.
You may be a good fit if you have:
- Bachelor's or Master's degree in Reliability Engineering, Electrical Engineering, or a related field.
- 5+ years of experience in reliability engineering, with a focus on datacenter applications preferred.
- Strong understanding of reliability standards, testing methodologies, and data analysis techniques. Experience with DFMEA, PFMEA, and SPC engineering analysis is desired.
- Experience working with suppliers, ODMs, and JDMs in a high-tech environment.
- Excellent communication skills, with the ability to convey complex technical concepts to diverse stakeholders.
- Proven ability to manage multiple projects and deliver results in a fast-paced environment.
We encourage you to apply even if you do not believe you meet every single qualification.
How we're different:
Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.
We are a fully in-person team in Cupertino, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.
Benefits:
- Full medical, dental, and vision packages, with 100% of premiums covered (90% for dependents)
- Housing subsidy of $2,000/month for those living within walking distance of the office
- Daily lunch and dinner in our office
- Relocation support for those moving to Cupertino