Reliability Engineer

2 weeks ago


Cupertino, United States Etched Full time
Job Description

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep chain-of-thought reasoning.

Reliability Engineer

We are seeking a skilled and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer at Etched, you will play a critical role in ensuring that all components and systems meet our rigorous reliability standards, essential for our datacenter applications. This position requires a deep understanding of reliability engineering principles, as well as experience working with suppliers, ODMs, and JDMs.

Representative Projects:

  • Lead the development, implementation, and management of reliability standards for all suppliers working with Etched. Ensure that all components and systems meet or exceed the required reliability benchmarks.
  • Review and verify reliability reports from suppliers, ensuring accuracy and adherence to Etched's standards. Provide guidance and feedback to suppliers to ensure continuous improvement in reliability performance.
  • Collaborate with cross-functional teams to review and recommend component selection criteria based on reliability performance. Ensure that all selected components are capable of meeting the long-term reliability requirements of our datacenter applications.
  • Evaluate and approve reliability test plans proposed by external vendors. Ensure that the test methodologies and conditions are sufficient to validate long-term reliability under expected operating conditions.
  • Conduct in-depth analysis of reliability data provided by suppliers and vendors. Identify trends, potential issues, and areas for improvement to enhance overall reliability.
  • Work closely with ODMs (Original Design Manufacturers) and JDMs (Joint Design Manufacturers) to ensure that all products meet Etched's quality and reliability standards. Provide technical guidance and support to maintain maximum operational uptime and long-term reliability.
  • Review and establish reliability metrics and standards for silicon components, ensuring they meet the stringent requirements for long-term reliability in datacenter environments.

You may be a good fit if you have:

  • Bachelor's or Master's degree in Reliability Engineering, Electrical Engineering, or a related field.
  • 5+ years of experience in reliability engineering, with a focus on datacenter applications preferred.
  • Strong understanding of reliability standards, testing methodologies, and data analysis techniques. DFMEA / PFMEA / SPC Engineering analysis experience desired.
  • Experience working with suppliers, ODMs, and JDMs in a high-tech environment.
  • Excellent communication skills, with the ability to convey complex technical concepts to diverse stakeholders.
  • Proven ability to manage multiple projects and deliver results in a fast-paced environment.

We encourage you to apply even if you do not believe you meet every single qualification.

How we're different:

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in Cupertino, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Benefits:

  • Full medical, dental, and vision packages, with 100% of premiums covered for employees and 90% for dependents
  • Housing subsidy of $2,000/month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to Cupertino



  • Cupertino, California, United States Apple Full time

    Job Summary: Apple is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for developing processes, tools, and automation for managing distributed systems in production environments. You will work closely with our software and systems engineering teams to build and run large-scale, massively...


  • Cupertino, United States Diamondpick Full time

    Kindly note: This would be a W2 opportunity. Primary Focus: This role involves close collaboration with the customer and tech leads and teams from development, infrastructure, cloud deployment, and DevOps. The SRE Tech Lead presents findings, works through challenges, and provides solutions for reliability improvements. Required Skills: Communication &...


  • Cupertino, California, United States Apple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Apple Services Engineering (ASE) team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based services. Key Responsibilities: Design, implement, and maintain scalable and reliable cloud...


  • Cupertino, California, United States Apple Full time

    About the Role: Data Platform Site Reliability Engineering at Apple is responsible for managing infrastructure and applications on bare-metal and cloud computing platforms to deliver data processing, governance, and storage for many of Apple's global products and organizations. We are looking for a passionate and talented Site Reliability Engineer to continue...


  • Cupertino, California, United States Apple Full time

    Job Summary: Apple is seeking a highly skilled Site Reliability Engineer to join our Silicon Technologies group. As a key member of our team, you will be responsible for designing and implementing software solutions to ensure the seamless operation of our high-performance processors. Key Responsibilities: Design and implement software solutions to ensure the...


  • Cupertino, California, United States Futran Tech Solutions Pvt. Ltd. Full time

    Job Title: Reliability Assurance Specialist. Job Summary: Futran Tech Solutions Pvt. Ltd. is seeking a skilled Reliability Assurance Specialist to ensure the durability and performance of our products through rigorous testing and analysis. Key Responsibilities: Design and execute mechanical stress tests and shock/drop/vibration tests to assess product...


  • Cupertino, California, United States Apple Full time

    About the Role: Craft sophisticated solutions to highly complex challenges as part of our Silicon Technologies group. You'll help design and manufacture our next-generation, high-performance, power-efficient processor, system-on-chip (SoC). As a key member of our team, you'll be responsible for crafting and building the technology that operates Apple's devices....


  • Cupertino, California, United States Apple Full time

    Job Summary: Apple is seeking a highly skilled Site Reliability Engineer to join our Edge & Messaging SRE team. As a key member of our team, you will be responsible for building and running the services that hundreds of millions of customers use every day. About the Role: This team provides systems that are foundational for many of Apple's services such as...

  • DevOps Engineer

    2 weeks ago


    Cupertino, United States System Soft Technologies Full time

    DevOps position: We are seeking a highly motivated, innovative, and exceptional DevOps engineer to join our worldwide sales team, Data Solutions & Initiatives (DSI). This is a unique opportunity to help the growth of one of our global initiatives and contribute to launching ground-breaking new features in support of our sales strategy. In this position, you...

  • DevOps Engineer

    3 weeks ago


    Cupertino, United States System Soft Technologies Full time

    DevOps position: We are seeking a highly motivated, innovative, and exceptional DevOps engineer to join our worldwide sales team, Data Solutions & Initiatives (DSI). This is a unique opportunity to help the growth of one of our global initiatives and contribute to launching ground-breaking new features in support of our sales strategy. In this position, you...

  • Quality Engineer III

    1 month ago


    Cupertino, United States ATR International Full time

    In this position, you will have responsibility for Client products prior to launch, with the goal of enabling strong quality during the development cycle. This position partners with engineering, product development, Reliability engineering operations, ClientCare, and the Watch component teams. Strong Quality engineering skills in technical, statistical...


  • Cupertino, California, United States Amazon Full time

    About the Role: We are seeking a highly skilled Senior System Development Engineer to join our Hardware Engineering team at Amazon. As a key member of our team, you will be responsible for designing and building software and enabling tools to support the development of our hardware products. Key Responsibilities: Lead the Hardware Engineering System Development...

  • DevOps Engineer

    1 month ago


    Cupertino, California, United States Apple Full time

    DevOps Engineer. Cupertino, California, United States. Software and Services. Apple is a place where extraordinary people gather to do their best work. Together we craft products and experiences people once couldn't have imagined, and now can't imagine living without. If you're excited by the idea of making a real impact, and joining a team where we pride...

  • STA Engineer

    4 weeks ago


    Cupertino, United States Acceler8 Talent Full time

    We are seeking an STA Engineer to join a startup focused on developing cutting-edge AI chips to compete with industry giants like Nvidia. With recent venture funding of $120 million, they are on a mission to revolutionize AI model training and deployment through their custom chip designed for transformer architecture. This role will be critical in ensuring...

  • STA Engineer

    1 month ago


    Cupertino, United States Acceler8 Talent Full time

    We are seeking an STA Engineer to join a startup focused on developing cutting-edge AI chips to compete with industry giants like Nvidia. With recent venture funding of $120 million, they are on a mission to revolutionize AI model training and deployment through their custom chip designed for transformer architecture. This role will be critical in ensuring...