Senior System Reliability Engineer

7 days ago


Santa Clara, California, United States NVIDIA Full time
Reliability Engineer for NVIDIA's System Products

NVIDIA is a leader in the field of artificial intelligence and high-performance computing, and we're looking for a skilled Reliability Engineer to join our team. As a Reliability Engineer, you will be responsible for ensuring the reliability of our system products, including graphics cards, servers, and data center solutions.

Key Responsibilities:

  • Provide expertise in hardware reliability engineering for electronics and server systems
  • Establish and maintain product reliability standards and metrics
  • Participate in product and engineering design reviews
  • Interface with engineering groups, suppliers, and partners to ensure desired reliability
  • Define and implement reliability plans and specifications
  • Provide reliability predictions and test plans
  • Perform and lead testing with associated failure analysis

Requirements:

  • BS (or equivalent experience) in Engineering, Material Science, Physics, or a related field
  • 5+ years in a hardware validation/reliability environment
  • Strong understanding of statistical concepts and models
  • Good verbal and written communication skills
  • Self-motivated and independent

NVIDIA is an Equal Opportunity Employer

We are committed to fostering a diverse work environment and proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.



  • Santa Clara, California, United States Omni Vision Inc Full time

    Job Title: Senior Reliability Engineer. Omni Vision Inc is seeking a highly skilled Senior Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for ensuring the quality and reliability of our CMOS Image Sensor products. Key Responsibilities: Review reliability qualification testing results and determine whether...


  • Santa Clara, California, United States Geospatial And Cloud Analytics Inc Full time

    About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our team at Geospatial And Cloud Analytics Inc. As a key member of our engineering team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale cloud infrastructure. Key Responsibilities: Design and implement...


  • Santa Clara, California, United States Apollo Professional Solutions Full time

    Job Title: Senior Systems Engineer. We are seeking a highly skilled Senior Systems Engineer to join our team at Apollo Professional Solutions. As a key member of our engineering team, you will be responsible for ensuring the reliability and robustness of our next generation sequencing and sample prep platforms. Key Responsibilities: Develop and implement...


  • Santa Clara, California, United States Anello Photonics Full time

    About Anello Photonics: Anello Photonics is a pioneering technology company based in Santa Clara, California. We have developed cutting-edge integrated photonic system-on-chip technology for next-generation navigation. Our SIPHOG gyroscope is based on patented photonic integrated circuit technology, offering higher performance, smaller size and weight, and...


  • Santa Clara, California, United States Anello Photonics Full time

    About Anello Photonics: Anello Photonics is a leading-edge technology company based in Santa Clara, CA. We have developed integrated photonic system-on-chip technology for next-generation navigation. Our SIPHOG™ gyroscope is based on our patented photonic integrated circuit technology. This innovative technology enables a product that is higher performance,...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: NVIDIA is seeking a highly skilled Senior Cloud Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale Kubernetes clusters. Key Responsibilities: Design and implement operational and reliability aspects...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior Site Reliability Engineer. NVIDIA is a leader in AI, machine learning, and datacenter acceleration. Our company is expanding its leadership into datacenter networking with Ethernet switches, NICs, and DPUs. We have continuously reinvented ourselves over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market,...


  • Santa Clara, California, United States Apollo Professional Solutions Full time

    Job Title: Senior Systems Engineer. Apollo Professional Solutions is seeking a highly skilled Senior Systems Engineer to join our team. As a key member of our engineering team, you will be responsible for improving the reliability and robustness of our next generation sequencing and sample prep platforms. Key Responsibilities: Conduct failure analysis and root...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: NVIDIA is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale Observability & Telemetry collection platforms. Key Responsibilities: Design and implement operational and...


  • Santa Clara, California, United States Omnivision Technologies Full time

    Qualifications: Bachelor's degree in Physics, Electrical Engineering, Materials Science, or a related engineering field, with coursework focused on semiconductor physics and electronic systems. Familiarity with electronic component reliability standards such as JEDEC and AEC-Q100 is advantageous. Experience in wafer-level reliability testing is also...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities. Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity. We...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities. By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: Palo Alto Networks is seeking a highly skilled Senior Staff Site Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for designing, building, and operating reliable, secure cloud infrastructure. Key Responsibilities: Develop expertise in new technologies and contribute to the success of SRE and...



  • Santa Clara, California, United States Innova Solutions Full time

    Innova Solutions is actively seeking a Reliability Engineer. Position Type: Full Time. Location: Santa Clara, CA. As a Reliability Engineer, your responsibilities will include: Key Responsibilities: Engaging in Board Level Reliability laboratory activities, establishing functional test hardware and software for various NV products, including large server...


  • Santa Clara, California, United States Omnivision Technologies Full time

    Qualifications: Bachelor's degree in Physics, Electrical Engineering, Materials Science, or a related engineering field, with coursework focused on semiconductor physics and electronics. Familiarity with electronic component reliability standards such as JEDEC and AEC-Q100 is advantageous. Experience in wafer-level reliability testing is also beneficial. Key...


  • Santa Clara, California, United States Innova Solutions Full time

    Innova Solutions is actively seeking a Reliability Engineer. Position Type: Full Time. Location: Santa Clara, CA. As a Reliability Engineer, your responsibilities will include: Key Responsibilities: Engaging in Board Level Reliability laboratory operations, establishing functional testing hardware and software for various NV products, including extensive server...


  • Santa Clara, California, United States Innova Solutions Full time

    Innova Solutions is currently seeking a Reliability Engineer. Position Type: Full Time. Location: Santa Clara, CA. As a Reliability Engineer, your responsibilities will include: Key Responsibilities: Engaging in the Board Level Reliability laboratory setting, establishing functional test hardware and software for various NV products, including extensive server...

  • Reliability Engineer

    3 weeks ago


    Santa Clara, California, United States Comtech Full time

    Job Summary: Comtech Telecommunications Corp. is seeking a highly skilled Reliability Engineer to join our team in a critical role that requires collaboration with technical professionals and interaction with customers to provide solutions to technical problems of moderate scope and complexity. Key Responsibilities: Analyze and evaluate product reliability and...


  • Santa Clara, California, United States Omnivision Technologies Full time

    Qualifications: A Bachelor’s degree in Physics, Electrical Engineering, Materials Science, or a related engineering field, with coursework focused on semiconductor physics and electronics is required. Familiarity with electronic component reliability standards such as JEDEC and AEC-Q100 is advantageous. Experience in wafer-level reliability testing is also...