GPU / Server Failure Analysis Engineer

1 month ago


Santa Clara, United States Ledgent Technology Full time

Location: Santa Clara, CA
Rate: $43 - $45/hr
Contract to Hire - 6+ months, Onsite

Local Candidates Highly Desirable

Summary
The Failure Analysis Engineer follows procedures and instructions to initiate the analysis process when a product failure occurs. The engineer investigates root causes and documents, records, and communicates the analysis internally and to the client company. The role is responsible for failure analysis on customer-returned server/GPU boards.
Essential duties and responsibilities include the following. Other duties not listed may be assigned.

Data Analysis/Communication/Issue Resolution - Prevention - 90% of Job
* Perform system- and board-level testing and debugging down to the component level.
* Swap and remove components to isolate failures and support deeper failure analysis (FA).
* Perform visual mechanical inspection (VMI) of server/GPU board components (motherboards, GPU, GPU baseboard, CPU, DIMM, NIC, SSD, power supply, etc.) and/or electronic components.
* Complete component-level troubleshooting (capacitors, resistors, fuses, ICs, diodes, etc.) and failure analysis.
* Complete sample analysis for equipment and process qualifications.
* Conduct innovative use of new analytical tools, equipment, and methodologies.
* Develop records and perform failure analyses on systems and components to identify root causes.
* Develop formal failure analysis reports and present them internally and to the client.
* Communicate with the Asia team by email and on conference calls.
* Be proficient in a large number of applications used for database management and reporting.
* Follow procedures and diagrams in performing engineering change orders as required.
* Other duties as assigned.

Team Leadership - 10% of Job
* Responsible for maintaining a positive work environment while supporting the company's culture.
* Responsible for fostering a healthy and safe work environment, focusing on the well-being of all associates.
* Maintains and strengthens internal, external customer, and supplier relationships. Communicates and works well with all business partners.
* Ensures inventory management and merchandise allocation flows product through the warehouse in the most cost-efficient and productive manner.

Education and/or Experience
* Bachelor's degree in Electrical Engineering
* 4 years' experience in Failure Analysis
* Comprehensive server knowledge is essential. (BIOS / BMC / CPLD / FPGA, etc.)
* Ability to use electronic test equipment (oscilloscopes / multi-meters / thermal imaging camera, etc.)
* Must be able to speak English. The ability to speak Mandarin Chinese is a plus.

Essential Skills:
* Requires excellent written and oral communication skills.
* Knowledge of basic Linux environment and commands
* Familiarity with Microsoft Office software (advanced Excel)
* Familiarity with computer software and operating systems, and the ability to identify and perform software updates (BIOS, BMC, component firmware, etc.)
* Ability to read and interpret schematics/block diagrams with detailed understanding of server and subassembly functionality.
* Ability to read and interpret system board views and board layout details
* Effectively communicate concepts and solutions with various levels of the organization.
* Effectively liaise between Company and Client.
* Must be able to work cross-functionally with minimal supervision.
* Requires strong analytical and statistical skills.
* Requires flexibility to work overtime for special projects or business support.

Competencies:
* Shows determination to achieve excellent results
* Finds better ways
* Demands top performance
* Inspires commitment

Working Conditions
Environment is climate controlled. Must be able to reach above shoulder height; bend or stoop below the waist; perform repetitive wrist, hand, or finger movements; and occasionally lift up to 45 pounds. Must be able to work in an environment with loud noise.

Desired Skills and Experience

Failure Analysis, Server, GPU Board Components, Troubleshooting, Assembly and disassembly of units / computer components / subcomponents

All qualified applicants will receive consideration for employment without regard to race, color, national origin, age, ancestry, religion, sex, sexual orientation, gender identity, gender expression, marital status, disability, medical condition, genetic information, pregnancy, or military or veteran status. We consider all qualified applicants, including those with criminal histories, in a manner consistent with state and local laws, including the California Fair Chance Act, City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, and Los Angeles County Fair Chance Ordinance. For unincorporated Los Angeles County, to the extent our customers require a background check for certain positions, the Company faces a significant risk to its business operations and business reputation unless a review of criminal history is conducted for those specific job positions.


