Package Reliability Engineer
5 days ago
Job Description
About Celestial AI
As Generative AI continues to advance, the performance drivers for data center infrastructure are shifting from systems-on-chip (SoCs) to systems of chips. In the era of Accelerated Computing, data center bottlenecks are no longer limited to compute performance; they now extend to the system's interconnect bandwidth, memory bandwidth, and memory capacity. Celestial AI's Photonic Fabric™ is the next-generation interconnect technology that delivers a tenfold increase in performance and energy efficiency compared to competing solutions.
The Photonic Fabric™ is available to our customers in multiple technology offerings, including optical interface chiplets, optical interposers, and Optical Multi-chip Interconnect Bridges (OMIB). This allows customers to easily incorporate high bandwidth, low power, and low latency optical interfaces into their AI accelerators and GPUs. The technology is fully compatible with both protocol and physical layers, including standard 2.5D packaging processes. This seamless integration enables XPUs to utilize optical interconnects for both compute-to-compute and compute-to-memory fabrics, achieving bandwidths in the tens of terabits per second with nanosecond latencies.
This innovation empowers hyperscalers to enhance the efficiency and cost-effectiveness of AI processing by optimizing the XPUs required for training and inference, while significantly reducing the TCO2 impact. To bolster customer collaborations, Celestial AI is developing a Photonic Fabric ecosystem consisting of tier-1 partnerships that include custom silicon/ASIC design, system integrators, HBM memory, assembly, and packaging suppliers.
ABOUT THE ROLE
We are seeking an experienced Package Reliability Engineer with expertise in 2.5D/3D advanced packaging. The ideal candidate will have a strong background in physics of failure, materials science, and experience working closely with OSATs to drive package reliability improvements. This role requires collaboration with external assembly and test partners, internal design, process, and failure analysis teams, and suppliers to ensure the reliability and manufacturability of cutting-edge semiconductor packages.
ESSENTIAL DUTIES AND RESPONSIBILITIES
- Reliability Analysis & Risk Assessment:
  - Conduct physics of failure (PoF)-based reliability modeling for 2.5D/3D advanced packaging.
  - Assess package reliability risks from thermal, mechanical, and electrical stressors.
  - Define and execute stress test plans (e.g., thermal cycling, humidity, electromigration) to validate package robustness.
- OSAT Management & Collaboration:
  - Work closely with OSAT partners to drive package reliability improvements, process optimizations, and yield enhancements.
  - Define reliability requirements, review test methodologies, and ensure OSAT compliance with JEDEC and industry standards.
  - Monitor and evaluate OSAT performance in executing reliability qualifications and failure analysis.
  - Support supplier audits and technical reviews to assess manufacturing capabilities and reliability processes.
- Material Characterization & Selection:
  - Evaluate and select materials (substrates, dielectrics, adhesives, underfills) for optimal reliability.
  - Analyze CTE mismatches, warpage, delamination, and interfacial adhesion issues.
  - Work with material suppliers and OSATs to qualify new materials for advanced packaging applications.
- Failure Analysis & Root Cause Identification:
  - Lead failure mode and effects analysis (FMEA) and model-based problem solving (MBPS), and determine root causes of package failures using techniques such as FIB, X-ray, SEM, and TEM.
  - Identify and mitigate interfacial failures, cracking, voiding, electromigration, and stress-induced damage.
  - Drive OSATs and internal teams to implement corrective and preventive actions (CAPA).
- Process & Design Collaboration:
  - Work cross-functionally with internal design, process, and manufacturing teams to define assembly test vehicles and optimize package architectures.
  - Develop and refine design guidelines, process improvements, and reliability best practices.
  - Stay up to date with industry standards (JEDEC, IPC, IEEE, etc.) and implement best practices in package reliability.
QUALIFICATIONS
- Education: Master's or Ph.D. in Materials Science, Mechanical Engineering, Electrical Engineering, Applied Physics, or a related field.
- Experience: 5-10 years of hands-on experience in 2.5D/3D advanced packaging reliability.
- Technical Expertise:
  - Deep understanding of physics of failure (PoF) methodologies for package reliability.
  - Strong knowledge of materials science, particularly in interconnects, substrates, and interfaces.
  - Proficiency in stress modeling tools (ANSYS, Abaqus, COMSOL, etc.) for thermo-mechanical analysis.
  - Experience with failure analysis techniques such as C-SAM, X-ray CT, SEM, TEM, FIB, and EBSD.
- OSAT Collaboration Experience:
  - Proven track record of working with and driving OSAT partners for package reliability, yield, and continuous quality improvements.
  - Experience managing OSAT qualifications, failure analysis, and corrective actions.
  - Familiarity with supplier engagement, reliability testing at OSATs, and package process flows.
- Industry Knowledge: Familiarity with JEDEC, IPC, IEEE, and MIL-STD reliability standards.
- Soft Skills: Strong analytical, problem-solving, and cross-functional collaboration skills.
PREFERRED QUALIFICATIONS
- Experience in heterogeneous integration, fan-out packaging, and chiplet architectures.
- Knowledge of electrical reliability mechanisms (e.g., electromigration, time-dependent dielectric breakdown).
- Expertise in AI-driven reliability modeling or machine learning for failure prediction.
LOCATION: Santa Clara, CA
For California Location:
As an early-stage startup, we offer an extremely attractive total compensation package that includes a competitive base salary, a bonus, and a generous grant of our valuable early-stage equity. The target base salary for this role is approximately $185,000.00 - $225,000.00. The base salary offered may be slightly higher or lower than the target base salary, based on the final scope as determined by the depth of experience and skills the candidate demonstrates in the interviews.
We offer great benefits (health, vision, dental, and life insurance) and a collaborative, continuous-learning work environment where you will get a chance to work with smart and dedicated people engaged in developing the next-generation architecture for high-performance computing.
Celestial AI Inc. is proud to be an equal opportunity workplace and is an affirmative action employer.
#LI-Onsite