HPC Cluster Engineer
6 days ago
We are seeking a highly skilled HPC Performance Engineer to join our team at 1000 KLA Corporation. As a key member of our HPC team, you will be responsible for designing, implementing, and supporting high-performance compute clusters.
Key Responsibilities
- Design and Implementation: Design and implement high-performance compute clusters, ensuring optimal performance and efficiency.
- System Knowledge: Possess in-depth knowledge of HPC systems, including CPU/GPU architecture, scalable/robust storage, high-bandwidth inter-connects, and cloud-based computing architectures.
- Performance Optimization: Identify and resolve processing inefficiencies, drive optimizations to improve cluster utilization, and evaluate the detailed timing of cluster operations.
- Linux Configuration: Use strong skills with the Linux OS to configure operating systems for the HPC system.
- Project Management: Understand and assemble project specifications and performance requirements, adhere to project timelines, and ensure program milestones are completed on time.
- Product Support: Support the design and release of new products to manufacturing and the customer, providing quality golden images, procedures, scripts, and documentation.
We are looking for individuals who are inquisitive, thrive on challenge, enjoy problem-solving, and have excellent written and verbal skills.
Required Qualifications
- Linux Knowledge: Validated in-depth and flavor-agnostic knowledge of Linux systems (SUSE, Red Hat, Rocky, Ubuntu).
- Parallel Programming: In-depth knowledge of parallel programming, vector-based processing, distributed computing, and code optimization on CPUs and GPUs.
- Vector Processing: Experience in vector processing and multi-threading related technologies and libraries (SIMD, AVX, IPP, MKL, OpenCV, OpenMP, OpenCL, MPI, TBB, CUDA).
- Performance Profiling: Knowledge of performance profilers (Intel VTune, NVIDIA Nsight Compute, AMD uProf, perf) and custom profiling and telemetry tools.
- HPC Job Schedulers: Knowledge of HPC job schedulers and how they function.
- Bottleneck Identification: Ability to find bottlenecks and drive them to closure, whether in data movement, code execution timing, or job scheduling optimization.
- HPC Hardware Knowledge: Strong HPC HW knowledge, especially in server, GPU, networking, storage, BIOS, and BMC arenas.
- Scripting Skills: Ability to write shell and Python scripts for building test environments.
- Kubernetes Experience: Experience with Kubernetes, Harbor, Prometheus, and Grafana.
- Education and Experience: BS or MS degree plus 3 to 5 years of validated experience in fields related to Computer Engineering or Electrical Engineering.
- Team Orientation: Highly motivated teammate with ability to develop and maintain collaborative relationships.
- Organization and Time Management: Able to plan, schedule, organize, and follow up on tasks to achieve goals within or ahead of established time frames.
- Multi-tasking: Ability to expeditiously organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously.
- Adaptability to Change: Able to be flexible and supportive, and able to assimilate change positively and proactively in a rapid growth environment.
- Excellent Communication Skills: Outstanding teammate with excellent written and verbal communication skills.
Typically requires a Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years.
We offer a competitive and comprehensive total rewards package, including medical, dental, vision, life, and other voluntary benefits, 401(k) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave.
KLA is proud to be an Equal Opportunity Employer. We do not discriminate on the basis of race, religion, color, national origin, sex, gender identity, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other status protected by applicable law. We will ensure that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.
-
HPC Student Intern
2 days ago
Milpitas, California, United States KLA Full time
Job Summary
We are seeking a highly motivated and detail-oriented HPC Student Intern to join our team at KLA. As a member of our Global Products Group, you will have the opportunity to work on cutting-edge projects and develop your skills in data pipeline development, scripting, and visualization.
Key Responsibilities
Develop and maintain data pipelines for IMC...
-
Advanced Compute Systems Engineer
2 weeks ago
Milpitas, California, United States Talent Groups Full time
Position: Advanced Compute Systems Engineer
Location: Onsite
Employment Type: Contract
Position Overview: Engaging in the development and management of high-performance computing clusters, which includes the assembly, installation, upkeep, enhancement, documentation, and procedural writing for server hardware and software systems utilized in organizational...
-
Senior Cloud Operations Engineer
1 week ago
Milpitas, California, United States Tarana Wireless Full time
Senior Site Reliability Engineer at Tarana Wireless
As a **Senior Site Reliability Engineer** at **Tarana Wireless**, you will play a crucial role in overseeing software operations in the cloud and managing a vast network of remote radio devices. Collaborating closely with your team, you will serve as a key contact during off-peak hours and supervise all...
-
Lead Site Reliability Engineer
2 weeks ago
Milpitas, California, United States Tarana Wireless Full time
Senior Site Reliability Engineer at Tarana Wireless
As a **Senior Site Reliability Engineer** at **Tarana Wireless**, you will play a crucial role in overseeing software operations in the cloud and managing remote radio devices on a large scale. Collaborating with a dedicated team, you will serve as a key contact during off-peak hours and supervise all...
-
Senior AI/MLOps Engineer
6 days ago
Milpitas, California, United States Tarana Wireless Full time
Job Description
Overview
Tarana Wireless is seeking a highly skilled Senior AI/MLOps Engineer to join our team. As a key member of our organization, you will be responsible for designing and implementing end-to-end AI/ML systems, ensuring high data quality and reliability in our data warehouse, and developing improved model workflows.
Key Responsibilities
Design...
-
Cloud AI/MLOps Engineer
6 days ago
Milpitas, California, United States Tarana Wireless Full time
Job Description
Overview
Tarana Wireless is seeking a highly skilled Senior AI/MLOps Engineer to join our team. As a key member of our organization, you will be responsible for designing and implementing end-to-end AI/ML systems, ensuring high data quality and reliability in our data warehouse, and developing improved model workflows.
Key Responsibilities
Design...
-
Senior Cloud Reliability Specialist
1 week ago
Milpitas, California, United States Tarana Wireless Full time
Senior Site Reliability Engineer at Tarana Wireless
As a **Senior Site Reliability Engineer** at **Tarana Wireless**, you will play a crucial role in overseeing software operations in the cloud and managing millions of remote radio devices. Collaborating closely with your team, you will serve as the primary contact during off-peak hours and will be...
-
AI/MLOps Engineer
6 days ago
Milpitas, California, United States Tarana Wireless Inc Full time
About the Role
This position focuses on the end-to-end workflow for our AI/ML models and data. It will require our team members to wear different hats as we scale our deployments, our maturity and our organization.
Key Responsibilities
Data Engineering: Work with our cloud system data engineers and our data scientists to ensure high data quality and reliability...
-
Lead Electromagnetic Research Scientist
2 weeks ago
Milpitas, California, United States KLA Full time
Compensation Overview: $139,000 - $236,800 Annually
Location: USA-CA-Milpitas-KLA
KLA offers a comprehensive total rewards package that may include participation in performance incentive programs and eligibility for various benefits outlined below. Interns may also qualify for some of these benefits. Our compensation ranges are determined by role, level, and...
-
Electromagnetic Research Specialist
2 weeks ago
Milpitas, California, United States KLA Full time
Compensation Overview: Base Pay Range: $124,000.00 - $211,000.00
Location: USA
Company Overview: KLA stands at the forefront of diversified electronics within the semiconductor manufacturing sector. Our technologies are integral to the production of virtually every electronic device in existence. From laptops to smartphones, and from wearables to smart cars,...
-
Storage Solutions Architect
2 weeks ago
Milpitas, California, United States Western Digital Full time
Job Overview
The Storage Solutions Architect will be responsible for the design, architecture, and implementation of advanced storage systems and backup solutions across our global infrastructure. This role will focus on technologies such as Dell PowerMax, NetApp Cluster mode, and Nasuni Cloud Appliance, alongside Dell Isilon, Dell Unity, Dell ECS, and...
-
Storage Solutions Architect
1 week ago
Milpitas, California, United States Western Digital Full time
Job Overview
The Storage Solutions Architect will be responsible for the design, architecture, and implementation of enterprise storage systems and backup solutions across our global data centers and corporate offices. This role will focus on technologies such as Dell PowerMax, NetApp Cluster mode, Nasuni Cloud Appliance, along with Dell Isilon, Dell Unity,...
-
Enterprise Storage Solutions Architect
1 week ago
Milpitas, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we are dedicated to driving global innovation and expanding the horizons of technology, transforming the seemingly impossible into reality. As a company built on problem-solving, we empower individuals to achieve remarkable feats through the right technology. Our legacy includes pivotal contributions to...
-
Electromagnetic Research Specialist
2 weeks ago
Milpitas, California, United States KLA Full time
Compensation Range: $124,000.00 - $211,000.00
Location: USA-CA-Milpitas
KLA offers a comprehensive total rewards package for its employees, which may include participation in performance incentive programs and eligibility for additional benefits outlined below. Interns may also qualify for certain benefits. The displayed compensation range represents the...
-
Enterprise Storage Solutions Architect
1 week ago
Milpitas, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we are driven by a vision to fuel global innovation and redefine the limits of technology, transforming what was once deemed impossible into reality. As a company of innovators, we empower individuals to achieve remarkable feats through advanced technology. Our contributions have been pivotal in significant...
-
Storage Solutions Architect
1 week ago
Milpitas, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we are dedicated to driving global innovation and expanding the horizons of technology, transforming what was once deemed impossible into reality. As a company rooted in problem-solving, we empower individuals to achieve remarkable feats through advanced technology. Our innovations have played a pivotal role in...
-
Enterprise Storage Solutions Architect
1 week ago
Milpitas, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we strive to drive global innovation and redefine technological boundaries, transforming the seemingly impossible into reality. We are fundamentally a company of innovators. With the right technology, remarkable achievements are within reach. Our legacy includes pivotal contributions, such as supporting...
-
HPC Engineer
4 weeks ago
Milpitas, United States E-Solutions INC Full time
Job Description
Role: HPC Engineer
Location: Milpitas, CA
We need a low-level rack and stack person who will also do server installations and cabling of the racks. Maybe some configuration of switches, PDUs, and some manual OS installs onto servers. If they have some Python / Bash experience, that would be good. This is a very physical activity...
-
HPC Engineer
4 weeks ago
Milpitas, United States Intellectt Inc Full time
Position: HPC Engineer
Location: Milpitas, CA 95035 - Onsite
Duration: Long Term
Mandatory Skills: High Performance Compute clusters
Client Note: We need a low level rack and stack person, that will also do server installations & cabling of the racks. Maybe some configuration of switches, PDUs, and some manual OS installs onto servers. If they have some...
-
HPC Performance Engineer
3 months ago
Milpitas, United States 1000 KLA Corporation Full time
Description / Preferred Qualifications
Responsibilities for this exciting role will include:
- Design, implementation & support of high-performance compute clusters
- Solid knowledge on HPC systems, including CPU/GPU architecture, scalable/robust storage, high-bandwidth inter-connects, and a knowledge of cloud-based computing architectures
- Ability to...
-
HPC Engineer
4 weeks ago
Milpitas, United States Talent Groups Full time
Keywords: High Performance Compute clusters, Rack Configurations
Job Description: Working on high performance compute clusters: specifically, constructing, installing, maintaining, upgrading, documenting, and writing procedures for server hardware and software systems used on company products. Projects involve hands on working with high performance Linux...
-
HPC Engineer
1 month ago
Milpitas, United States HCL USAAvance Consulting Full time
Working on high performance compute clusters: specifically, constructing, installing, maintaining, upgrading, documenting, and writing procedures for server hardware and software systems used on company products. Projects involve hands on working with high performance Linux compute clusters which includes BIOS configuration, configuration and testing...
-
Senior Site Reliability Engineer
3 months ago
Milpitas, United States Tarana Wireless Full time
Job Description
As a Senior Site Reliability Engineer, you will help us manage software that runs on the cloud and remotely manages millions of radio devices. You will work on a team and be a main point of contact during off shore hours and responsible for all aspects of cloud operations, such as:
Infrastructure as Code
Manage environments in...
-
Senior AI/MLOps Engineer
3 months ago
Milpitas, United States Tarana Wireless Full time
Job Description
This position focuses on the end-to-end workflow for our AI/ML models and data. It will require our team members to wear different hats as we scale our deployments, our maturity and our organization.
Data engineering - work with our cloud system data engineers and our data scientists to ensure high data quality and reliability...
-
Senior Operation Manager
1 week ago
Milpitas, United States Venture Corporation Limited Full time
Venture, a publicly listed company on the SGX, is a leading global provider of technology services, products and solutions with established capabilities spanning marketing research, design, research and development. Over the years, Venture has built know-how and intellectual property with expertise in several technology domains. These include life science &...
-
Senior Operation Manager
3 weeks ago
Milpitas, United States Venture Corporation Limited Full time
Venture, a publicly listed company on the SGX, is a leading global provider of technology services, products and solutions with established capabilities spanning marketing research, design, research and development. Over the years, Venture has built know-how and intellectual property with expertise in several technology domains. These include life science &...
-
Architect - Storage/Backup
1 month ago
Milpitas, United States Western Digital Full time
Job Description
Company Description
At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we’ve been doing...