Current jobs related to Senior HPC Systems Engineer - Santa Clara - NVIDIA


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in the field of computer graphics, PC gaming, and accelerated computing. With a legacy of innovation spanning over 25 years, we're committed to pushing the boundaries of what's possible with AI and GPU computing. Job Summary: We're seeking an exceptional Senior HPC Systems Engineer to join our team. As a key player in our AI...


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in the technology world, renowned for its innovative products and services. As a pioneer in the field of accelerated computing, NVIDIA has been transforming computer graphics, PC gaming, and AI for over 25 years. Job Summary: We are seeking an exceptional Senior HPC Systems Engineer to join our team. As a key player in our...


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA has been a pioneer in computer graphics, PC gaming, and accelerated computing for over 25 years. Our legacy of innovation is fueled by great technology and amazing people. Today, we're pushing the boundaries of AI to define the next era of computing. Job Summary: We're seeking an exceptional Senior HPC Systems Engineer to join our team. As a...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior AI-HPC Storage Engineer. NVIDIA is seeking a highly skilled Senior AI-HPC Storage Engineer to join our GPU AI/HPC Infrastructure team. As a member of this team, you will provide leadership in the design and implementation of groundbreaking fast storage solutions to enable runs of demanding deep learning, high performance computing, and...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior Site Reliability Engineer - HPC Storage. NVIDIA is a leader in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. We are seeking a phenomenal Senior Site Reliability Engineer to join our team and play a crucial role in designing, implementing, and optimizing on-prem High-Performance...


  • Santa Clara, California, United States NVIDIA Full time

    NVIDIA is a leader in the field of high-performance computing, and we are seeking a skilled Senior Software Engineer to join our team. The ideal candidate will have a strong background in software development, with experience in designing and creating reliable distributed systems. They will also have the ability to implement well-thought-out long-term...

  • HPC Cluster Engineer

    1 month ago


    Santa Clara, California, United States Sustainable Talent Full time

    Unlock the Power of HPC. Sustainable Talent is seeking a seasoned HPC Cluster Engineer to join our team in shaping the future of AI, deep learning, and machine learning initiatives. As a key player in our Nvidia-powered HPC environment, you'll leverage cutting-edge GPU technology to drive groundbreaking discoveries and revolutionize industries. With over 25...


  • Santa Clara, California, United States HPE Full time

    Job Description: Hewlett Packard Enterprise is seeking a highly skilled Software Engineer to join our HPC and AI organization. As a key member of the Slingshot Ethernet Fabric team, you will play a critical role in expanding HPE's High Performance Ethernet Fabric product growth through Commercial HPC use cases, AI use cases networking, systems, and...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior AI-HPC Storage Solutions Architect. NVIDIA is a leader in the field of artificial intelligence and high-performance computing, and we are seeking a highly skilled Senior AI-HPC Storage Solutions Architect to join our team. About the Role: We are looking for an expert in designing and implementing high-performance storage solutions for our AI...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology. Our history of innovation drives us to solve the world's hardest problems. We are looking for a Senior HPC and AI Solutions...


  • Santa Clara, California, United States NVIDIA Full time

    Job Summary: NVIDIA is seeking a highly skilled Senior HPC Cluster Administrator to lead our GPU Compute Cluster team. As a key member of our Deep Learning Frameworks Group, you will be responsible for designing and implementing cutting-edge GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive...


  • Santa Clara, California, United States NVIDIA Full time

    NVIDIA Job Description: We are seeking a highly skilled Senior HPC Cluster Administrator to lead our GPU-accelerated systems and provide architectural mentorship to product teams in the deep learning and scientific computing domains. Key Responsibilities: Administer Linux systems, including powerful DGX servers and embedded systems, and bring up hardware to...


  • Santa Clara, California, United States Intel Full time

    Job Summary: We are seeking an experienced AI and HPC Scale-out Systems architect to join our team at Intel. As a key member of our Data Center and Artificial Intelligence group, you will be responsible for architecting large-scale systems that support breakthrough performance on HPC and AI workloads. Key Responsibilities: Architecting large-scale systems that...

  • HPC Cluster Engineer

    1 month ago


    Santa Clara, California, United States Sustainable Talent Full time

    Unlock the Power of HPC. Sustainable Talent is seeking a seasoned HPC Cluster Engineer to join our team in shaping the future of AI, deep learning, and machine learning initiatives. As a key player in our Nvidia-powered HPC environment, you'll leverage cutting-edge GPU technology to drive groundbreaking discoveries and revolutionize industries. As a trusted...


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in AI, machine learning, and datacenter acceleration. Our company has continuously reinvented itself over two decades, with a strong focus on innovation and growth. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our GPU AI/HPC Infrastructure team. As a key member of our team, you will be responsible...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior Product Architect, HPC and AI. Job Summary: We are seeking a visionary Product Architect to join our team at NVIDIA. As a key member of our team, you will harness your infrastructure expertise to create reference designs for the world's most powerful AI clusters. Responsibilities: * Design the next-gen datacenter-scale AI infrastructure,...


  • US, CA, Santa Clara NVIDIA Full time

    NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 fueled the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a “learning machine” that constantly evolves...


  • Santa Clara, California, United States NVIDIA Full time

    Unlock the Power of High-Performance Computing. NVIDIA is a pioneer in the field of high-performance computing, and we're seeking a talented Senior Software Engineer to join our team. As a leader in the industry, we've continuously pushed the boundaries of what's possible with our innovative solutions. As a Senior Software Engineer at NVIDIA, you'll be...


  • Santa Clara, California, United States HPE Full time

    About the Role: Hewlett Packard Enterprise (HPE) is seeking an experienced Software Engineer to join the Slingshot Ecosystem Development Team. This role will focus on expanding HPE's High Performance Ethernet Fabric product growth through Commercial HPC use cases, AI use cases networking, systems, and application and open-source...

Senior HPC Systems Engineer

2 months ago


Santa Clara, United States NVIDIA Full time

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are looking for an outstanding engineer for a Senior HPC Systems Engineer role focused on at-scale AI system performance and datacenter applications. Be a key player in building the most exciting computing hardware and software, and contribute to the latest breakthroughs in artificial intelligence and GPU computing. Provide insights on at-scale system design and tuning mechanisms for large-scale compute runs. You will work with the latest Accelerated Computing and Deep Learning software and hardware platforms, and with many researchers, developers, and customers to craft improved workflows and develop new, leading, differentiated solutions. You will interact with HPC, OS, CPU and GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.

What you'll be doing:

• Lead all aspects of implementing performance practices in large-scale infrastructure, and deliver powerful tools, methodologies, and flows to validate and improve several datacenter products in parallel.
• Accelerate strategic customer deployments and ensure speed-of-light bring-up and deployment of ground-breaking AI infrastructure by working hand in hand to tailor designs and faster processes to customer needs.
• Provide engineering solutions to enable large-scale performance strategies for Datacenter GPU Computing products and software stacks, maintain technical relationships with internal and external engineering teams, and assist systems engineers in building creative solutions based on NVIDIA technology.
• Participate in engagements with various SW and FW (BMC/SBIOS/OS/drivers, etc.) teams to develop best-in-class practices and tools; you will analyze, debug, and resolve critical software issues for the best AI workload performance at scale.
• Own the architecture of performance design and settings for at-scale datacenter products, implemented in both FW and SW components, to ensure velocity and scale while using resources efficiently. This involves early engagement with internal and customer HW/FW/SW/platform teams, and other groups, to build end-to-end solutions and optimize datacenter product designs.
• Be an internal reference for software and at-scale deployment of datacenter and large-scale GPU-accelerated system solutions among the NVIDIA technical community.

What we need to see:

• 5+ years of experience using accelerated computing for datacenter container computing solutions.
• BS in Engineering, Mathematics, Physics, or Computer Science (or equivalent experience); MS or PhD desirable.
• Solid understanding of accelerated parallel computing models (MPI, NCCL).
• Experience using and handling modern Cloud and container-based Enterprise computing architectures.
• C/C++/Python/Bash programming/scripting experience.
• Experience with CPU architecture.
• Experience with container technology and Linux-based OSes.
• Experience working with the engineering or academic research community supporting high performance computing or deep learning.
• Strong verbal and written communication skills, as well as excellent teamwork.
• Ability to multitask effectively in a dynamic environment.

Ways to Stand Out From the Crowd:

• Deep Learning framework skills.
• Hands-on experience with LLM training pipelines.
• Exposure to scheduling and resource management systems.
• Experience with large-scale HPC environments.

Widely considered to be one of the technology world's most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.