Senior Software Engineer, GPU Communications and Networking

2 months ago


Santa Clara, United States Nvidia Full time

Locations: US, CA, Santa Clara
Time type: Full time
Job requisition ID: JR1972306

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for phenomenal people like you to help us accelerate the next wave of artificial intelligence.

We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network software team. The position will be part of a fast-paced crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep Learning.

What you will be doing:

Design, implement and maintain highly optimized communication runtimes for Deep Learning frameworks (e.g. NCCL for TensorFlow/PyTorch) and HPC programming interfaces (e.g. UCX for MPI/OpenSHMEM) on GPU clusters.

Participate in and contribute to parallel programming interface specifications like MPI/OpenSHMEM.

Design, implement and maintain system software that enables interactions among GPUs and interactions between GPUs and other system components.

Create proofs of concept to evaluate and motivate extensions in programming models, new designs in runtimes and new features in hardware.

What we need to see:

M.S./Ph.D. degree in CS/CE or equivalent experience.

5+ years of relevant experience.

Excellent C/C++ programming and debugging skills.

Strong experience with Linux.

Expert understanding of computer system architecture and operating systems.

Experience with parallel programming interfaces and communication runtimes.

Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.

Ways to stand out from the crowd:

Deep understanding of technology and passion for what you do.

Experience with CUDA programming and NVIDIA GPUs.

Knowledge of high-performance networks such as InfiniBand and iWARP.

Experience with HPC applications.

Experience with Deep Learning frameworks such as PyTorch, TensorFlow, etc.

Strong collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic matrix environment.

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most forward-thinking and talented people in the world working for us and, due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to hear from you.

The base salary range is 148,000 USD - 339,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


