Senior GPU Supercomputer Scheduler Engineer

4 weeks ago


Austin, United States NVIDIA Full time

NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI, the next era of computing. NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities that are hard to solve, that only we can take on, and that matter to the world. This is our life’s work, to amplify human imagination and intelligence. Join us today.

As a member of the GPU/HPC Infrastructure team, you will provide leadership in the design and implementation of groundbreaking GPU compute clusters that run demanding deep learning, high-performance computing, and other computationally intensive workloads. We seek a technology leader to identify architectural changes and/or completely new approaches for improving HPC schedulers for serving many simultaneous, large multi-node GPU workloads with many complex dependencies. This role offers you an excellent opportunity to deliver production-grade solutions, get hands-on with groundbreaking technology, and work closely with technical leaders solving some of the biggest challenges in machine learning, cloud computing, and system co-design.

What you'll be doing:

  • Design and develop enhancements to the HPC batch scheduler(s)

  • Work extensively with the HPC scheduler vendor on bug fixes and feature releases

  • Provide support to staff and end users to resolve batch scheduler issues

  • Build and improve our ecosystem around GPU-accelerated computing

  • Analyze and optimize the performance of deep learning workflows

  • Develop large scale automation solutions

  • Perform root cause analysis and suggest corrective action for problems at large and small scales

  • Find and fix problems before they occur

What we need to see:

  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience), with 5+ years of work experience

  • Strong understanding of HPC batch schedulers, such as Slurm, RTDA, or LSF, and of HPC workflows that use MPI

  • Significant experience programming in C/C++ and advanced scripting in languages such as Python, Go, and Bash

  • Established experience with the Linux operating system, environment, and tools

  • Accomplished in computer architecture and operating systems

  • Deep knowledge of networking protocols such as InfiniBand and Ethernet

  • Experience analyzing and tuning performance for a variety of HPC workloads

  • In-depth understanding of container technologies such as Docker, Singularity, and Podman

  • Flexibility/adaptability for working in a dynamic environment with different frameworks and requirements

  • Excellent communication, interpersonal and customer collaboration skills

Ways to stand out from the crowd:

  • Knowledge of MPI and high-performance computing

  • Background in RDMA technology

  • Experience in kernel programming

  • Contributions to open source software

  • Experience with deep learning frameworks like PyTorch and TensorFlow

  • Passion for software development processes

  • Want to make what was impossible possible

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


