Cluster Engineer - Deep Learning

2 weeks ago


Santa Clara, United States Sustainable Talent Full time
Job Description

Are you ready to make your mark at the forefront of technological innovation? As an HPC Cluster Engineer, you'll play a pivotal role in shaping the future of AI, deep learning, and machine learning initiatives. Join us and leverage Nvidia's cutting-edge GPU technology to drive groundbreaking discoveries and revolutionize industries.

Sustainable Talent is thrilled to partner with Nvidia, a global powerhouse with over 25 years of trailblazing advancements in computer graphics, gaming, and accelerated computing.

This is a W-2 full-time contract position based in Santa Clara, CA, with a hybrid work option. We offer competitive pay based on factors such as experience, education, and location, and provide full benefits, PTO, and an amazing company culture.

Additional locations: US, MA, Westford; US, NC, Durham; US, TX, Austin.

What you'll be doing:

  • You'll lead the charge in optimizing our InfiniBand network and managing Lustre and GPFS storage solutions, ensuring seamless performance for our cutting-edge initiatives.
  • Your expertise in the SLURM job scheduler will be instrumental in orchestrating the smooth operation of our clusters, from scheduling tasks to managing resources efficiently.
  • As a Linux sysadmin guru, you'll be responsible for maintaining the stability and security of our systems, leveraging your deep understanding of Linux environments.
  • Harnessing the power of Ansible, you'll automate routine tasks and streamline operations, freeing up time for innovation and optimization.
  • Advanced Python and Bash scripting will drive automation efforts and enable dynamic solutions to complex challenges.

What We Need to See:

  • Demonstrated experience with SLURM, coupled with a solid understanding of InfiniBand networks and Lustre/GPFS storage systems, is essential.
  • A proven track record in Linux system administration, ensuring robustness and security in our computing environment.
  • Proficiency in Ansible is a must-have, enabling you to automate tasks and workflows efficiently.
  • Strong scripting abilities in Python and Bash are critical for developing custom solutions and optimizing cluster performance.

Ways to Stand Out From the Crowd:

  • Showcase your knowledge of best practices in HPC cluster operations, automation, and upgrades, setting you apart as a seasoned professional in the field.

Sustainable Talent is an M/F+, disabled, and veteran equal employment opportunity and affirmative action employer.



