HPC Deployment Lead, Professional Services

6 days ago


Texas, United States NVIDIA Full time

About the Role

NVIDIA is seeking an experienced HPC Deployment Manager to join its Professional Services division. As a leader in the field of high-performance computing, NVIDIA is driving innovation in deep learning, data analytics, and data center optimization. We are looking for a skilled professional to oversee the deployment of cutting-edge InfiniBand and Ethernet technologies with a team of AI and HPC experts.

Key Responsibilities

  • Direct and supervise the HPC service engineering functions in designing, developing, installing, and validating hardware and software for customer AI High-Performance Computing (HPC) systems.
  • Lead, manage, mentor, and build a high-performing HPC service engineering team to deliver innovative advances in high-performance computing AI systems.
  • Lead the planning, implementation, and performance of our HPC projects. Improve the integrity of system services bring-up and related activities by applying groundbreaking technical and operational knowledge to configure and maintain HPC AI network and server platforms.
  • Drive the HPC team's hardware and software deployment; plan, develop, and deploy procedures for system validation.
  • Lead team activities and drive test plans, custom scripts, and testing procedures for customers' HPC AI system implementations to ensure operational reliability.
  • Support the HPC Engineering team, working with other internal collaborators to develop and run a well-rounded strategy for delivering service quality and continuous service improvement. Support governance for software engineering through the implementation of standards and quality measures.
  • Lead team member development, helping team members set and achieve goals for their career growth. Develop an inclusive environment that values team member differences, creating a sense of belonging and appreciation. Contribute to a culture of trust and clarity.
  • Build strong relationships with NVIDIA leaders, customers, partners, and collaborators. Work closely with them to identify, implement, and support NVIDIA's leading AI solutions engineering, staying current with industry standards and innovations. Provide input on process optimization, department budgeting, and the monitoring and management of resources.
  • Be the domain authority for customers from planning calls through implementation.

Requirements

  • 8+ years of overall experience in IT, high-performance computing, or a related field; 3+ years of experience in a management or leadership role.
  • Demonstrated expertise in HPC system design, configuration, and planning.
  • Proficiency with low-latency/high-bandwidth interconnect infrastructure (InfiniBand and Ethernet).
  • Expertise with HPC system software, including job schedulers and cluster management/provisioning tools (Slurm, Salt, xCAT).
  • Proficiency with shared- and distributed-memory parallelism and related libraries and benchmarks (OpenMP, MPI, NCCL, HPL), and with accelerators (GPUs).
  • Strong scripting ability (Bash, Perl, Python, etc.) and experience with programming fundamentals.
  • Expertise in administering, supervising, and maintaining secure Linux/Unix operating systems (CentOS, Solaris).
  • Experience establishing processes for maintaining system performance and managing best-in-class standards; familiarity with cloud computing and container technologies.
  • Ability to understand and work with large, sophisticated systems, identify and resolve problems, manage performance, and troubleshoot network issues related to infrastructure.
  • Expertise with multi-vendor hardware/software management, security, and network/Internet protocols.
  • Strong communication and social skills, with the ability to provide detailed information and high-level summaries to management-level individuals and groups, and to develop positive working relationships and strong rapport with team members.
  • Bachelor's degree in computer science, information systems, or a related field, or equivalent experience.
  • Solid knowledge of HPC storage.
  • Exemplary communication and interpersonal skills, with the ability to present the business side of technical topics to non-technical audiences and to build persuasive, effective relationships with various stakeholders and diverse individuals and groups.

Preferred Qualifications

  • InfiniBand experience.
  • Experience with GPU-focused hardware/software.
  • Experience with MPI.
  • Automation tooling background (Ansible, Salt, Puppet, etc.).
  • Experience with Ethernet and with storage technologies such as Lustre or GPFS.

Compensation and Benefits

The base salary range for this position is $208,000 - $327,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.



