HPC Deployment Manager, Professional Services Senior

6 days ago


Texas, United States NVIDIA Full time

About the Role

NVIDIA is seeking an experienced HPC Deployment Manager to join its Professional Services division. As a leader in the field of high-performance computing, NVIDIA is driving innovation in deep learning, data analytics, and data center optimization. We are looking for a skilled professional to oversee the deployment of cutting-edge InfiniBand and Ethernet technologies with a team of AI and HPC experts.

Key Responsibilities

  • Direct and supervise HPC service engineering functions in designing, developing, installing, and validating hardware and software for customer AI high-performance computing (HPC) systems.
  • Lead, manage, mentor, and build a high-performing HPC service engineering team to deliver innovative advances in high-performance computing AI systems.
  • Lead the planning, implementation, and performance of HPC projects. Improve the integrity of system bring-up and related services by applying advanced technical and operational knowledge to configure and maintain HPC AI network and server platforms.
  • Drive the HPC team's hardware and software deployment; plan, develop, and deploy procedures for system validation.
  • Lead team activities and drive test plans, custom scripts, and testing procedures for customer HPC AI system implementations to ensure operational reliability.
  • Support the HPC Engineering team, working with other internal collaborators to develop and run a well-rounded strategy for delivering service quality and continuous service improvement. Support governance for software engineering through the implementation of standards and quality measures.
  • Lead team member development, helping team members set and achieve goals for their career growth. Develop an inclusive environment that values team member differences and creates a sense of belonging and appreciation. Contribute to a culture of trust and clarity.
  • Build strong relationships with NVIDIA leaders, customers, partners, and collaborators. Work closely with them to identify, implement, and support NVIDIA's leading AI solutions engineering, staying current with industry standards and innovations. Provide input on process optimization, department budgeting, and the monitoring and management of resources.
  • Act as the domain authority for customers from planning calls through implementation.

Requirements

  • 8+ years of overall experience in IT, high-performance computing, or a related field; 3+ years of experience in a management or leadership role.
  • Demonstrated expertise in HPC system design, configuration, and planning.
  • Proficiency with low-latency/high-bandwidth interconnect infrastructure (InfiniBand and Ethernet).
  • Expertise with HPC system software, including cluster management/provisioning tools and job schedulers (Slurm, Salt, xCAT).
  • Proficiency with shared and distributed memory parallelism (OpenMP, MPI, NCCL and HPL) and accelerators (GPUs).
  • Strong scripting ability (Bash, Perl, Python, etc.) and experience with programming fundamentals.
  • Expertise in administering, supervising, and maintaining secure Linux/Unix operating systems (CentOS, Solaris).
  • Experience establishing processes for maintaining system performance and upholding best-in-class standards; familiarity with cloud computing and container technologies.
  • Ability to understand and work with large, sophisticated systems, identify and resolve problems, manage performance, and troubleshoot infrastructure-related network issues.
  • Expertise with multi-vendor hardware/software management, security, and network/Internet protocols.
  • Strong communication and social skills, with the ability to provide detailed information and high-level summaries to management-level individuals and groups, present the business side of technical topics to non-technical audiences, and develop positive working relationships and strong rapport with team members.
  • Bachelor's degree in computer science, information systems, or a related field, or equivalent experience.
  • Solid knowledge of HPC storage.
  • Exemplary communication and interpersonal skills, with the ability to present the business side of technical topics to non-technical audiences and to build effective relationships with various stakeholders and diverse individuals and groups.

Preferred Qualifications

  • InfiniBand experience.
  • Experience with GPU-focused hardware/software.
  • Experience with MPI.
  • Automation tooling background (Ansible, Salt, Puppet, etc.).
  • Experience with Ethernet and storage technologies such as Lustre or GPFS.

Compensation and Benefits

The base salary range for this position is $208,000 - $327,750. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA is an equal opportunity employer and is committed to fostering a diverse work environment. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.




