Senior AI and ML Infra Engineer, Research Clusters

4 weeks ago


Durham, United States NVIDIA Full time

At NVIDIA in Santa Clara, CA, USA, we are seeking a skilled AI/ML Infrastructure Engineer to join our team. In this role, you will improve productivity for our researchers by implementing improvements across the entire stack. Your main responsibility will be to identify and close infrastructure gaps so that our solutions are reliable, efficient, and scalable. Join us and help shape the future of AI/ML technology.

What this role offers:

  • The chance to contribute to advanced AI/ML infrastructure solutions that have a direct impact on the efficiency of our highly skilled research teams.

  • A dynamic and collaborative environment that values innovation, creativity, and continuous improvement.

  • Competitive compensation and comprehensive benefits package.

  • Opportunities for professional growth and career advancement within the AI/ML infrastructure domain.

What you will be doing:

  • Work closely with our research teams to comprehend their infrastructure requirements and challenges, translating those observations into actionable enhancements.

  • Design and implement solutions for critical areas such as storage management for datasets and logs, error attribution, and core reliability issues within our large-scale GPU clusters.

  • Continuously monitor and optimize the performance of our AI/ML infrastructure, ensuring high availability, scalability, and efficient resource utilization.

  • Create and deploy automation tools, monitoring solutions, and effective operational strategies to simplify infrastructure management and minimize manual tasks.

  • Help define and enhance important measures of AI researcher productivity, ensuring that our actions are in line with measurable results.

  • Collaborate with diverse teams, including researchers, data engineers, and DevOps professionals, to create a seamless and integrated AI/ML infrastructure ecosystem.

  • Keep abreast of the latest advancements in AI/ML infrastructure technologies, frameworks, and effective strategies, and promote their implementation within the company.

What we need to see:

  • BS or equivalent experience (MS preferred) in Computer Science or a related field, with 12+ years of relevant experience.

  • Strong background in software engineering, with experience in building and maintaining large-scale distributed systems, preferably in the context of AI/ML infrastructure.

  • Proficiency in programming languages such as Python, Go, or C++, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure).

  • Hands-on experience with containerization technologies (e.g., Docker, Kubernetes), automation tools (e.g., Ansible, Terraform), and monitoring solutions (e.g., Prometheus, Grafana).

  • Understanding of AI/ML workflows, including data processing, model training, and inference pipelines.

  • Excellent problem-solving skills, with the ability to analyze complex systems, identify bottlenecks, and implement scalable solutions.

  • Excellent communication and collaboration skills, with the ability to work effectively with diverse teams and individuals.

  • Enthusiasm for continual learning and keeping abreast of emerging technologies and effective approaches in the AI/ML infrastructure field.

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most experienced and versatile people in the world working for us and, due to unprecedented growth, our extraordinary engineering teams are growing fast. If you're a creative and autonomous engineer with real passion for technology, we want to hear from you.

The base salary range is 220,000 USD - 419,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


