Senior AI and ML Infrastructure Engineer

4 days ago


Durham, North Carolina, United States NVIDIA Full time
About the Role

NVIDIA is seeking a highly skilled AI/ML Infrastructure Engineer to join our team in Santa Clara, CA, USA. As an Engineer, you will have a unique opportunity to enhance productivity for our researchers by implementing improvements throughout the entire stack.

Key Responsibilities
  • Contribute to the design and implementation of advanced AI/ML infrastructure solutions that have a direct impact on the efficiency of our research teams.
  • Work closely with our research teams to comprehend their infrastructure requirements and challenges, translating those observations into actionable enhancements.
  • Design and implement solutions for critical areas such as storage management for datasets and logs, error attribution, and core reliability issues within our large-scale GPU clusters.
  • Continuously monitor and optimize the performance of our AI/ML infrastructure, ensuring high availability, scalability, and efficient resource utilization.
  • Create and deploy automation tools, monitoring solutions, and effective operational strategies to simplify infrastructure management and minimize manual tasks.
  • Help define and enhance important measures of AI researcher productivity, ensuring that our actions are in line with measurable results.
  • Collaborate with diverse teams, including researchers, data engineers, and DevOps professionals, to create a seamless and integrated AI/ML infrastructure ecosystem.
Requirements
  • BS or equivalent experience (MS preferred) in Computer Science or a related field, with 12+ years of relevant experience.
  • Strong background in software engineering, with experience in building and maintaining large-scale distributed systems, preferably in the context of AI/ML infrastructure.
  • Proficiency in programming languages such as Python, Go, or C++, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure).
  • Hands-on experience with containerization technologies (e.g., Docker, Kubernetes), automation tools (e.g., Ansible, Terraform), and monitoring solutions (e.g., Prometheus, Grafana).
  • Understanding of AI/ML workflows, including data processing, model training, and inference pipelines.
  • Excellent problem-solving skills, with the ability to analyze complex systems, identify bottlenecks, and implement scalable solutions.
  • Excellent communication and collaboration skills, with the ability to work effectively with diverse teams and individuals.
  • Enthusiasm for continual learning and keeping abreast of emerging technologies and effective approaches in the AI/ML infrastructure field.
What We Offer

NVIDIA offers highly competitive salaries and a comprehensive benefits package. We have some of the most experienced and versatile people in the world working for us and, due to unprecedented growth, our extraordinary engineering teams are growing fast.

We are committed to fostering a diverse work environment and proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.



  • Durham, North Carolina, United States NVIDIA Full time

    AI/ML Infrastructure Engineer Role at NVIDIA: NVIDIA is seeking a skilled AI/ML Infrastructure Engineer to enhance productivity for our researchers by implementing improvements throughout the stack. Key Responsibilities: Identify and address infrastructure gaps to ensure reliable, efficient, and scalable solutions. Collaborate with research teams to understand...

  • Software Engineer

    3 weeks ago


    Durham, North Carolina, United States NetApp Full time

    About NetApp: NetApp is a leading provider of intelligent data infrastructure solutions, empowering customers to turn data into business opportunity. Our innovative products and services help organizations unlock the full potential of their data, driving business growth and success. We're a company that values diversity, inclusion, and collaboration. Our team...

  • AI/ML Architect

    18 hours ago


    Durham, North Carolina, United States IQVIA Full time

    Global Technology Organization – Architecture and Standards: IQVIA's Architecture & Standards organization aims to enhance efficiency, speed, quality, interoperability, and alignment of IQVIA's technology by establishing the IQVIA way. Our vision is to foster a developer-first culture that promotes collaboration across siloed software development and other...


  • Durham, North Carolina, United States NetApp Full time

    About NetApp: NetApp is a leader in intelligent data infrastructure, empowering customers to turn challenges into opportunities. Our innovative approach combines fresh thinking with proven expertise to help customers unlock the full potential of their data. We're seeking a talented Senior Product Manager to join our AI Product Platform Team. This role will...

  • AI Researcher

    1 month ago


    Durham, North Carolina, United States Infinia ML Inc Full time

    Product Researcher: Infinia ML Inc is seeking a skilled Product Researcher to support the development of AI-enabled solutions. As a key member of the Product team, you will play a crucial role in understanding user behaviors, needs, and motivations. Key Responsibilities: Lead end-to-end user research projects, from defining research objectives to delivering...


  • Durham, North Carolina, United States Cisco Full time

    Job Summary: We are seeking a highly skilled Technical Marketing Engineer to join our team at Cisco. As a Technical Marketing Engineer, you will be responsible for designing and developing AI solutions that meet customer use cases and business objectives. Key Responsibilities: Design and develop AI solutions that integrate with Cisco on-premises and hybrid cloud...

  • Senior AI Engineer

    3 weeks ago


    Durham, North Carolina, United States NVIDIA Full time

    Job Title: Senior AI Engineer. NVIDIA is seeking a highly skilled Senior AI Engineer to join its Compute Developer Technology (Devtech) team. As a key member of this team, you will play a crucial role in developing cutting-edge AI solutions using GPUs. Key Responsibilities: Develop and implement advanced AI algorithms and techniques in deep learning, graphs, and...


  • Durham, North Carolina, United States Nvidia Full time

    About NVIDIA: NVIDIA is a leader in the field of artificial intelligence, and we're looking for talented individuals to join our team. As a Windows AI Engineer, you'll be working on developing inference runtimes, optimizing GenAI pipelines and inference backends, and devising algorithms that flawlessly incorporate AI into games and applications for Windows. Key...

  • Senior AI Engineer

    2 months ago


    Durham, North Carolina, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leading technology company that specializes in designing and manufacturing graphics processing units (GPUs) and high-performance computing hardware. We are a pioneer in the field of artificial intelligence (AI) and have developed a range of AI-powered technologies that are transforming industries such as healthcare, finance, and...

  • Senior AI Engineer

    3 weeks ago


    Durham, North Carolina, United States NVIDIA Full time

    NVIDIA is seeking a highly skilled computer scientist to join its Compute Developer Technology (Devtech) team as an AI Developer Technology Engineer. The successful candidate will play a key role in developing cutting-edge techniques in deep learning, graphs, machine learning, and data analytics, and will work closely with key customers to understand their...

  • Senior AI Engineer

    4 weeks ago


    Durham, North Carolina, United States NVIDIA Full time

    NVIDIA is seeking a highly skilled computer scientist to join its Compute Developer Technology (Devtech) team as an AI Developer Technology Engineer. The successful candidate will be responsible for developing cutting-edge techniques in deep learning, graphs, machine learning, and data analytics, and performing in-depth analysis and optimization to ensure...


  • Durham, North Carolina, United States Fidelity TalentSource LLC Full time

    About the Role: Fidelity TalentSource is seeking a highly skilled Senior Java Software Engineer to join our team. As a key member of our Recommendation Engine Product team, you will play a critical role in advancing Fidelity's customer personalization efforts. Key Responsibilities: Design and develop cloud native Java applications using innovative technologies...

  • Senior AI Engineer

    1 month ago


    Durham, North Carolina, United States NVIDIA Full time

    NVIDIA is seeking a highly skilled computer scientist to join its Compute Developer Technology (Devtech) team as an AI Developer Technology Engineer. Artificial intelligence, a long-standing goal of computer scientists, is no longer a distant dream. In the coming years, it will revolutionize every industry. Self-driving cars will reduce congestion and...


  • Durham, North Carolina, United States Nvidia Full time

    About the Role: We are seeking a highly skilled Senior Software Engineer to join our team at NVIDIA. As a key member of our team, you will be responsible for designing and building innovative software solutions for AI applications scalable to thousands of GPUs. Key Responsibilities: Crafting a code generation system to accelerate portions of a graph collected...


  • Durham, North Carolina, United States Fidelity TalentSource LLC Full time

    About the Role: Fidelity TalentSource is seeking a highly skilled Senior Java Software Engineer to join our team. As a key member of our Recommendation Engine Product team, you will play a critical role in advancing Fidelity's customer personalization efforts. Key Responsibilities: Design and develop cloud native Java applications using innovative...


  • Durham, North Carolina, United States NVIDIA Full time

    NVIDIA is seeking a talented computer scientist to work in its Compute Developer Technology (Devtech) team as an AI Developer Technology Engineer. This role will play a crucial part in driving the company's success in the field of artificial intelligence. Key Responsibilities: Develop and study cutting-edge techniques in deep learning, graphs, machine...


  • Durham, North Carolina, United States Google Full time

    About the Role: We're seeking a highly skilled Senior Software Engineer to join our Core Machine Learning team at Google. As a key member of our organization, you will be responsible for developing and maintaining cutting-edge machine learning technologies that drive innovation and excellence across Google and the world. Responsibilities: Design, develop, and...


  • Durham, North Carolina, United States Fidelity TalentSource LLC Full time

    About the Role: Fidelity TalentSource is seeking a skilled Site Reliability Engineer to join our team in Durham, NC. As a key member of our Site Reliability Center of Excellence, you will play a critical role in ensuring the reliability and resilience of our systems. Key Responsibilities: Design and implement chaos testing strategies to identify and mitigate...


  • Durham, North Carolina, United States GlaxoSmithKline Full time

    About the Role: We are seeking a highly skilled Product Owner to join our team and lead the development of Data and AI products that will accelerate and improve the probability of success of Manufacturing and Engineering activities in our Global Supply Chain. Key Responsibilities: Define and agree vision and roadmaps for individual Data and AI products, such as...