Principal Engineer, Distributed Machine Learning Specialist

16 hours ago


Santa Clara, California, United States | NVIDIA | Full time

NVIDIA is seeking a highly skilled Principal Engineer to join our Distributed Machine Learning team. In this role, you will design and develop solutions that accelerate and scale machine learning workloads on GPU-enabled Spark clusters.

Key Responsibilities:

  • Design and develop new user-friendly APIs and libraries to optimize the use of existing deep learning and machine learning frameworks in GPU-enabled Spark clusters.
  • Develop GPU-accelerated machine learning libraries for distributed training and inference on Spark clusters, focusing on improving performance and usability.
  • Demonstrate the performance advantages of the developed solutions on industry-standard benchmarks and datasets.
  • Contribute to the enhancement of open-source projects such as RAPIDS, XGBoost, and Apache Spark.
  • Collaborate with NVIDIA partners and customers to deploy distributed machine learning algorithms in cloud or on-premises environments.
  • Stay up to date with published advances in distributed machine learning systems and algorithms.
  • Provide technical mentorship to a team of engineers.

Requirements:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or a closely related field (or equivalent experience).
  • 12+ years of work or research experience in software development.
  • 5+ years of experience as a technical lead in distributed machine learning and/or deep learning.
  • 3+ years of open-source development experience.
  • 3+ years of hands-on experience with Spark MLlib, XGBoost, and/or PyTorch.
  • Knowledge of the internals of Apache Spark MLlib.
  • Experience with Kubernetes, YARN, Spark, and/or Ray for distributed ML orchestration.
  • Proven technical skills in designing, implementing, and delivering high-quality distributed systems.
  • Excellent programming skills in C++, Scala, and Python.
  • Familiarity with agile software development practices.

Preferred Qualifications:

  • Familiarity with NVIDIA libraries, such as RAPIDS cuML, Spark-RAPIDS, and NVTabular.
  • Familiarity with NVIDIA GPUs and CUDA.
  • Familiarity with Horovod, Petastorm, and other current or earlier distributed learning libraries.
  • Experience working with cross-functional teams across organizational boundaries and geographies.

NVIDIA is committed to fostering a diverse work environment and is an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.


