Current jobs related to Principal Engineer for AI Systems - Santa Clara, California - NVIDIA


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in the technology industry, renowned for its innovative products and solutions. We are seeking a highly experienced and dynamic Principal Software Engineer to join our team and contribute to the development of our generative AI systems and productivity solutions. Job Summary: We are looking for a skilled software engineer to lead...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Principal Engineer for AI Software Resiliency. We are seeking a highly skilled Principal Engineer to lead the development of AI software resiliency for our cutting-edge AI supercomputers. About the Role: As a Principal Engineer, you will play a pivotal role in defining and implementing critical resiliency features for our AI supercomputers at a scale...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: We are seeking a highly skilled Principal Engineer to lead the development of AI software resilience for our cutting-edge AI supercomputers. As a key member of our team, you will play a critical role in defining and implementing critical resiliency features for our AI systems, ensuring they remain robust and reliable at all times. Your expertise...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: We are seeking a highly skilled Principal Engineer to lead the development of AI software resiliency for our cutting-edge AI supercomputers. As a key member of our team, you will play a pivotal role in defining and implementing critical resiliency features to ensure our AI systems remain robust and reliable at all times. Key...

  • Principal Engineer

    4 weeks ago


    Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in the technology world, known for its innovative and forward-thinking approach to computing and deep learning. We are committed to fostering a diverse work environment and proud to be an equal opportunity employer. Job Description: We are seeking a Principal Engineer to join our team and contribute to the development of our AI...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: NVIDIA is seeking a highly skilled Principal Engineer to design and build a software factory that will take an AI model and create deployable services across Cloud and On-prem Kubernetes environments. The ideal candidate will have advanced programming skills to build distributed and compute systems, backend services, microservices, and cloud...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Title: Principal Machine Learning Engineer. At Palo Alto Networks, we're pushing the boundaries of cybersecurity innovation. As a Principal Machine Learning Engineer, you'll be at the forefront of developing cutting-edge AI and machine learning solutions to protect our digital world. About the Role: We're seeking a highly skilled and experienced Machine...


  • Santa Clara, California, United States NVIDIA Full time

    About the Role: NVIDIA is seeking a highly skilled Principal Engineer to design and build a software factory that will take AI models and create deployable services across Cloud and On-prem Kubernetes environments. Key Responsibilities: Architect and build a scalable software factory that operates with high uptime and performance. Design and implement highly...

  • Principal AI Engineer

    4 weeks ago


    Santa Clara, California, United States Abbott Laboratories Full time

    About Abbott Laboratories: Abbott is a global healthcare leader that helps people live more fully at all stages of life. Our portfolio of life-changing technologies spans the spectrum of healthcare, with leading businesses and products in diagnostics, medical devices, nutritionals and branded generic medicines. At Abbott, you can do work that matters, grow, and...


  • Santa Clara, California, United States Dell Full time

    Job Title: Senior Principal Power Engineer. As a Senior Principal Power Engineer at Dell, you will play a critical role in developing next-generation large-scale AI Infrastructure with a focus on leading power systems. You will engage with high-profile AI customers to optimize solutions for their applications and work closely with suppliers, CTO organization,...

  • AI Systems Engineer

    3 days ago


    Santa Clara, California, United States Meshy Full time

    About Meshy: We are a leading 3D generative AI company headquartered in Silicon Valley, on a mission to unleash 3D creativity. We simplify the creation of distinctive 3D assets for both professional artists and hobbyists by transforming text and images into stunning 3D models in minutes. Our global team of experts in computer graphics, AI, and art includes...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Description: At Palo Alto Networks, we're seeking a highly skilled Principal Machine Learning Engineer to join our team. As a key member of our cybersecurity team, you will be responsible for designing and developing advanced machine learning solutions to protect our customers' digital way of life. Our mission is to leverage AI and machine learning...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Sr Principal Engineer Software to join our team at Palo Alto Networks. As a key member of our engineering team, you will be responsible for driving the technical roadmap and developing next-generation data processing systems optimized for AI-powered use cases. As a Sr Principal Engineer, you will be a thought...


  • Santa Clara, California, United States Dell Full time

    Job Summary: We are seeking a highly experienced Senior Principal Power Engineer to join our AI Infrastructure Team in Austin, Texas, Santa Clara, California, or Hopkinton, Massachusetts. As a key member of our team, you will be responsible for developing next-generation large-scale AI Infrastructure with a focus on leading power systems. Key...


  • Santa Clara, California, United States NVIDIA Full time

    Job Summary: We are seeking a highly motivated and experienced Principal Graphics System Engineer to join our team at NVIDIA. As a key member of our graphics team, you will be responsible for designing and implementing new emerging graphics features that cut through the entire stack from top-level graphics APIs through shading languages and into the driver...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Summary: Palo Alto Networks is seeking a highly skilled Senior Principal Software Engineer to join our AI Runtime Security team. As a key member of our team, you will be responsible for designing and developing scalable, reliable, and efficient cloud services. Key Responsibilities: Architect and develop cloud services for AI Runtime Security. Lead the design...

  • Principal Scientist

    3 days ago


    Santa Clara, California, United States Amazon Full time

    About the Role: We are seeking a highly skilled Principal Scientist to join our team at Amazon. As a key member of our organization, you will be responsible for leading advanced research in Large Language Models (LLMs), Generative AI, and Deep Learning. Key Responsibilities: Conduct research and develop novel algorithms, architectures, and methodologies for...


  • Santa Clara, California, United States Dell Full time

    As a Senior Principal Power Engineer at Dell, you will play a critical role in developing next-generation large-scale AI Infrastructure with a focus on leading power systems. This includes power supplies, transformers, power distribution units, bus systems, backup supplies, and more. You will engage with high-profile AI customers to optimize solutions for...


  • Santa Clara, California, United States NVIDIA Full time

    Job Description: NVIDIA is seeking a highly motivated and experienced Principal Graphics System Software Engineer to join our team. As a key member of our graphics software engineering team, you will be responsible for designing and implementing new emerging graphics features that cut through the entire stack from top-level graphics APIs through shading...


  • Santa Clara, California, United States Org_Subtype_BU022_Infrastructure_Solutions_Group Full time

    Job Title: Senior Principal Thermal Engineer. We are seeking a highly experienced Senior Principal Thermal Engineer to join our AI Infrastructure Team in Austin, Texas, Santa Clara, California, or Hopkinton, Massachusetts. About the Role: This is a unique opportunity to lead critical industry relationships with the supply base to develop advanced cooling...

Principal Engineer for AI Systems

2 months ago


Santa Clara, California, United States NVIDIA Full time
About the Role

NVIDIA is seeking a highly skilled Principal Engineer to lead the development of AI software resiliency for our most powerful AI supercomputers.

Key Responsibilities
  • Develop and implement critical resiliency features to support frontier model training at scale.
  • Drive cluster downtime toward zero to keep AI systems robust and reliable.
Requirements
  • Master's or Ph.D. in Computer Science, Electrical Engineering, Computer Engineering, or a related field from a reputable institution.
  • Minimum of 10 years of experience in systems architecture or related fields, with a deep understanding of distributed systems and large-scale AI infrastructure.
  • At least 10 years of hands-on experience in software development for distributed systems and 5 years in developing AI frameworks such as PyTorch or JAX/XLA.
About NVIDIA

NVIDIA is a leader in the field of AI and deep learning, and we are committed to fostering a diverse and inclusive work environment.

We are recognized as one of the world's most desirable technology employers, home to some of the most forward-thinking and hardworking people in the world.

We are dedicated to pushing the boundaries of innovation and excellence in AI research and development.