Principal Software Engineer

1 month ago


Santa Clara, California, United States Nvidia Corporation Full time
We are looking for a Principal Software Engineer with experience in building highly scalable and robust enterprise software to join us. We are building and improving a powerful platform that will automate the diagnosis and repair of GPU and CPU clusters across public clouds, private clouds, and virtual and physical hardware.

What you'll be doing:

  • Architecting scalable and reliable software components that enable the core platform to maintain an inventory of resources, including hosts, GPUs, and switches, and to automate actions that diagnose and repair failures
  • Influencing the product roadmap in collaboration with teams across various departments with the goal of reducing SRE toil and improving hardware utilization
  • Collaborating with various organizations across NVIDIA to drive adoption of the platform in order to improve GPU utilization
  • Defining and running benchmarks for various subsystems
  • Leading and delivering high-impact projects with high quality, performance, and stability while minimizing resource consumption
  • Developing a robust feedback control system that analyzes signals about system health and automatically runs commands to fix discovered issues
  • Programming in modern languages like Go and Rust

What we need to see:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • 15 years of relevant experience
  • Demonstrated ability in building scalable and robust distributed systems
  • Proven record of product rollouts and collaborating with early adopters
  • Proficiency in programming in C/C++, Java, Rust, or Go
  • Technical stewardship of projects across the organization

Ways to stand out from the crowd:
  • Deep understanding of multi-threading and distributed systems concepts
  • Excellent track record of delivering projects
  • Expertise in optimizing SQL queries
  • Expert-level knowledge of Rust programming

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science-fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and dedicated people in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you.

The base salary range is 272,000 USD - 419,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


