Senior Cloud Infrastructure Software Engineer

2 weeks ago


Santa Clara, California, United States NVIDIA Full time

The NVIDIA GPU Cloud (NGC) team is seeking experienced software engineers to develop NVIDIA's advanced compute cloud solutions. These solutions encompass software for managing hardware and network provisioning to create a multi-tenant infrastructure. As a software engineer, you will collaborate with fellow engineers, product architects, and product managers to deliver comprehensive software solutions for managing intricate cloud infrastructure deployments.

You will be responsible for crafting services and software that align with the overarching architectural vision for the NVIDIA Cloud Platform, working alongside other teams to build a resilient and scalable system. You will take ownership of your code throughout its entire lifecycle, from development through testing to production. We value a strong commitment to code quality, testing, deployment efficiency, and the drive to bring exceptional products to market.

Key Responsibilities:

  • Collaborate with internal stakeholders at NVIDIA.
  • Design and implement scalable software systems for managing NVIDIA's cloud infrastructure.
  • Develop automation software for network and systems management in a multi-tenant cloud environment.
  • Engage with open-source communities related to the software we utilize and develop.
  • Present roadmaps, visions, and demonstrations to internal stakeholders and NVIDIA leadership.

Qualifications:

  • Over 10 years of experience in designing and building distributed software systems.
  • A BS/MS degree in Computer Science or a related field (or equivalent experience).
  • Proven ability to write code in mainstream systems programming languages such as C, C++, Golang, or Rust.
  • Demonstrated experience in designing and implementing maintainable APIs for users.
  • Practical knowledge of asynchronous programming, type safety, threading models, state machines, and data structures.
  • Familiarity with data persistence technologies (SQL or similar).
  • Understanding of secure communication protocols (mutual-TLS, IPsec, etc.).

Preferred Qualifications:

  • Experience with a Hyperscale Cloud Service Provider.
  • Knowledge of networking protocols including IP, IPv6, BGP, HTTP, ICMP, and tunneling protocols (VXLAN, Geneve, GRE).
  • Background in host management systems (DHCP, Redfish, UEFI) and host security services like TPM, TXT, and SecureBoot.
  • Familiarity with Kubernetes and distributed task scheduling.
  • Understanding of SRE principles (observability, SLOs, logging, etc.).

NVIDIA is at the forefront of revolutionary advancements in Artificial Intelligence, High-Performance Computing, and Visualization. Our GPU technology serves as the core of modern computing and is integral to our products and services. We invite innovative and passionate individuals to join us in accelerating the next wave of artificial intelligence. NVIDIA is recognized as one of the most desirable employers in the technology sector, attracting some of the most talented minds globally. If you are enthusiastic about developing cloud services, we would like to hear from you.

Compensation includes a competitive salary range, along with equity and benefits. NVIDIA is dedicated to fostering a diverse workplace and is proud to be an equal opportunity employer, valuing diversity in our current and future workforce.


  • Software Engineer

    1 week ago


    Santa Clara, California, United States Oracle Full time

    Software Engineer - Cloud Engineering Infrastructure Development. Oracle is seeking a skilled Software Engineer to design, develop, and troubleshoot software programs for various purposes, including file storage, databases, applications, tools, and networks. Key Responsibilities: Collaborate with cross-functional teams to define and develop software for tasks...


  • Santa Clara, California, United States Astera Labs Full time

    Astera Labs stands at the forefront of innovative connectivity solutions, enabling the full potential of AI and cloud infrastructure. Our Intelligent Connectivity Platform seamlessly integrates PCIe, CXL, and Ethernet semiconductor-based solutions alongside the COSMOS software suite, delivering a software-defined architecture that is both scalable and...


  • Santa Clara, California, United States Geospatial And Cloud Analytics Inc Full time

    About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our team at Geospatial And Cloud Analytics Inc. As a key member of our engineering team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale cloud infrastructure. Key Responsibilities: Design and implement...


  • Santa Clara, California, United States NVIDIA Full time

    NVIDIA is seeking talented engineers to enhance its AI Infrastructure. We are looking for individuals with a robust programming foundation, profound knowledge of distributed systems, and a strong grasp of software testing and deployment methodologies. Excellent communication and organizational skills are essential. We value innovative thinkers who can...


  • Santa Clara, California, United States Oracle Full time

    Job Description. Job Summary: We are seeking a highly skilled and experienced Senior Principal Software Engineer to join our Cloud Engineering Infrastructure Development team at Oracle. As a key member of our team, you will be responsible for designing, developing, and performance tuning the networking stack required to run distributed AI/ML/HPC workloads...


  • Santa Clara, California, United States Amazon Full time

    About the Role: We are seeking a Cloud Software Engineer to join our innovative team focused on enhancing the Developer Experience. Our mission is to leverage GenAI to empower developers in creating applications that are faster, more cost-effective, secure, and reliable. GenAI will enable a diverse range of builders to harness the capabilities of AWS,...


  • Santa Clara, California, United States NVIDIA Full time

    About the Role: NVIDIA is seeking a seasoned Cloud Engineer to join its fast-paced Infrastructure, Planning and Processes organization. As a Senior Cloud Engineer, you will be part of a dynamic team that develops and maintains NVIDIA's internal cloud provisioning product for GPUs and Tegra systems. Key Responsibilities: Design and implement scalable, resilient...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: Palo Alto Networks is seeking a highly skilled Senior Principal Cloud Reliability Engineer to join our team. As a key member of our cloud infrastructure team, you will be responsible for designing, building, and operating reliable and secure cloud infrastructure. Key Responsibilities: Contribute to the success of our cloud infrastructure team by...


  • Santa Clara, California, United States Oracle Full time

    Job Overview: As a Senior Software Engineer at Oracle, you will collaborate with teams of developers tasked with the design, creation, and maintenance of our innovative software solutions. Your role will require you to architect comprehensive system interactions, engage in hands-on development, and possess a deep understanding of cloud infrastructure and...

  • Senior IT Engineer

    1 week ago


    Santa Clara, California, United States OmniVision Technologies Full time

    About OmniVision Technologies: We are a leading developer of advanced digital imaging solutions, providing a diverse culture that works together on the development of cutting-edge imaging technology, products, and solutions. Job Summary: We are seeking a highly skilled Senior IT Engineer to lead our cloud infrastructure team. The successful candidate will be...


  • Santa Clara, California, United States NVIDIA Full time

    Job Summary: NVIDIA is seeking a highly skilled Senior Cloud Engineer to join its Infrastructure, Planning and Processes organization. As a Senior Cloud Engineer, you will be part of a fast-paced team that develops and maintains NVIDIA's internal cloud provisioning product for GPUs and Tegra systems. Key Responsibilities: Design and implement scalable, resilient...


  • Santa Clara, California, United States ServiceNow Full time

    Job Description. Overview: The ServiceNow SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the ServiceNow cloud infrastructure. Our SREs are empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in...


  • Santa Monica, California, United States GoodRx Full time

    GoodRx serves as a pivotal healthcare marketplace in the United States, assisting millions of individuals each month in locating trustworthy health information and securing discounts on their healthcare expenses. Since its inception, GoodRx has facilitated savings of $60 billion for consumers, providing access to prescription discounts accepted at over...


  • Santa Clara, California, United States NVIDIA Full time

    Job Summary: NVIDIA is seeking a highly skilled Senior SRE Engineer to join its fast-paced Infrastructure, Planning and Processes organization. As a key member of the team, you will be responsible for designing and implementing scalable, resilient cloud infrastructure platforms for NVIDIA's internal cloud provisioning product. Key Responsibilities: Design and...


  • Santa Clara, California, United States TechStar Group Full time

    Job Title: Cloud Infrastructure Architect. Job Summary: We are seeking a highly skilled Cloud Infrastructure Architect to join our team at TechStar Group. As a key member of our infrastructure team, you will be responsible for designing, implementing, and managing our cloud infrastructure to ensure high levels of performance, availability, and security. Key...


  • Santa Clara, California, United States ServiceNow Full time

    Job Description. About ServiceNow: ServiceNow is a global market leader in the field of cloud-based platforms, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform seamlessly connects people, systems, and processes to empower organizations to find smarter, faster, and...


  • Santa Clara, California, United States eTeam Full time

    Job Description. Job Title: Cloud Infrastructure Architect. Location: Remote (with occasional travel). Job Type: Full-time. About eTeam: eTeam is a leading provider of cloud-based solutions, dedicated to delivering innovative and secure infrastructure to our clients. Job Summary: We are seeking an experienced Cloud Infrastructure Architect to join our team. The ideal...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Our Mission: At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for...


  • Santa Clara, California, United States Sage Lake Senior Living Full time

    About the Role: We are seeking a seasoned Senior SRE Engineer to join our team at Sage Lake Senior Living, where you will play a critical role in ensuring the high availability and performance of our AI-powered applications. Key Responsibilities: Operate and improve the observability and maintainability of our distributed microservice cloud applications and...


  • Santa Clara, California, United States d-Matrix Full time

    Software Engineer, Senior - AI/ML Workloads. d-Matrix - Santa Clara, CA. Location: Santa Clara, CA. Type: Full time. Department: R&D - SW Kernels & Workloads. d-Matrix has fundamentally changed the physics of memory-compute integration with our digital in-memory compute (DIMC) engine. The "holy grail" of AI compute has been to break through the memory wall to minimize data...