Senior Systems Software Engineer

3 weeks ago


Santa Clara, California, United States NVIDIA Full time

NVIDIA is seeking a senior engineer to design and build a factory automation pipeline for NVIDIA Inference Microservices (NIMs). The right person for this role brings technical drive and creativity to change the way NVIDIA optimizes and serves performant inferencing for every AI model.

The NIM offerings are easy to use, highly performant, and tested in all deployment scenarios: in the cloud, on customers' self-hosted infrastructure, and locally on all NVIDIA GPUs.

You will apply your deep technical expertise to design an efficient, scalable, and reliable factory automation pipeline that turns AI models into NIMs validated for best-in-class performance and accuracy.

NVIDIA is building a new category of products by intersecting our prowess in deep learning and computing with industry-leading technologies.

You will harness groundbreaking technologies and build a highly efficient factory to power how NVIDIA builds and validates NIMs for inferencing all the way through deployment in heterogeneous hardware and software environments.

You will influence and drive technical advances in NVIDIA's workflows and build the infrastructure that accelerates the delivery of every AI model on NVIDIA GPUs, wherever they run.

We are looking for technical talent to design and build our factory capabilities, including the underlying infrastructure, pipelines, backends, Docker build, test harness, metrics, performance engineering, log ingestion, and more.

Key Responsibilities:

  • Develop, analyze, and optimize factory infrastructure that will take an AI model in and produce a deployable service that is validated across Cloud, On-prem, and Kubernetes environments.
  • With the team, define and deliver rapid iterations on the group's technical strategies and roadmaps to deliver and improve the NIM factory.
  • Develop test harnesses, automate hardware acceptance, analyze benchmarks, and carry out data gathering and statistical analysis of system health and NIM performance.
  • Work with technical leaders to design and develop scalable and reliable factory acceptance and performance tuning of hardware platforms.
  • Collaborate with multiple AI model teams to understand their requirements and build an efficient infrastructure that improves every team's productivity.

Requirements:

  • A history of using advanced programming skills to build tooling and automation for hardware system characterization and benchmarking.
  • Proven experience debugging and analyzing performance of compute applications and systems.
  • Deep technical expertise working with system software and platform layers, including kernel, device drivers, memory, storage, networking, and PCIe devices.
  • Passion for building platform engineering components and automation of system benchmarking and characterization.
  • Excellent interpersonal skills and the ability to lead multi-functional efforts.
  • Experience working with hardware clusters, distributed systems, networking, GPU interconnects (PCIe, NVLink), and node and cluster interconnects (InfiniBand).
  • BS or MS in Computer Science, Computer Engineering, or related field (or equivalent experience).
  • 6+ years of demonstrated experience developing performant microservices, cloud software, and/or tooling.

Preferred Qualifications:

  • Experience delivering optimized system engineering environments for inference applications on data center and consumer-grade hardware platforms.
  • A history of building and deploying automated benchmarking solutions in Cloud and On-prem environments, and their associated CI/CD pipelines.
  • Prior experience working with large-scale compute infrastructure solutions.

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.


