Senior Infrastructure Performance Engineer

2 weeks ago


Santa Clara, California, United States NVIDIA Full time
Transform IT Compute Platform Architecture

NVIDIA is at the forefront of technological innovation, driving efficiency and optimizing the performance of our infrastructure both on-prem and in the cloud. We are seeking a highly skilled Senior Staff Infrastructure Performance Engineer to join our dynamic team.

Key Responsibilities:
  • Lead initiatives to transform IT Compute platform architecture to build new service offerings across On-Prem & Cloud.
  • Define and implement metrics that measure the efficiency of compute platforms and services, and use them to drive improvements.
  • Collect and review system data for capacity planning, analyze capacity data, develop plans for enterprise-wide systems at the appropriate level, and coordinate with management personnel in implementing changes.
  • Develop and maintain tools for collecting, analyzing, and visualizing data for reporting, alerting, and monitoring.
  • Collaborate with NVIDIA leadership, senior engineers, program managers, and product managers to develop compelling IT products and services that meet customer needs.
Requirements:
  • Bachelor's degree in Engineering, Computer Science, Mathematics, or related field, or equivalent experience.
  • 12+ years of proven experience in compute platform engineering with a focus on automation.
  • Experience with the design and deployment of virtualization architectures, including VMware, OpenShift, or KubeVirt platforms.
  • Proven experience evaluating existing application architectures and identifying opportunities for containerization to improve scalability, reliability, and efficiency.
  • Strong analytical skills with the ability to define and track key performance metrics.
  • Experience developing tools for data analysis and performance profiling, and developing with Terraform and configuration management tools.
  • Proficiency in programming languages such as Go and/or Python.
  • Experience running large environments consisting of bare metal, large-scale virtualized environments with tens of thousands of VMs, and cloud infrastructure.
Preferred Qualifications:
  • Deep understanding of other infrastructure components such as storage, DNS, AD, and security tools.
  • Hands-on experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
  • Solid understanding of microservices architecture, infrastructure as code (IaC) and configuration management tools.
  • Understanding of AIOps and how to leverage LLMs to automate various optimization initiatives.

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative and autonomous, we want to hear from you.

The base salary range is 248,000 USD - 385,250 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.


