Principal Engineer, GPU Platform

4 weeks ago


San Francisco, United States OpenAI Full time
About the Team
The Applied Engineering team works across research, engineering, product, and design to bring OpenAI's technology to consumers and businesses.

You'll join the team responsible for running the infrastructure that supports the models backing ChatGPT and the API. The systems we support include inference Kubernetes clusters, GPU health, InfiniBand performance, node lifecycle, and more. We seek to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is more important to us than unfettered growth.

About the Role
The inference compute team builds and maintains infrastructure abstractions allowing OpenAI to run models at scale.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:
  • Design and build the inference infrastructure that powers our products, enabling reliability and performance
  • Ensure our infrastructure can scale to the next order of magnitude
  • Help create a diverse, equitable, and inclusive culture that makes all feel welcome while enabling radical candor and the challenging of groupthink
  • Share responsibility, like all other teams, for the reliability of the systems we build, including an on-call rotation to respond to critical incidents as needed
You might thrive in this role if you:
  • Have 10+ years of experience building core infrastructure
  • Have experience running GPU clusters at scale
  • Have experience operating orchestration systems such as Kubernetes at scale
  • Take pride in building and operating scalable, reliable, secure systems
  • Are comfortable with ambiguity and rapid change

This role is exclusively based in our San Francisco HQ. We offer relocation assistance to new employees.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
