Platform ML Engineering Manager, Model Graph

3 weeks ago


San Francisco, United States OpenAI Full time
About the Team

The Platform ML team builds the ML side of our state-of-the-art internal training framework used to train our cutting-edge models. We work on distributed model execution as well as the interfaces and implementation for model code, training, and inference.

Our priorities are to maximize training throughput (how quickly we can train a new model) and researcher throughput (how quickly we can develop new models) with the goal of accelerating progress towards AGI. We frequently collaborate with other teams to speed up the development of new capabilities.

About the Role

We are looking for an experienced engineering manager to help lead critical work on model definition and efficient distributed execution within our shared internal training stack. Research uses this stack for both large-scale and small-scale runs.

In this role, you will:
  • Reduce the time it takes to try out new architecture ideas for training new models and increase the robustness of model code.
  • Collaborate closely with researchers and other systems engineers to maximize the benefits of our shared internal training stack.
  • Make it feasible to get SOTA throughput for our most important research models.
  • Hire world-class AI systems engineers in one of the most competitive hiring markets.
  • Coordinate the training needs of OpenAI's research teams.
  • Create a diverse, equitable, and inclusive culture that makes everyone feel welcome while enabling radical candor and the challenging of groupthink.
You might thrive in this role if you:
  • Have 3+ years of experience in engineering management and 7+ years as an IC working with high-scale distributed systems and ML systems.
  • Have experience with ML systems, particularly high-scale distributed training or inference for modern LLMs.
  • Have familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.
  • Care deeply about diversity, equity, and inclusion, and have a track record of building inclusive teams.


About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
