Research Engineer, Frontier Evals

1 week ago


San Francisco, California, United States OpenAI Full time $200,000 - $370,000 per year

About the team

The Frontier Evals team builds north star model evaluations to drive progress towards safe AGI/ASI. This team builds ambitious evaluations to measure and steer our models, and creates self-improvement loops to steer our training, safety, and launch decisions. Some of the team's open-sourced evaluations include SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer, and the team built and ran frontier evaluations for GPT4o, o1, o3, GPT 4.5, ChatGPT Agent, and GPT5. If you are interested in feeling firsthand the fast progress of our models, and steering them towards good, this is the team for you.

About you

We seek exceptional research engineers who can push the boundaries of our frontier models in the finance domain. We are looking for people who will help shape AI evaluations of financial reasoning and related capabilities, and who will own individual threads within this endeavor end-to-end.

In this role, you'll:

Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas

Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it

Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities

We expect you to:

Have strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)

Be passionate about Excel spreadsheets and/or finance

Be detail-oriented and thorough

Be a team player / willing to do a variety of tasks to move the team forward

Be passionate and knowledgeable about AGI/ASI measurement

Be able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

It would be great if you also have:

Prior background / domain expertise in finance, especially investment banking or private equity (e.g., through internships, prior jobs)

An ability to work cross-functionally

Excellent communication skills

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI's Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

To notify OpenAI that you believe this job posting is non-compliant, please submit a report through this form. No response will be provided to inquiries unrelated to job posting compliance.

We are committed to providing reasonable accommodations to applicants with disabilities.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

Compensation

$200K – $370K + Offers Equity

