Applied AI Researcher, Benchmarking

8 hours ago

New York, NY, United States | Distyl AI | Full time

Distyl AI develops AI-native technologies that let humans and AI collaborate to power the operations of the Global Fortune 1000.

In just 24 months, we've rapidly grown to partner with some of the world's largest enterprises (including F100 telecom, healthcare, manufacturing, insurance, and retail companies), delivering multiple AI deployments with $100M+ impact. Our platform, Distillery, along with our team of AI Engineers, Researchers, and Strategists, is pioneering AI-native systems of work, solving the most complex, high-stakes challenges at scale.

Distyl is founded and led by proven leaders from companies like Palantir, Apple, and top national laboratories. We work in deep partnership with OpenAI, jointly going to market at the largest enterprises and collaborating on evaluating and testing the latest models. Backed by Lightspeed, Khosla, Coatue, and industry leaders like Nat Friedman (former GitHub CEO), as well as board members of 20+ F500s, Distyl is building the future of AI-powered enterprise operations.

What We Are Looking For

At Distyl, we're pushing the envelope of AI utilization in the enterprise. This requires creative researchers who don't just want to drive incremental improvements on benchmarks or optimize an existing process, but instead want to creatively redefine how software is used.

Our researchers come from many academic backgrounds but have strong research track records, operate in an AI-native way, and would be bored staying on the rails of a traditional research org.

Key Responsibilities

The Benchmarking team defines how progress is measured. Researchers design evaluation frameworks that capture reasoning depth, interaction quality, reliability, and operational impact. They construct benchmarks that reflect real-world complexity. Their systems become the standard by which new architectures, techniques, and releases are judged.
Researchers in Benchmarking explore new paradigms for evaluating intelligent systems: adversarial robustness testing, longitudinal performance tracking, and human-in-the-loop assessment. They investigate how metrics shape model behavior and establish rigorous methodologies for quantifying emergent capability. Their insights drive both Distyl's internal research priorities and industry-wide standards.

What We Require

Experience Designing and Running Evaluations: You've built or maintained benchmarks, test suites, or experimental frameworks to measure model or system performance.
Statistical and Analytical Rigor: You design fair, reproducible experiments and can extract signal from noisy empirical results.
Experience Building with Models, Not Just Building Models: We develop intelligent systems using models rather than training or fine-tuning them. Ideal candidates have expertise in compound AI systems, agentic collaboration, and associated techniques (ensembling, ReAct, graph-of-thoughts, etc.).
Proven Track Record of Research Results: Whether you've published in top journals, posted amazing work on Twitter, or shared it somewhere else, we want to see what you've done.
Uses AI Every Day: Before you can revolutionize someone else's workflow, you need to revolutionize yours. You should be using tools like ChatGPT, Cursor, and Perplexity to accelerate your workflow.
Strong Programming and Data Analysis Skills: While you might not consider yourself a software engineer, you need to be able to build prototypes of your ideas and then run the experiments that prove their effectiveness to an F500 Head of AI.
Bias Towards Showing vs. Telling: Our customers want to see the power of AI today rather than discuss the most elegant idea that will take 5 years to realize.

What We Offer

An opportunity to advance the cutting edge of LLM research and directly revolutionize work in the enterprise space.
Ownership of high-impact research projects, with the autonomy to explore novel approaches and solutions.
Access to state-of-the-art AI models, real business problems, and proprietary data sets across a diverse range of real-world industries.
Competitive salary and benefits package, including equity options, medical/dental/vision covered at 100% for you and your dependents, a 401(k) plan, and perks such as commuter benefits and lunch provided in the office.
A chance to be part of a mission-oriented company driving practical adoption during the biggest revolution in human productivity.
A collaborative and intellectually stimulating environment that encourages innovation and personal growth.

If you are an innovative, ambitious, and driven individual looking to make a difference in the world of AI, we want to hear from you. Apply now to join our team as an Applied AI Researcher and help us shape the future of AI-driven solutions for enterprises across the globe.

Note: Distyl is a hybrid working environment and requires in-office collaboration 3 days a week. We have offices in SF and NYC.

