AI Infrastructure Engineer

2 days ago

San Francisco, California, United States PlayerZero Full time $150,000 - $250,000 per year

About PlayerZero

PlayerZero is building a self‑healing system for software—automating defect detection, diagnosis, and remediation so developers ship with confidence. Teams use PlayerZero to spot issues before customers do, pinpoint root causes fast, and close the loop from incident to fix.

Our platform includes capabilities like Agentic Debugging and Code Simulations that let engineers reproduce complex scenarios, reason about failures, and validate fixes safely and quickly.

About the role:

We're looking for an experienced backend / infrastructure engineer who loves turning research prototypes into rock-solid production systems. You'll design and scale the core services that power our AI inference stack—from data ingestion and feature stores to retrieval pipelines and GPU orchestration. If you're obsessed with performance, correctness, and shipping fast, you'll feel at home here.

What You'll Do

Own critical services end-to-end, from architecture and design reviews through deployment, observability, and SLOs.
Scale LLM-driven workloads: build retrieval-augmented generation pipelines, vector indexes, and evaluation harnesses that handle billions of events per day.
Design data-intensive systems: streaming ETL, columnar storage, and time-series analytics that feed our self-healing algorithms.
Optimize for cost & latency across CPUs, GPUs, and serverless runtimes; profile hot paths and squeeze every millisecond.
Champion reliability: automate testing, chaos drills, and progressive delivery so new models roll out safely.
Collaborate cross-functionally with ML researchers, product engineers, and customers to ship features that matter.

You might thrive in this role if you have:

2–5+ years of experience building scalable backend or infrastructure systems in a production setting.
A builder mindset: you like owning projects end-to-end and are thoughtful about data models, performance, and long-term maintainability.
Experience transitioning prototypes to production with an understanding of tradeoffs in reliability and scale.
Comfort with data engineering workflows: parsing, transforming, indexing, and querying structured or unstructured data.
Exposure to search infrastructure or LLM-backed systems (e.g. document retrieval, semantic search, evaluation, or prompt engineering).

Bonus Points

Hands-on with vector databases (e.g., pgvector, Pinecone, Weaviate) or inverted-index search (Elasticsearch, Lucene).
Experience operating GPU clusters (Kubernetes, Ray, KServe) or tuning model-parallel inference.
Familiarity with Go / Rust (our primary stack) and TypeScript for the occasional full-stack tweak.
Deep knowledge of observability (OpenTelemetry, Grafana, Datadog) and performance profiling.
Contributions to open-source ML or infrastructure projects.

Our Supporters

Tackling the challenges of such a vast and complex industry is an enormous undertaking. We've raised Series A funding and are backed by leading investors who share our vision and are helping us on this journey.

Foundation Capital (Ashu Garg, Jaya Gupta)
WndrCo (Sujay Jaswa, ChenLi Wang)
Green Bay Ventures (Anthony Schiller, Dick Kramlich)
Matei Zaharia (Founder & CTO, Databricks)
Guillermo Rauch (CEO, Vercel)
Dylan Field (Founder & CEO, Figma)
Drew Houston (Founder & CEO, Dropbox)
Peter Bailis (CTO, Workday)
Oliver Jay (MD International, OpenAI)
John Lilly (ex CEO, Mozilla)
Bernard Kim (CEO, Match Group)
Others
