Staff ML Infrastructure Engineer

3 weeks ago

San Joaquin County CA, United States Cubiq Recruitment Full time

Staff / Lead ML Infrastructure Engineer
San Francisco, CA (Onsite)
Salary: Over market average + equity

We are building one of the world's leading generative video and multimodal AI platforms, and we're looking for a senior infrastructure engineer to drive the backbone that makes it possible. This role is ideal for an engineer from a top-tier tech company who has built cloud-scale systems, high-performance compute platforms, and battle-tested CI/CD pipelines that support complex ML workloads.

What You'll Own

Core ML Platform Architecture: Design and evolve the infrastructure that supports large-scale generative video and multimodal model training, evaluation, and deployment.
High-Throughput Compute Systems: Build and optimize GPU/TPU clusters, distributed training systems, and orchestration layers tailored for video-heavy pipelines.
Production Reliability for Generative Models: Create the tooling and services needed to safely push frequent model updates while handling massive compute loads and long-running jobs.
End-to-End CI/CD for ML: Lead the development of automated pipelines for model training, validation, artifact management, and production rollout.
Multimodal Data Infrastructure: Build systems to ingest, version, transform, and serve large-scale video, audio, and text datasets with high reliability.
Internal Developer Experience: Partner with research, product, and applied ML teams to build intuitive internal tooling for experiment tracking, model lineage, and resource scheduling.
Technical Leadership: Mentor engineers, set platform standards, and influence long-term architectural direction.

What You've Done

Experience architecting and operating large-scale infrastructure at a cloud provider, hyperscaler, or leading AI company.
Built or owned mission-critical CI/CD systems, high-capacity compute platforms, or data infrastructure supporting ML teams.
Deep experience with distributed compute across GPUs/accelerators, Kubernetes, and cloud infrastructure (AWS/GCP/Azure).
Strong engineering fundamentals in Python, Go, or equivalent languages.
Previous exposure to ML training pipelines, especially systems that handle heavy video, multimodal, or high-dimensional data.
Demonstrated ability to lead complex cross-org initiatives and drive technical strategy.

Nice to Have

Experience with video processing systems, large-scale media pipelines, or streaming architectures.
Familiarity with modern multimodal or video-generation frameworks (PyTorch, JAX, diffusers, custom accelerators).
Experience with Ray, Triton, CUDA optimization, or specialized scheduling for ML workloads.
Background working in high-growth AI startups or research-focused environments.
Security and compliance considerations for models that generate or process user content.

Why Join

Shape the underlying platform powering one of the most advanced generative video systems in the world.
Influence the future of multimodal AI by building infrastructure that directly accelerates research and product breakthroughs.
Work closely with experienced founding engineers, researchers, and platform builders from leading tech companies.
Highly competitive compensation, meaningful equity, and strong in-person engineering culture in San Francisco.

Staff ML Infrastructure Engineer

3 weeks ago

San Francisco, CA, United States Cubiq Recruitment Full time

Staff / Lead ML Infrastructure Engineer San Francisco, CA (Onsite) Salary: Over market average + equity We are building one of the world’s leading generative video and multimodal AI platforms, and we’re looking for a senior infrastructure...
Staff ML Engineer

7 hours ago

San Jose, United States ChipStack Inc Full time

About Us: Chips are at the center of today's tech-driven world. But how we design them has not changed in decades, while their complexity and specialization have skyrocketed due to increasing performance demands from applications like AI. We want to change that. Our team is small, technical, and fast-moving. We've built and shipped at the intersection of AI,...
Staff ML Engineer

3 weeks ago

San Jose, CA, United States ChipStack, Inc. Full time

About Us: Chips are at the center of today's tech-driven world. But how we design them has not changed in decades, while their complexity and specialization have skyrocketed due to increasing performance demands from applications like AI. We...
ML Infrastructure Engineer, Safeguards

3 weeks ago

San Francisco, CA, United States GlueGROUPS Inc. Full time

About Anthropic: Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of...
ML Infrastructure Engineer

2 hours ago

San Francisco, California, United States Acceler8 Talent Full time

Senior ML Infrastructure / Backend Engineer | Series C Startup | AI-Powered 3D & Avatar Platform | Hybrid (LA or SF) We're hiring a Senior ML Infrastructure / Backend Engineer to join a well-funded AI company building the visual and interaction layer for the next generation of AI-powered digital identities. This team is developing production systems that bring AI...
Member of Technical Staff: ML Infrastructure

4 days ago

San Francisco, United States Essential AI Full time

About Us: Essential AI is building an open platform to fuel and accelerate AI breakthroughs globally. Our open models, robust tooling, reproducible pipelines, and evaluation frameworks are designed for collaboration and...
Senior/Staff Software Engineer

3 weeks ago

San Francisco, United States Voxel Labs Full time

Who Are We Industrial labor is incredibly dangerous work - almost 3 million people in the US per year are injured in the workplace for entirely preventable and at times, fatal or debilitating causes. Protecting these essential people who power our world is what motivates Voxelitos, and we'd love for you to join us. At Voxel, we're passionate about...
Senior ML Infrastructure Engineer

41 minutes ago

San Francisco, California, United States Parametric (YC F25) Full time

About Us: Parametric is building robots to reliably automate physical labor in the real world. We've spent the last few months aggressively building our technology and fundraising and are now excited to begin rapidly growing the company. About The Role: As a Senior ML Infrastructure Engineer, you'll build the systems that power our entire autonomy stack. You'll...
Senior ML Infrastructure Engineering Manager

3 weeks ago

Sunnyvale, CA, United States Google Inc. Full time

A leading tech company in Sunnyvale, CA, seeks a Software Engineering Manager for ML Infrastructure. This role involves overseeing technical leadership on major projects and managing a team of engineers. You will drive product strategy and optimize ML infrastructure while ensuring your team's growth and success. The position requires extensive software...
Senior/Staff Software Engineer

1 week ago

San Francisco, CA, United States Voxel Labs Full time

Who Are We Industrial labor is incredibly dangerous work - almost 3 million people in the US per year are injured in the workplace for entirely preventable and at times, fatal or debilitating causes. Protecting these essential people who power our world is what motivates Voxelitos, and we'd love for you to join us. At Voxel, we're passionate about...
