Machine Learning Infrastructure Specialist
20 hours ago
We are scaling our inference systems to handle millions of LLM requests daily, requiring exceptional talent to drive growth.
This role involves designing and implementing large-scale, fault-tolerant systems for AI infrastructure. Key responsibilities include:
- Architecting distributed systems for our inference network.
- Developing resource allocation models across heterogeneous hardware.
- Optimizing network performance metrics (latency, throughput, availability).
- Building robust monitoring and observability systems.
The ideal candidate will have 5+ years of experience building high-performance, scalable distributed systems, as well as strong programming skills in TypeScript, Python, and either Go, Rust, or C++.
Experience with Kubernetes/Nomad orchestration, AI tooling (ChatGPT, Claude, Cursor), and GPU programming (CUDA) is a plus. Startup experience (pre-seed to Series A) is required.
Salary range: $180K - $300K plus equity (0.1-3%). Visa sponsorship is available. Location: San Francisco, CA.
-
Machine Learning Infrastructure Specialist
7 days ago
San Francisco, California, United States | Unreal Gigs | Full time
Machine Learning Infrastructure Specialist Wanted at Unreal Gigs. We are looking for a skilled Machine Learning Infrastructure Specialist to join our team at Unreal Gigs. As an expert in designing and implementing scalable AI systems, you will play a critical role in driving business innovation. About the Role: The successful candidate will have a strong...
-
Machine Learning Infrastructure Specialist
2 days ago
San Francisco, California, United States | ZipRecruiter | Full time
**About Us** Welcome to ZipRecruiter, where we're pioneering AI-driven innovation. We're committed to building robust infrastructure that powers our machine learning models at scale. As a Senior Machine Learning Infrastructure Engineer, you'll lead the charge, designing, developing, and optimizing our machine learning infrastructure. **What You'll Do:** Design...
-
Machine Learning Infrastructure Specialist
20 hours ago
San Jose, California, United States | Adobe | Full time
We are seeking an experienced Machine Learning Infrastructure Specialist to join our team at Adobe. In this role, you will design, develop, and maintain robust AI/ML infrastructure solutions to support the training and deployment of large-scale AI models. Responsibilities: Key responsibilities include: Developing high-quality, product-level code that is easy to...
-
Machine Learning Infrastructure Architect
2 days ago
San Francisco, California, United States | Unreal Gigs | Full time
Company Overview: At Unreal Gigs, we're driving the future of AI innovation by building cutting-edge machine learning infrastructure. Our team is dedicated to developing robust and scalable systems that power our models at scale. Position Overview: As a Senior Machine Learning Infrastructure Engineer, you'll lead the design and development of our machine...
-
Machine Learning Infrastructure Architect
1 month ago
San Francisco, California, United States | Unreal Gigs | Full time
Unreal Gigs Overview: Welcome to Unreal Gigs, a pioneering force in AI-driven innovation. We're committed to building robust infrastructure that powers our machine learning models at scale. Salary: $195,000 - $255,000 per year. Position Summary: We're seeking a seasoned Senior Machine Learning Infrastructure Engineer to lead the design, development, and...
-
Machine Learning Infrastructure Architect
2 days ago
San Francisco, California, United States | Unreal Gigs | Full time
At Unreal Gigs, we're on the cutting edge of AI-driven innovation. As a Senior Machine Learning Infrastructure Engineer, you'll lead the design, development, and optimization of our machine learning infrastructure. About the Role: You'll work on challenging projects, from building scalable data pipelines to deploying and managing machine learning models in...
-
Machine Learning Infrastructure Architect
2 days ago
San Francisco, California, United States | ZipRecruiter | Full time
Job Title: Cloud Engineering Manager - AI. We are seeking a seasoned Cloud Engineering Manager with expertise in Artificial Intelligence and Machine Learning to lead our cloud infrastructure initiatives. As a Cloud Engineering Manager, you will oversee the design, development, and optimization of our cloud-based infrastructure solutions to support machine...
-
Machine Learning Infrastructure Architect
3 weeks ago
San Francisco, California, United States | Unreal Gigs | Full time
Unreal Gigs: We are looking for a highly skilled Machine Learning Infrastructure Architect to lead our MLOps strategy and build the backbone of our AI operations. About the Role. Job Description: As a Machine Learning Infrastructure Architect, you will be responsible for designing and implementing scalable, secure, and efficient MLOps infrastructure that...
-
San Francisco, California, United States | Unreal Gigs | Full time
About the Role: Unreal Gigs is a trailblazer in AI-driven innovation, and we're seeking a seasoned leader to drive our machine learning infrastructure initiatives. Key Responsibilities: Technical Leadership: Provide strategic guidance, mentorship, and technical leadership to a team of machine learning infrastructure engineers, fostering a culture of excellence,...
-
San Francisco, California, United States | Flip | Full time
About Flip.shop: Welcome to Flip.shop, where innovation meets the social commerce revolution. Our Series C funding round has propelled our valuation to an impressive $1.05 billion, and we're redefining the shopping experience by giving consumers a voice in a space dominated by tech giants. Opportunities at Flip.shop: This isn't just a job; it's a chance to build...
-
San Francisco, California, United States | Unreal Gigs | Full time
Job Overview. Company Background: Welcome to Unreal Gigs, a leading innovator in machine learning infrastructure. Our mission is to empower data scientists and engineers with cutting-edge technology and expertise. Position Summary: As a Machine Learning Infrastructure Solutions Architect, you will play a critical role in designing and optimizing our machine...
-
Machine Learning Infrastructure Engineer
2 days ago
San Francisco, California, United States | Anyscale | Full time
About Anyscale: We're a leading provider of distributed computing solutions, dedicated to empowering software developers with accessible and scalable tools. Our mission is to democratize distributed computing and make it accessible to developers of all skill levels. We're commercializing Ray, a popular open-source project that's creating an ecosystem of...
-
Machine Learning Operations Specialist
2 days ago
San Francisco, California, United States | ZipRecruiter | Full time
About the Role: We are seeking a highly skilled Machine Learning Operations Specialist to join our team at ZipRecruiter. As an MLOps specialist, you will be responsible for designing, automating, and managing robust machine learning pipelines that power AI-driven products. With a strong background in DevOps and cloud infrastructure management, you will work...
-
Machine Learning Infrastructure Engineer
20 hours ago
San Francisco, California, United States | Anyscale | Full time
About the Job: We are seeking a highly skilled engineer to join our distributed training team. As a Machine Learning Infrastructure Engineer at Anyscale, you will play a key role in shaping the future of ML training infrastructure. You will work closely with our team to develop and maintain widely adopted open-source machine learning libraries, including Ray...
-
Machine Learning Infrastructure Architect
7 days ago
San Francisco, California, United States | OpenAI | Full time
We are seeking a visionary Machine Learning Infrastructure Architect to join our team at OpenAI in San Francisco, CA. This role involves designing and maintaining robust and secure systems that power the training and advanced use cases of next-gen AI models. You will work closely with researchers to enhance system capabilities and support experimental and...
-
Machine Learning Infrastructure Specialist
7 days ago
San Diego, California, United States | Apixio | Full time
About the Role: The Senior MLOps Engineer will play a critical role in operationalizing and automating machine learning workflows, ensuring scalability, reliability, and efficiency. As part of our team, you will collaborate closely with data scientists, software engineers, and DevOps teams to deploy, monitor, and manage machine learning models in production...
-
Machine Learning Infrastructure Lead
2 days ago
South San Francisco, California, United States | Genentech | Full time
About the Role: We're looking for a Machine Learning Infrastructure Lead to join our team at Genentech Computational Sciences. As a key member of our Prescient Design group, you'll play a leading role in developing and maintaining large-scale machine learning models and infrastructure. About the Responsibilities: This role involves: Contributing to cutting-edge...
-
Chief Machine Learning Infrastructure Architect
3 weeks ago
San Francisco, California, United States | ZipRecruiter | Full time
Job Overview: A highly skilled and experienced Chief Machine Learning Infrastructure Architect is sought to lead our MLOps efforts, focusing on designing and implementing scalable infrastructure for deploying, monitoring, and managing machine learning models at scale. This role requires a deep understanding of machine learning concepts, strong technical...
-
Machine Learning Infrastructure Architect
3 weeks ago
San Francisco, California, United States | Unreal Gigs | Full time
Company Overview: At Unreal Gigs, we're at the forefront of AI-driven innovation. We're committed to building robust infrastructure that powers our machine learning models at scale.
-
Machine Learning Infrastructure Architect
3 weeks ago
San Francisco, California, United States | Sentry | Full time
About the Role: As a Senior Machine Learning Systems Engineer at Sentry, you will play a pivotal role in shaping the company's AI/ML landscape. Your primary responsibility will be to design and build the core infrastructure required for developing, evaluating, deploying, and iterating on models and pipelines at scale. This position is crucial as it involves...