Machine Learning Infrastructure Engineer

2 days ago


Mountain View, California, United States Waymo Full time
Job Title

Senior Machine Learning Engineer, Training

About the Role

We are seeking a Senior Machine Learning Engineer for a hybrid role on Waymo's ML Infrastructure team. You will develop the infrastructure components necessary for distributed training; implement automation for provisioning, deployment, monitoring, and scaling of that infrastructure; and identify performance bottlenecks and optimization opportunities.

Responsibilities
  • Develop the infrastructure components necessary for distributed training
  • Implement automation solutions for provisioning, deployment, monitoring, and scaling of distributed training infrastructure
  • Monitor system health, diagnose issues, and perform routine maintenance to ensure the reliability of the distributed training infrastructure
  • Identify performance bottlenecks and optimization opportunities
  • Improve the developer experience and performance of our scalable ML framework
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field, or 4+ years of equivalent experience
  • Experience building distributed systems for production environments
  • Solid Python or C++ skills
  • Prior experience with Machine Learning frameworks (e.g., TensorFlow, PyTorch) and distributed training algorithms
Preferred Qualifications
  • Practical experience using ML accelerator profiling tools to uncover performance bottlenecks
  • Experience deploying and managing distributed systems in cloud environments
  • Knowledge of optimization and deep learning algorithms
About Us

Waymo is an autonomous driving technology company with the mission to be the most trusted driver. We operate a fully autonomous ride-hailing service, Waymo One, and our self-driving system, the Waymo Driver, can also be applied to a range of vehicle platforms and product use cases.

Benefits

Waymo employees are eligible to participate in our discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements.

Estimated Salary Range

$192,000-$243,000 USD per year.



  • Mountain View, California, United States Nuro Full time

    About Us: Nuro is a robotics company that aims to improve everyday life through innovative technologies. Founded in 2016, we have spent years developing autonomous driving (AD) technology and commercializing AD applications. Our world-class autonomous driving system, the Nuro Driver™, combines AD hardware with our AI-first self-driving software. We've...


  • Mountain View, California, United States NewsBreak Full time

    About Us: At NewsBreak, we are revolutionizing the way users interact with local news and their communities. Our mission is to foster safer, more vibrant, and authentically connected lives through robust collaborations with thousands of local publishers and businesses across the nation. We proudly stand as the nation's premier local news app, with our...


  • Mountain View, California, United States Nuro Full time

    Nuro is a pioneering robotics company dedicated to enhancing everyday life through innovative technology. Founded in 2016, we have spent years developing autonomous driving (AD) solutions and commercializing AD applications. Our world-class Nuro Driver™ combines AD hardware with AI-first self-driving software, built to learn and improve through data. We've...


  • Mountain View, California, United States Waymo Full time

    About the Role: We're looking for a highly skilled Senior Machine Learning Engineer, Training to join our Waymo ML Infrastructure team. In this role, you'll develop infrastructure components for distributed training and implement automation solutions for provisioning, deployment, monitoring, and scaling of distributed training infrastructure. This Hybrid role...


  • Mountain View, California, United States NewsBreak Full time

    About NewsBreak: NewsBreak is revolutionizing the way users interact with local news and their communities by bridging local users, content creators, and businesses. We foster safer, more vibrant, and authentically connected lives through robust collaborations with thousands of local publishers and businesses across the nation. Our Mission: We are redefining the...


  • Mountain View, California, United States Waymo Full time

    About Waymo: Waymo is an autonomous driving technology company dedicated to developing the world's most advanced driver. The Waymo Driver, our self-driving system, has been autonomously driving tens of millions of miles on public roads and simulating billions of miles in virtual environments across 13+ U.S. states. Job Description: We are seeking a skilled...


  • Mountain View, California, United States NewsBreak Full time

    NewsBreak is redefining the way users interact with local news and their communities. Our mission is to foster safer, more vibrant, and authentically connected lives by bridging local users, content creators, and businesses. We are looking for a talented Machine Learning Infrastructure Developer to join our team. As a key member of our infrastructure team,...


  • Mountain View, California, United States Moveworks Full time

    Job Description: We are seeking a highly skilled Senior Machine Learning Infrastructure Specialist to join our team at Moveworks. As a critical member of our AI infrastructure team, you will play a key role in building and optimizing cutting-edge machine learning systems for large language models. In this position, you will work closely with our...


  • Mountain View, California, United States TikTok Full time

    Job Summary: The Machine Learning Infrastructure Specialist will be responsible for designing and implementing the infrastructure for TikTok's machine learning models. This role requires expertise in distributed systems, data engineering, and cloud computing. Key Responsibilities: Design and develop scalable data pipelines for machine learning model training and...


  • Mountain View, California, United States Moveworks Full time

    At Moveworks, we're revolutionizing the way businesses interact with AI. As a Senior Machine Learning Infrastructure Specialist, you'll play a critical role in building and scaling our cutting-edge ML infrastructure. We're looking for an expert in machine learning who can design, build, and optimize scalable ML infrastructure to support training,...


  • Mountain View, California, United States CV Library Full time

    Job Overview: We are seeking a highly skilled Cloud Infrastructure Specialist to join our team at CV Library. This is a 12+ month contract opportunity that requires expertise in machine learning infrastructure, cloud platforms, and containerization technologies. Key Responsibilities: Design and implement scalable machine learning infrastructure on Google Cloud...


  • Mountain View, California, United States NewsBreak Full time

    Company Overview: NewsBreak is a pioneering local news app that has revolutionized the way users interact with their communities. Founded in 2015, the company has established itself as the nation's premier local news provider, bridging local users, content creators, and businesses across the nation. The company's headquarters is located in the tech hub of...


  • Mountain View, California, United States Nuro Full time

    About Nuro: Nuro exists to better everyday life through robotics. Founded in 2016, we've spent eight years developing autonomous driving technology and commercializing applications. Our Driver is a world-class system combining AD hardware with our AI-first self-driving software. Built to learn and improve through data, it's one of the few driverless autonomous...


  • Mountain View, California, United States Waymo Full time

    About Us: Waymo is a leading autonomous driving technology company dedicated to improving access to mobility while saving lives. Our mission is to be the most trusted driver, and we're committed to developing the world's most experienced driver - The Waymo Driver™. We're seeking an exceptional Senior Machine Learning Engineer, Training to join our Hybrid...


  • Mountain View, California, United States TikTok Full time

    About the Role: TikTok is seeking a talented Machine Learning Engineer - Model Training Infrastructure to join our AML team. As a key member of this team, you will be responsible for designing and implementing a global-scale machine learning system for feeds, ads, and search ranking models. Key Responsibilities: Design and implement a global-scale machine...


  • Mountain View, California, United States TikTok Full time

    About Our Team: We are a dynamic team of engineers working on building a scalable and secure machine learning infrastructure for TikTok. Our team is passionate about pushing the boundaries of AI innovation and is committed to delivering high-quality solutions that meet the needs of our users. Job Description: We are seeking a talented Machine Learning Engineer...


  • Mountain View, California, United States Moveworks Full time

    About the Role: We are seeking an experienced Machine Learning Engineer to help build cutting-edge ML infrastructure for large language models at Moveworks. This critical role will involve designing, building, and optimizing scalable machine learning systems for training, evaluation, and deployment of LLMs. The successful candidate will collaborate with our...


  • Mountain View, California, United States TikTok Full time

    Job Summary: TikTok is seeking a skilled Machine Learning Engineer - Machine Learning Infrastructure to join our AML team in the United States. As a key member of our global organization, you will be responsible for designing and implementing a next-generation AI infrastructure and recommendation platform for ads ranking, search ranking, and live & ecom...


  • Mountain View, California, United States MatX Full time

    About the Role: We are looking for a talented Machine Learning Hardware Engineer to join our team. The ideal candidate will have a deep understanding of Machine Learning Accelerator architectures and experience with benchmarking and optimizing code. Key Responsibilities: Design and develop hardware models for our ML workloads. Collaborate with cross-functional...


  • Mountain View, California, United States Waymo Full time

    Job Description: In this position, you will be responsible for designing, implementing, and optimizing the distributed training infrastructure for our machine learning models. You will work closely with cross-functional teams to develop solutions that improve the scalability, reliability, and performance of our ML frameworks.