Machine Learning Infrastructure Engineer

3 weeks ago


Palo Alto, California, United States Tesla Full time
Job Description

The role of a Software Engineer on Tesla's Autopilot AI team involves optimizing and scaling our neural network training infrastructure. This position requires expertise in designing, implementing, and maintaining high-performance applications for neural network training, evaluation, and data processing pipelines.

Responsibilities
  • Data Pipeline Design and Implementation: Build and maintain robust data processing pipelines that handle petabytes of autonomous vehicle data, including images, videos, and auto-generated labels, ensuring scalability and reliability.
  • Neural Network Training Process Optimization: Support neural network training by optimizing code and data formats for faster data loading, orchestrating auto-labeling jobs, and debugging bottlenecks to enhance overall training efficiency.
  • System Performance Enhancement: Develop and implement automation, monitoring, and optimization tools to improve system performance, including resource utilization, parallelism, and data I/O.
Requirements
  • Strong Software Engineering Skills: Extensive experience with Python and software engineering best practices, including code optimization and system-level programming.
  • Experience with Deep Learning Frameworks: Proficiency in one or more deep learning frameworks, such as PyTorch or TensorFlow, with hands-on experience in optimizing model training processes.
  • Distributed Systems Experience: Proven track record of building and managing large-scale distributed systems, particularly in AI/ML workflows, with a deep understanding of parallel computing, resource utilization, and data handling.

Estimated Salary: $140,000 - $360,000 per year



  • Palo Alto, California, United States Rivian Full time

    Role Summary: We're seeking a skilled professional to join our Platform Architecture team as a Machine Learning Infrastructure Engineer. In this role, you will be responsible for developing infrastructure to enable state-of-the-art machine learning models used in ADAS systems to run efficiently on the SoC. This will involve collaborating closely with Software,...


  • Palo Alto, California, United States Snap Full time

    About Snap: Snap is a technology company that believes the camera presents the greatest opportunity to improve the way people live and communicate. We contribute to human progress by empowering people to express themselves, live in the moment, learn about the world, and have fun together. We are passionate about creating innovative products that enhance...


  • Palo Alto, California, United States Lanai Full time

    The Role: We're looking for an ML and Data Science Engineer to help build the world's best enterprise AI platform that enables humans to do the extraordinary. You'll be working on exciting challenges such as LLM applications, Natural Language Understanding (NLU), domain adaptation, question answering, semantic search, and many more. Your expertise will be...


  • Palo Alto, California, United States FORDER I.T. Full time

    Job Title: Machine Learning Infrastructure Architect. We are seeking an experienced Machine Learning Infrastructure Architect to join our team at FORDER I.T. This role involves designing, optimizing, and scaling ML infrastructure to drive advancements in our advertising technology. About the Role: The successful candidate will have a strong background in machine...


  • Palo Alto, California, United States Qualified Health Full time

    Qualified Health is seeking an experienced MLOps Engineer to join our team and play a key role in designing, implementing, and maintaining infrastructure for deploying and managing advanced gen-AI agents and workflows powered by large language models. About the Role: This position requires collaboration with data scientists and engineers to translate...


  • Palo Alto, California, United States Woven by Toyota Full time

    Job Responsibilities: As a Machine Learning Engineer on our team at Woven by Toyota, you will play a critical role in accelerating the development and deployment of machine learning models while improving their performance in joint projects. Your key responsibilities will include: • Developing and integrating cutting-edge methods for efficient large-scale...


  • Palo Alto, California, United States Tesla Full time

    Company Overview: Tesla is a leading electric vehicle and clean energy company that is dedicated to accelerating the world's transition to sustainable energy. Job Description: We are seeking a highly skilled Software Engineer to join our Autonomy team, where you will be responsible for building infrastructure to facilitate neural network architecture design and...


  • Palo Alto, California, United States Tesla Full time

    Accelerate Innovation with Tesla's Autopilot AI Team: We are seeking a highly skilled Software Engineer - Model Scaling, Autopilot AI to join our team at Tesla. As a key member of our Autopilot AI team, you will play a crucial role in optimizing and scaling our neural network training infrastructure. You will work closely with a specialized team of...


  • Palo Alto, California, United States AiDash Full time

    About Us: AiDash is making waves in the climate tech space, helping critical infrastructure industries transition to a more sustainable future. Our innovative approach combines satellite data and AI to provide actionable insights, empowering customers to reduce costs, improve reliability, and meet their sustainability goals. As a leading player in the industry,...


  • Palo Alto, California, United States Amazon Full time

    Job Description: Design & Develop: Design, write code, and deploy big data and machine learning services that support search R&D processes. These services define the foundation of our search infrastructure. Operational Excellence: Evaluate system performance, security, design system metrics, and drive quality improvements. Obsess over customer needs and...


  • Palo Alto, California, United States Tesla Full time

    As a member of Tesla's cutting-edge team, you will play a pivotal role in optimizing and scaling our neural network training infrastructure. You will collaborate closely with world-class ML Researchers and Engineers to tackle unique challenges at the intersection of AI and ML training accelerators. Key Responsibilities: Work with machine learning Researchers...


  • Palo Alto, California, United States Vianai Systems, Inc. Full time

    Vianai Systems, Inc. is on a mission to create innovative human-centered AI products that transform industries. We're looking for a skilled Machine Learning Infrastructure Specialist to join our team. Main Responsibilities: Develop and maintain robust ML infrastructure components. Set up and drive CI/CD automation initiatives. Collaborate with cross-functional...


  • Palo Alto, California, United States Inflection AI Full time

    Job Description and Requirements: The Machine Learning Software Engineer role plays a crucial part in integrating ML frameworks and models into our platform for enterprise applications. This involves developing, deploying, and optimizing ML models, ensuring seamless integration with backend systems and APIs to deliver robust enterprise solutions. This position...


  • Palo Alto, California, United States Snap Inc. Full time

    Key Responsibilities: We are seeking an experienced Machine Learning Engineering Manager to lead the development and optimization of the personalized video recommendation engine, working closely with software engineers to ensure scalability, performance, and reliability. The successful candidate will oversee the design and implementation of robust machine...


  • Palo Alto, California, United States Amazon Full time

    Your Responsibilities: You will design, develop, and deploy ML data infrastructure for search ranking at Amazon scale. You will work alongside systems engineers, machine learning scientists, and data analysts to build and deploy ML data infrastructure that meets the needs of various search teams. You will distill project requirements into coherent projects,...


  • Palo Alto, California, United States Match Group Full time

    Job Description: We're looking for a talented Sr. Software Engineer to join our Machine Learning infrastructure team. As a key member of our Engineering team, you'll be responsible for designing and implementing scalable and robust infrastructure to support our machine learning engineers across all business units. Your work will enable teams to rapidly test...


  • Palo Alto, California, United States Snap Inc. Full time

    Job Overview: Snap Inc. is seeking a highly skilled Machine Learning Engineering Manager to lead the Embedding Platform team. The successful candidate will be responsible for leading a team of machine learning engineers and software engineers in developing signals, models, embedding evaluation and monitoring for user understanding and content...


  • Palo Alto, California, United States Match Group Full time

    About Us: Tinder, part of the Match Group, revolutionized how people meet and connect. Founded in 2012, our rapid growth demonstrates our ability to fulfill a fundamental human need: real connection. With over 630 million downloads and 97 billion matches, we serve approximately 50 million users per month in 190 countries and 45+ languages. We are looking for a...


  • Palo Alto, California, United States AiDash Full time

    About AiDash: AiDash is a climate-tech company making critical infrastructure industries climate-resilient and sustainable with satellites and AI. Our full-stack SaaS solutions help customers in electric, gas, and water utilities, transportation, and construction transform asset inspection and maintenance while complying with biodiversity net gain mandates and...


  • Palo Alto, California, United States Snap Inc. Full time

    Job Overview: Snap Inc. is a technology company that aims to improve the way people live and communicate through its innovative camera-based products. We are seeking a seasoned Machine Learning Engineering Manager to lead our Content Relevance team. Key Responsibilities: Oversee a team of machine learning engineers and software engineers in developing and...