AI Training Infrastructure Developer

6 days ago


San Francisco, California, United States Tbwa ChiatDay Inc Full time

At Skild AI, we're revolutionizing the field of artificial intelligence by creating a robust and adaptable robotic intelligence that can thrive in unpredictable environments. Our team comprises individuals with diverse backgrounds and experiences, from recent graduates to domain experts. While relevant industry experience is valuable, we place greater emphasis on demonstrated abilities and attitude.

Job Overview

We're seeking an AI Training Infrastructure Developer to spearhead the development and optimization of software infrastructure and tools necessary for training cutting-edge AI models. You will be responsible for crafting robust, scalable, and efficient training pipelines and frameworks that support the entire machine learning lifecycle, from data preparation to model deployment. Collaborating closely with researchers and machine learning engineers, you'll ensure seamless integration and operation of training systems, pushing the boundaries of what AI can achieve in real-world robotics applications.

Main Responsibilities
  • Design and maintain robust, scalable, and distributed training pipelines (data preprocessing, training orchestration, and model evaluation) and frameworks for large-scale AI models.
  • Optimize training processes for performance and resource utilization, ensuring scalability and reliability.
  • Work with researchers and machine learning engineers to integrate state-of-the-art algorithms and techniques into training pipelines.
  • Monitor and analyze training runs, identifying bottlenecks and proposing solutions to improve efficiency and performance.
  • Ensure the robustness and reliability of the training infrastructure, including automated testing and continuous integration.

Requirements
  • Bachelor's or Master's degree in Computer Science, Robotics, Engineering, or a related field, or equivalent practical experience.
  • Proficiency in Python, C++, or a similar language, and in at least one deep learning library such as PyTorch, TensorFlow, or JAX.
  • Strong background in distributed computing, parallel processing techniques, handling large-scale datasets, and data preprocessing.
  • Deep understanding of state-of-the-art machine learning techniques and models.
  • Experience with cloud-based training environments (AWS, Google Cloud, Azure).
  • Experience in developing and maintaining software tooling and infrastructure for machine learning.
  • Deep understanding of, and practical experience with, software engineering principles, including algorithms, data structures, and system design.
  • Experience with continuous integration and automated testing frameworks.

Estimated Salary Range: $145,000 - $375,000 USD per year

  • San Francisco, California, United States Naptha AI Full time

    About Naptha AI: We are seeking exceptional Software Engineering interns to join Naptha AI and contribute to building the future of AI agent infrastructure. This internship offers hands-on experience working with frontier AI technology, backed by industry veterans and technical leaders through NVIDIA Inception, Google for Startups, and Microsoft for Startups. As...


  • San Francisco, California, United States Magic AI Full time

    Company Overview: Magic AI is a cutting-edge technology company dedicated to building safe Artificial General Intelligence (AGI) that accelerates humanity's progress on the world's most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than...


  • San Francisco, California, United States Scale AI Full time

    Cloud AI Engineer Position at Scale: We are seeking an experienced Cloud AI Engineer to join our team at Scale, a leading provider of AI solutions. As a Cloud AI Engineer, you will play a key role in designing and developing our cloud infrastructure platforms and systems. The ideal candidate will have extensive experience in software development and a deep...


  • San Francisco, California, United States Abridge AI Inc. Full time

    Abridge AI Inc. is a pioneering force in healthcare technology, utilizing artificial intelligence to empower deeper understanding and improve clinical documentation efficiency. Role Overview: We are seeking an exceptional ML Systems Engineer to join our team, responsible for scaling and deploying machine learning models to handle increasing traffic demands and...


  • San Francisco, California, United States Together AI Full time

    Are you a skilled DevOps engineer looking to take your career to the next level? Do you have a passion for designing and building automated infrastructure pipelines? We are seeking a talented Senior DevOps Engineer to join our cloud engineering team at Together AI. About the Role: We are hiring a highly experienced Senior DevOps Engineer to lead the...


  • San Francisco, California, United States Naptha AI Full time

    Unlock the future of AI agent development as our Senior AI Ecosystem Developer. We're seeking a visionary to build and nurture relationships with pioneering AI developers, shaping the next wave of AI companies. Naptha AI is at the forefront of creating the foundational infrastructure for the next generation of AI systems, enabling frontier developers to build...


  • San Francisco, California, United States Scale AI, Inc. Full time

    About Scale AI, Inc. We are at the forefront of powering AI and LLMs across multiple industries. Our thesis is that to build exceptional LLMs you need exceptional human beings to train them. Humans are essential in providing the best training data for these models, and Scale operates the largest network of humans in the world to provide this training data. The...


  • San Francisco, California, United States Together AI Full time

    About the Role: We are seeking an experienced Systems Research Engineer to join our team at Together AI. As a key member of our research-driven artificial intelligence company, you will play a crucial role in researching and building the next generation AI platform. Company Overview: Together AI is committed to creating open and transparent AI systems that drive...


  • San Francisco, California, United States Unum AI Full time

    At Unum AI, we're revolutionizing data infrastructure with our cutting-edge technology. We're seeking a highly skilled AI Infrastructure Engineer to join our team in designing and implementing next-generation database management systems. About the Role: This is an exciting opportunity for a passionate engineer to orchestrate software development and hardware...


  • San Francisco, California, United States Unreal Gigs Full time

    Job Overview: We are Unreal Gigs, a cutting-edge organization at the forefront of AI innovation. Our team is dedicated to building robust and scalable infrastructure that powers the next generation of AI-driven products and services. This role offers an exceptional opportunity for a seasoned professional to lead our AI infrastructure team and drive the...


  • San Francisco, California, United States Figma Full time

    Figma is a platform that makes design accessible to all. Born on the Web, Figma helps entire product teams brainstorm, design and build better products - from start to finish. Job Description: We're looking for an AI engineer who has experience building data pipelines to collect high-quality data, and evaluation systems to evaluate AI models. You will be...


  • San Francisco, California, United States University of California - San Francisco Campus and Health Full time

    Unlock the Power of AI. About the Role: We are seeking an exceptional AI Infrastructure Developer to join our team at the University of California - San Francisco Campus and Health. As a key member of our research team, you will play a vital role in developing and maintaining cutting-edge AI infrastructure tools and technologies. The successful candidate will...


  • San Francisco, California, United States Unreal Gigs Full time

    Design and Build AI Infrastructure: Architect and implement scalable infrastructure that supports AI workloads, including machine learning model training, large-scale data processing, and real-time inference. As an AI Infrastructure Engineer, you'll design solutions that ensure high availability, fault tolerance, and performance optimization.


  • San Francisco, California, United States Abridge AI Inc. Full time

    Abridge AI Inc. is a pioneering organization in the field of healthcare technology, dedicated to harnessing the power of artificial intelligence (AI) to revolutionize medical conversations and improve clinical documentation efficiency. Estimated Salary: $190,000 - $270,000 per year. This role requires an exceptional Full Stack Engineer to join our team, working...

  • Technical Lead

    6 days ago


    San Francisco, California, United States ZipRecruiter Full time

    Job Overview: We are seeking a seasoned Technical Lead to spearhead the development of our AI infrastructure. This is an exceptional opportunity to work at the forefront of AI research and development, driving innovation and pushing the boundaries of what's possible. As a key member of our team, you will be responsible for designing, building, and maintaining...


  • San Francisco, California, United States Unreal Gigs Full time

    Job Overview: We are seeking a highly skilled AI Infrastructure Specialist to join our team at Unreal Gigs. As an AI Infrastructure Specialist, you will be responsible for designing, building, and managing scalable infrastructure for machine learning workloads. The ideal candidate will have strong experience with cloud platforms such as AWS, GCP, or Azure, and...


  • San Francisco, California, United States Scale AI, Inc. Full time

    About the Role: Scale AI, Inc. is seeking a highly skilled AI Research Scientist to drive the development of our generative AI products. As a key member of our data science team, you will lead the charge in building and refining our AI infrastructure, leveraging your expertise to advance the state-of-the-art in machine learning and artificial...


  • San Francisco, California, United States ZipRecruiter Full time

    Job Title: AI Infrastructure Architect. About the Role: Naptha AI is seeking an exceptional Distributed Infrastructure Lead to architect and build the foundational systems that will power the next wave of AI agent networks. This is a rare opportunity to shape the future of AI infrastructure at a massively ambitious scale, backed by industry veterans and...


  • San Francisco, California, United States Genmo Full time

    At Genmo, we're pushing the boundaries of video generation and Artificial General Intelligence (AGI). As an experienced Senior/Staff AI Infra Engineer, you'll play a key role in designing and scaling our petabyte-scale data infrastructure. Role Overview: We're seeking someone with strong technical expertise to create robust, scalable systems that manage...

  • AI Engineer

    5 days ago


    San Francisco, California, United States Abridge AI Inc. Full time

    Company Overview: Abridge AI Inc. is a pioneering organization in the field of healthcare technology, leveraging AI to empower deeper understanding and improve clinical documentation efficiencies.