ML Data Infrastructure Lead

Found in: Jooble US O C2 - 2 weeks ago


San Francisco CA, United States Twelve Labs Full time

Who we are

We’re a fast-moving, diverse team pushing the frontiers of artificial intelligence. At Twelve Labs, our mission is to help developers build programs that can see, listen, and understand the world as we do by bringing the world’s most powerful video understanding infrastructure to market. As part of achieving this mission, we are building foundation AI models that can accurately and instantly search exact moments within petabytes of video archives, generate coherent text summaries of videos, perform prompt-based video generation, and much more. The Twelve Labs platform provides access to its Large Visual Language Models (VLMs) through a suite of APIs. The models are trained on massive video datasets and learn to understand the meaning and context behind the visuals, conversations, and sounds within videos.

Twelve Labs recently raised $17M in seed funding, was recognized as one of CB Insights’ AI 100 companies within a year of its founding, and secured a massive compute resource through a partnership with Oracle. We are hyper-focused on delivering the Twelve Labs platform to our customers so they can build video understanding into their products and power dream features they could only have imagined.

Part of the pathway to our rapid growth has been paved by the outstanding group of people united by the company’s mission. Beyond prominent venture capital firms such as Index Ventures and Radical Ventures, the Twelve Labs mission is backed by category-building luminaries like Fei-Fei Li (Stanford HAI), Silvio Savarese (Salesforce), Oren Etzioni (AI2), Alexandr Wang (Scale), Lukas Biewald (W&B), Jack Conte (Patreon), and more.

We are committed to creating a diverse and inclusive work environment where our team members can bring their full selves to work, bring out their potential, and most importantly, thrive together. We welcome kind, brilliant, and open-minded people from all walks of life to our team. If joining this mission speaks to you, we encourage you to apply.

About the Role:

As the ML Data Infrastructure Lead at Twelve Labs, you will lead the data team, managing data infrastructure and preparing high-quality video data for our training runs. Unlike text or images, video is complex to process (because of size and decoding), multimodal (visual and audio), and has a temporal aspect. Information can easily become redundant while still depending on earlier information (as in text). Because of the complexity of data processing at Twelve Labs, this role will have a significant impact on the quality of our models.

You will:
  • Acquire and deliver massive and high-quality datasets for our large training runs.
  • Develop and implement best practices and data pipelines (ingest, annotate, and incorporate high-quality datasets into model training and evaluation) by working with internal and external data partners.
  • Improve our data infrastructure (e.g., management, versioning) by collaborating with software engineers and security engineers.
  • Collaborate with modeling and product teams to evaluate the impact of the data on our models and continuously improve the data quality.
  • Hire engineers and provide career growth guidance, coaching, and training for your team.
  • Work across teams to understand and manage project priorities and product deliverables, evaluate trade-offs, and drive technical initiatives from execution to landing.
You may be a good fit if you have:
  • 5+ years of experience in managing unstructured and/or human-annotated data (e.g., collecting or assessing sample quality)
  • Owned data initiatives such as data cleaning, data validation, data augmentation, and image or video processing
  • Proficiency in Python
  • Experience with ML frameworks such as PyTorch and TensorFlow
  • 2+ years of people management experience
Desired Experience:
  • MS or PhD in Computer Science or a related field.
  • Experience with creating large-scale datasets or RLHF-based dataset creation.

Interview and Onboarding Process

Recruiter Phone Screen -> Hiring Manager Call -> Technical Interview and/or Take Home Assignment -> Culture Interview -> Reference Checks

We're also excited to share that we'll do global onboarding in Seoul for all new hires (company-sponsored travel).

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply. If you are a 0-to-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at Twelve Labs.

We welcome applicants from all walks of life and are committed to equal opportunity employment. We cherish and celebrate diversity not just because it is the right thing to do, but because it makes our company much stronger.

Benefits and Perks

An open and inclusive culture and work environment.

Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

Full health, dental, and vision benefits

Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

Remote-flexible, with offices in San Francisco and Seoul and a coworking stipend

Visa support (such as H-1B and OPT transfer for US employees)

  • ML/AI Engineer, Infrastructure

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Figma Full time

    We’re looking for engineers with a Machine Learning and Artificial Intelligence background to improve our products and build new capabilities. You will be building the core infrastructure to serve and deploy models efficiently, as well as world-class tooling that enables us to iterate on models quickly. You will be combining industry best practices and...


  • San Francisco, United States Sauron Full time

    Who We Are Sauron is the home security company of the future. Homeowners today lack compelling options when it comes to peace of mind against vulnerabilities, and total command and control of their home; there is no definitive, protective brand in the space. Leveraging cutting-edge AI, sensor technology, and nonlethal deterrence, Sauron brings...

  • Lead ML Systems Engineer

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Twelve Labs Full time

    Who we are We’re a fast-moving, diverse team pushing the frontiers of artificial intelligence. At Twelve Labs, our mission is to help developers build programs that can see, listen, and understand the world as we do by bringing the world’s most powerful video understanding infrastructure to market. As a part of achieving this mission, we are building...

  • Senior ML Engineer

    2 weeks ago


    San Francisco, United States Cleanlab Full time

    At Cleanlab you will: Pioneer novel software systems for the rapidly growing field of data-centric AI. Our tools enable data scientists/engineers (across all industries) to effectively diagnose/fix issues in their datasets, thus improving the quality of their business’s core asset. Determine how to best leverage new Generative AI advances/infrastructure for...

  • Principal Infrastructure Engineer

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Nextdata Technologies Inc Full time

    The company Decentralized data is the future. Data mesh is the right idea. We’re here to make it a reality. Nextdata OS is a data-mesh-native platform built to meet the challenge of decentralizing data at scale. We are inventing a new way for developers to work with data and share it responsibly via data product containers. Our vision is to build a...

  • Senior Principal, ML Infrastructure Software Engineer

    Found in: Jooble US O C2 - 2 weeks ago


    San Jose, CA, United States Conductor Full time

    What You’ll Do The AGI (Artificial General Intelligence) Computing Lab is dedicated to solving the complex system-level challenges posed by the growing demands of future AI/ML workloads. Our team is committed to designing and developing scalable platforms that can effectively handle the computational and memory requirements of these workloads while...

  • Data Infrastructure Engineer

    Found in: Talent US C2 - 2 weeks ago


    San Francisco, United States TWILIO Full time

    Responsibilities In this role, you’ll: Dive into our dataset and design, implement, and scale data pre/post-processing pipelines. Work on applied ML solutions in the areas of query processing, data mining, cleaning, normalizing and modeling. Design and build data platforms & frameworks for processing high volumes of data, in real-time as well as...

  • Senior Software Engineer, ML Ops

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States TRM Labs Full time

    The Data Platform team collaborates with an experienced group of data scientists, engineers, and product managers to build highly available and scalable data infrastructure for TRM's products and services. As a Senior Software Engineer on the Data Platform team, you will collaborate and partner with Data Scientists and Machine Learning teams to...


  • San Francisco, United States Eon Systems PBC Full time

    Eon collects large-scale neuroscientific data sets to train machine learning based brain emulations. We believe it is possible to scale this technology in a safe, secure and trustworthy manner in the next decade and empower humanity in unprecedented ways. Role The data infrastructure engineer is responsible for the setup and maintenance of systems capable of...

  • Data Infrastructure Engineer

    Found in: beBee jobs US - 2 weeks ago


    San Francisco, California, United States Basis Full time

    What working with us is like We're a small team based in San Francisco with a few colleagues remote around the world. We have a hybrid office culture with 2 days a week at home and 3 in our Financial District office, SF. We will hire talented people regardless of location, but we prefer candidates in the SF Bay Area or willing to relocate. What we're looking...


  • San Francisco, United States CareerBuilder Full time

    The company Decentralized data is the future. Data mesh is the right idea. We're here to make it a reality. Nextdata OS is a data-mesh-native platform built to meet the challenge of decentralizing data at scale. We are inventing a new way for developers to work with data and share it responsibly via data product containers. Our vision is to build a world...

  • Staff Machine Learning Engineer, Applied ML Accelerator

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Stripe Full time

    Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Some examples include helping our users resolve issues with Stripe faster or making it easier for data...

  • Lead Infrastructure Technician

    Found in: Jooble US O C2 - 1 week ago


    San Francisco, CA, United States EOS Full time

    EOS IT Solutions is a Global Technology and Logistics company, providing Collaboration and Business IT Support services to some of the world's largest industry leaders, delivering forward-thinking solutions based on multi-domain architecture. Customer satisfaction and commitment to superior quality of service are our top business priorities, along with...

  • Research Engineer

    Found in: Jooble US O C2 - 3 days ago


    San Francisco, CA, United States Eon Systems PBC Full time

    Eon collects large-scale neuroscientific data sets to train machine learning based brain emulations. We believe it is possible to scale this technology in a safe, secure and trustworthy manner in the next decade and empower humanity in unprecedented ways. Role Collaborating with a diverse team, including product managers, researchers, and engineering...

  • Data Engineer

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Everyday Agents Full time

    Consumer Travel is Ripe for Disruption Yet, the current travel landscape is a maze of disjointed self-serve transactions, often leaving travelers feeling frustrated and unfulfilled. The explosion of travel booking sites is sucking the fun out of getting away, with travelers needing to visit dozens of websites to plan a trip and often needing to resort to...

  • Director, Data Engineering

    Found in: Jooble US O C2 - 2 weeks ago


    San Francisco, CA, United States Tubi Tv Full time

    The Director of Data Engineering will own and build the core components of Tubi’s batch and Streaming data platform. This includes building out and running small teams to focus on the data lake, governance, batch compute and streaming pipelines for our data. With realtime data becoming more important for reactive AI pipelines, Tubi’s data infrastructure...

  • Senior ML Engineer

    Found in: Jooble US O C2 - 1 week ago


    San Francisco, CA, United States Open Data Science Full time

    We are looking for an ML Engineer for a fully remote position (payment in any currency, contracts available in various countries). We test business hypotheses, scale the successful ones, and put forward our own ideas. AppLovin...

  • SW Infrastructure Engineer

    Found in: beBee jobs US - 2 weeks ago


    San Francisco, California, United States Roadio (Formerly Streetlogic) Full time

    Streetlogic is a team of cyclists on a mission to make biking safer. Our product gives more people the confidence to bike, leading to more livable cities and a cleaner environment. We're building a lightweight Advanced Driver Assistance System (ADAS) for ebikes, using a vision-first approach to detect incoming collisions and give riders early warning and...

  • Lead Data Scientist

    1 week ago


    San Francisco, United States Glo Full time

    Job Description Position Overview: The Lead Data Scientist will formulate Glo’s data strategy and work with the Data Engineer to implement the appropriate architecture and execute on the strategy. This role will be responsible for enabling data-driven decisions across the Company by sourcing accurate data, building scalable infrastructure, and...

  • Data Infrastructure Engineer

    Found in: Appcast US C2 - 20 hours ago


    San Francisco, United States Acceler8 Talent Full time

    Data Infrastructure Engineer Are you an experienced Data Infrastructure Engineer ready to tackle challenges in AI and data processing at an unprecedented scale? Our organization is looking for a skilled professional to join our dynamic team. Here, technology and innovation intersect to create smarter solutions that redefine human-computer interactions. About...