Lead Data Engineer, ML Data Platform

4 weeks ago


San Francisco CA, United States Twelve Labs Inc. Full time

We’re a fast-moving, diverse team pushing the frontiers of artificial intelligence. At Twelve Labs, our mission is to help developers build programs that can see, listen, and understand the world as we do by bringing the world’s most powerful video understanding infrastructure to market. As a part of achieving this mission, we are building foundation AI models that can accurately and instantly search exact moments within petabytes of video archives, generate coherent text summaries of videos, perform prompt-based video generation, and much more. The Twelve Labs platform provides access, through a suite of APIs, to its Large Visual Language Models (VLMs), which are trained on massive video datasets and learn to understand the meaning and context behind the visuals, conversations, and sounds within videos.

Twelve Labs recently raised $17M in seed funding, was recognized as one of CB Insights’ AI 100 companies within a year of its founding, and secured massive compute resources through a partnership with Oracle. We are hyper-focused on delivering the Twelve Labs platform to our customers so they can build video understanding into their products and power dream features they could only have imagined.

Part of the pathway to our rapid growth has been paved by the outstanding group of people united by the company’s mission. Beyond prominent venture capital firms such as Index Ventures and Radical Ventures, the Twelve Labs mission is backed by category-building luminaries like Fei-Fei Li (Stanford HAI), Silvio Savarese (Salesforce), Oren Etzioni (AI2), Alexandr Wang (Scale), Lukas Biewald (W&B), Jack Conte (Patreon), and more.

We are committed to creating a diverse and inclusive work environment where our team members can bring their full selves to work, bring out their potential, and most importantly, thrive together. We welcome kind, brilliant, and open-minded people from all walks of life to our team. If joining this mission speaks to you, we encourage you to apply.

About the Role:

As the ML Data Infrastructure Lead at Twelve Labs, you will lead the data team, managing data infrastructure and preparing high-quality video data for our training runs. Unlike text or images, video is complex to process (because of its size and decoding), multimodal (visual and audio), and has a temporal aspect. Information can easily become redundant while still depending on earlier information (as in text). Because of the complexity of data processing at Twelve Labs, this role will have a significant impact on the quality of our models.

You will:
  • Acquire and deliver massive and high-quality datasets for our large training runs.
  • Develop and implement best practices and data pipelines (ingest, annotate, and incorporate high-quality datasets into model training and evaluation) by working with internal and external data partners.
  • Improve our data infrastructure (e.g., management, versioning) by collaborating with software engineers and security engineers.
  • Collaborate with modeling and product teams to evaluate the impact of the data on our models and continuously improve the data quality.
  • Hire engineers for your team and provide them with career growth guidance, coaching, and training.
  • Work across teams to understand and manage project priorities and product deliverables, evaluate trade-offs, and drive technical initiatives from execution to landing.
You may be a good fit if you have:
  • 5+ years of experience in managing unstructured and/or human-annotated data (e.g., collecting or assessing sample quality)
  • Owned data initiatives such as data cleaning, data validation, data augmentation, and image or video processing
  • Proficiency in Python
  • Experience with ML frameworks such as PyTorch and TensorFlow
  • 2+ years of people management experience
Desired Experience:
  • MS or PhD in Computer Science or a related field.
  • Experience with creating large-scale datasets or RLHF-based dataset creation.
Interview and Onboarding Process

Recruiter Phone Screen -> Hiring Manager Call -> Technical Interview and/or Take Home Assignment -> Culture Interview -> Reference Checks

We're also excited to share that we'll do global onboarding in Seoul for all new hires (company-sponsored travel).

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply. If you are a 0-to-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at Twelve Labs.

We welcome applicants from all walks of life and are committed to equal opportunity employment. We cherish and celebrate diversity not just because it is the right thing to do, but because it makes our company much stronger.

Benefits and Perks

An open and inclusive culture and work environment.

Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

Full health, dental, and vision benefits.

Extremely flexible PTO and parental leave policy. Office closed the week of Christmas and New Year’s.

Remote-flexible, with offices in San Francisco and Seoul and a coworking stipend.

Visa support (such as H-1B and OPT transfer for US employees).
