Petabyte-Scale Data Systems Specialist

2 days ago


San Francisco, California, United States Genmo Full time
About the Role

We are seeking an experienced Senior/Staff AI Infra Engineer to join our team at Genmo.

Job Summary:

As a Senior/Staff AI Infra Engineer, you'll be responsible for designing, building, and scaling our petabyte-scale data infrastructure.

Key Responsibilities:

  • Design highly scalable data infrastructure and systems to process petabyte-scale data stores.
  • Manage distributed processing jobs that ingest and analyze large-scale data sets for AI training.
  • Optimize storage systems to maximize performance.
  • Build monitoring systems to ensure reliability of data infrastructure.

Requirements:

  • Bachelor's, Master's, or PhD in Computer Science or a related field.
  • 5+ years of experience working with large-scale systems.
  • Extremely strong experience with Python, large-scale distributed computing frameworks (e.g., Spark, Ray), and systems-level languages (Rust, Go, C++).

The estimated salary for this role is $250,000 - $350,000 per year. The role is based in the Bay Area (San Francisco). Candidates are expected to be located near the Bay Area or open to relocation. We are an Equal Opportunity Employer and value diversity and inclusion in the workplace.



  • San Diego, California, United States Apple Full time

    **About the Role:** We are seeking an experienced Site Reliability Engineer to join our Data Analytics team at Apple. As a key member of our team, you will be responsible for building, monitoring, and troubleshooting complex data infrastructure at the petabyte scale. **Responsibilities:** Build, design, deploy, and manage complex data infrastructure at the...


  • San Francisco, California, United States Genmo Full time

    About Us: Genmo is a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Job Opportunity: We're seeking a skilled Data Infrastructure Engineer to join our team and contribute to the development of our petabyte-scale data infrastructure. Responsibilities: Design and implement scalable data...


  • San Francisco, California, United States Cruise Full time

    Join Cruise's Data Science Team: We're looking for a talented Staff Software Engineer to join our team as a key member of our ML Data Platform. As a member of our team, you will be responsible for designing, developing, and deploying large-scale data systems in the cloud. Your expertise in Beam and Spark will be instrumental in building a next-generation...


  • San Francisco, California, United States Genmo Full time

    The Ideal Candidate: We're looking for a senior professional with 5+ years of experience working with large-scale systems. You should have a strong understanding of computer science fundamentals, excellent problem-solving skills, and the ability to communicate complex ideas clearly. Additionally, we're interested in candidates who have: Familiarity with...


  • San Jose, California, United States TikTok Full time

    About the Role: Streaming Data Engineer, Large-Scale Systems is a critical position in our ad data platform team. You will work closely with product managers and data analysts to build state-of-the-art streaming and batch data processing solutions. The entire data pipeline supports both the TikTok ads platform and our internal business intelligence...


  • San Francisco, California, United States Genmo Full time

    We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Our team is extremely technical, with leaders in distributed systems, GPU programming, and large-scale training. Job Overview: We're seeking an experienced Senior/Staff AI Infra Engineer to design, build, and scale our...


  • San Francisco, California, United States Genmo Full time

    About Genmo: We are a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of Artificial General Intelligence (AGI). Our team consists of leaders in distributed systems, GPU programming, and large-scale training.


  • San Francisco, California, United States Genmo Full time

    Job Title: Data Infrastructure Engineer. Company Overview: We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Salary: $250,000 - $350,000 per year, depending on experience. Job Description: We're seeking an experienced Senior/Staff AI Infra Engineer to design, build, and scale...


  • San Francisco, California, United States Scale AI Full time

    About Data Engine: Our Generative AI Data Engine powers the world's most advanced LLMs and generative models through world-class RLHF, human data generation, model evaluation, safety, and alignment. As a Data Engine Specialist on our Horizontal Task Tooling team, you will focus on building web-based interfaces that allow large-scale data collections for...


  • San Francisco, California, United States OpenAI Full time

    Job Overview: We are looking for a talented Software Engineer to join our Data Acquisition team at OpenAI. This role involves designing and developing scalable data systems that can handle massive amounts of data. Responsibilities: Designing and developing highly scalable distributed systems that can handle petabytes of data. Collaborating with cross-functional...


  • San Francisco, California, United States Genmo Full time

    Genmo is a research lab dedicated to building open, state-of-the-art models for video generation. We're seeking an experienced Senior/Staff AI Infra Engineer to join our team and help us shape the future of AI. Job Description: You will design, build, and scale our petabyte-scale data infrastructure, creating robust, scalable systems that manage and process...


  • San Francisco, California, United States Scale AI Full time

    Company Overview: Scale AI is at the forefront of powering artificial intelligence and large language models across various industries. Our hypothesis is that exceptional human beings are necessary to train these models. Humans play a crucial role in providing high-quality training data for these models, and Scale AI operates the largest network of humans...


  • San Francisco, California, United States MongoDB Full time

    MongoDB is a leading developer data platform empowering innovators to create, transform, and disrupt industries by unleashing the power of software and data. We enable organizations of all sizes to build modern applications with ease. The Atlas Online Archive service provides low-cost, tiered storage for querying infrequently-accessed, read-only data. By...


  • San Francisco, California, United States Scale AI Full time

    About Scale: Scale AI is transforming how organizations build and deploy artificial intelligence. Our mission is to accelerate the development of AI applications across every industry. We're a trusted partner for leading companies, government agencies, and enterprises, powering the world's most advanced large language models, generative models, and computer...


  • San Francisco, California, United States Scale AI Full time

    About Data Engine: The data we are producing is some of the most important work for how humanity will interact with AI. We have been building web-based applications that help measure our contributors' quality. As a Software Engineer on our team, you'll focus on building systems that monitor and flag quality issues with large-scale data collections. You will...


  • San Diego, California, United States Apple Full time

    About the Role: This is an exciting opportunity to join Apple's Data Analytics team as a Site Reliability Engineer, Data Analytics. As a key member of our team, you will be responsible for ensuring the reliability and scalability of our data infrastructure. Responsibilities: Build, monitor, and troubleshoot complex data infrastructure at the petabyte scale. Support...


  • San Francisco, California, United States OpenAI Full time

    We are seeking an experienced Senior Software Engineer to lead our Data Acquisition team. The ideal candidate will have a strong background in large-scale distributed systems and data processing. The successful candidate will own and lead engineering projects in the area of data acquisition, including web crawling, data ingestion, and search. They will...


  • San Francisco, California, United States Saxon Global Full time

    About Saxon Global: Saxon Global is a dynamic company seeking an experienced Data Systems Specialist to join its team. This role offers a competitive salary of $110,000 per annum, based on the location in San Francisco, California. Job Description: This position involves leading the writing and maintenance of program documentation, as well as partnering on the...

  • AI Data Engineer

    3 days ago


    San Francisco, California, United States Scale AI Full time

    About Scale: We are a leading AI data foundry, accelerating the development of AI applications. Our mission is to make this happen faster across every industry. Our Generative AI Data Engine powers the world's most advanced LLMs and generative models through world-class RLHF (Reinforcement Learning from Human Feedback), human data generation, model evaluation,...


  • San Francisco, California, United States Braintrust Data Full time

    About Braintrust Data: We are a cutting-edge developer platform for building world-class AI products. Our innovative approach combines code and datasets, incrementally refining both using frequent evaluations. We provide a rich set of tools to visualize changes and interrogate failures, empowering developers to integrate our platform into their continuous...