Data Infrastructure Engineer

24 hours ago


San Francisco CA United States OpenAI Full time

About the Team

You’ll join the team behind OpenAI’s data infrastructure, which powers the critical engineering and product teams at the core of the work we do at OpenAI. The systems we support include our data warehouse, batch compute infrastructure, streaming infrastructure, data orchestration system, data lake, vector databases, critical integrations, and more.

About the Role

The Applied Data Platform team designs, builds, and operates the foundational data infrastructure that enables products and teams at OpenAI.

You are comfortable with work such as scaling Kubernetes services and OLAP systems, debugging Kafka consumer lag, diagnosing distributed key-value store failures, and designing a system to retrieve image vectors with low latency.

You are well versed in infrastructure tooling such as Terraform, have worked with Kubernetes, and have SRE skills.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, and streaming infrastructure while ensuring scalability, reliability, and security.

  • Ensure our data platform can scale reliably to the next several orders of magnitude.

  • Accelerate company productivity by empowering your fellow engineers and teammates with excellent data tooling and systems, providing a best-in-class experience.

  • Bring new features and capabilities to the world by partnering with product engineers, trust & safety, and other teams to build the technical foundations.

  • Share responsibility, as all teams here do, for the reliability of the systems we build. This includes joining an on-call rotation to respond to critical incidents as needed.

You might thrive in this role if you:

  • Have 4+ years of experience in data infrastructure engineering, OR

  • Have 4+ years of experience in infrastructure engineering with a strong interest in data.

  • Take pride in building and operating scalable, reliable, secure systems.

  • Are comfortable with ambiguity and rapid change.

  • Have a voracious and intrinsic desire to learn and fill in missing skills, and an equally strong talent for sharing what you learn clearly and concisely with others.

Some of the technologies you’ll be working with include Apache Spark, ClickHouse, Python, Terraform, Kafka, Azure Event Hubs, and vector databases.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status.

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

