Data Infrastructure Engineer

15 hours ago


San Francisco, United States OpenAI Full time

About the Team

You’ll join the team behind OpenAI’s data infrastructure, which powers the critical engineering and product teams at the core of our work. The systems we support include our data warehouse, batch compute infrastructure, streaming infrastructure, data orchestration system, data lake, vector databases, critical integrations, and more.

About the Role

The Applied Data Platform team designs, builds, and operates the foundational data infrastructure that enables products and teams at OpenAI.

You are comfortable with work such as scaling Kubernetes services, OLAP systems, debugging Kafka consumer lag, diagnosing distributed key-value store failures, and designing a system to retrieve image vectors with low latency.

You are well versed in infrastructure tooling such as Terraform, have worked with Kubernetes, and possess SRE skills.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, and streaming infrastructure while ensuring scalability, reliability, and security.

  • Ensure our data platform can scale reliably to the next several orders of magnitude.

  • Accelerate company productivity by empowering your fellow engineers and teammates with excellent data tooling and systems, providing a best-in-class experience.

  • Bring new features and capabilities to the world by partnering with product engineers, trust & safety, and other teams to build the technical foundations.

  • Share responsibility for the reliability of the systems we build, as all teams here do. This includes joining an on-call rotation to respond to critical incidents as needed.

You might thrive in this role if you:

  • Have 4+ years in data infrastructure engineering OR

  • Have 4+ years in infrastructure engineering with a strong interest in data.

  • Take pride in building and operating scalable, reliable, secure systems.

  • Are comfortable with ambiguity and rapid change.

  • Have a voracious and intrinsic desire to learn and fill in missing skills—and an equally strong talent for sharing learnings clearly and concisely with others.

Some of the technologies you’ll be working with include Apache Spark, ClickHouse, Python, Terraform, Kafka, Azure Event Hubs, and vector databases.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status.

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

