Data Infrastructure Engineer

1 week ago


San Francisco, United States OpenAI Full time

About the Team

You’ll join the team behind OpenAI’s data infrastructure, which powers critical engineering and product teams that are core to the work we do at OpenAI. The systems we support include our data warehouse, batch compute infrastructure, streaming infrastructure, data orchestration system, data lake, vector databases, critical integrations, and more.

About the Role

The Applied Data Platform team designs, builds, and operates the foundational data infrastructure that enables products and teams at OpenAI.

You are comfortable with work such as scaling Kubernetes services and OLAP systems, debugging Kafka consumer lag, diagnosing distributed key-value store failures, and designing a system to retrieve image vectors with low latency.

You are well versed in infrastructure tooling such as Terraform, have worked with Kubernetes, and have SRE skills.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, and streaming infrastructure while ensuring scalability, reliability, and security.

  • Ensure our data platform can scale reliably to the next several orders of magnitude.

  • Accelerate company productivity by empowering your fellow engineers and teammates with excellent data tooling and systems, providing a best-in-class experience.

  • Bring new features and capabilities to the world by partnering with product engineers, trust & safety, and other teams to build the technical foundations.

  • Share responsibility for the reliability of the systems we build, as all teams at OpenAI do. This includes joining an on-call rotation to respond to critical incidents as needed.

You might thrive in this role if you:

  • Have 4+ years in data infrastructure engineering, or 4+ years in infrastructure engineering with a strong interest in data.

  • Take pride in building and operating scalable, reliable, secure systems.

  • Are comfortable with ambiguity and rapid change.

  • Have a voracious and intrinsic desire to learn and fill in missing skills—and an equally strong talent for sharing learnings clearly and concisely with others.

Some of the technologies you’ll be working with include Apache Spark, ClickHouse, Python, Terraform, Kafka, Azure Event Hubs, and vector databases.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status.

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
