Software Engineer, Distributed Systems

2 months ago


San Francisco, United States OpenAI Full time
About the Team

The Platform Runtime team builds the low-level framework components that power our ML training systems. We build robust, scalable, high-performance components to support our distributed training workloads. Our priorities are to maximize the productivity of our researchers and our hardware, with the goal of accelerating progress towards AGI.

About the Role

As a Distributed Systems engineer, you will deliver powerful APIs that orchestrate thousands of computers moving and persisting vast amounts of data. This requires providing easy-to-use, introspectable systems that promote a fast debugging and development cycle, while also scaling that experience to our newest supercomputers and maintaining stability and performance throughout.

We're looking for people who love optimizing an end-to-end system and who understand high-performance I/O well enough to maximize performance both locally and distributed across our supercomputers. We want someone excited by the rapid pace of responding to the dynamic and evolving needs of our training system architectures.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:
  • Work across our Python and Rust stack
  • Profile, optimize, and help design our compute and data capabilities for scale
  • Deploy our training framework to our latest supercomputers, responding rapidly to the changing shapes and needs of our ML systems

You might thrive in this role if you:
  • Have worked on large distributed systems
  • Love figuring out how systems work and continuously come up with ideas for how to make them faster while minimizing complexity and maintenance burden
  • Have strong software engineering skills and are proficient in Python and Rust or equivalent


About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
