Software Engineer, Distributed Systems
1 month ago
About the Team
The Platform Runtime team builds the low-level framework components to power our ML training systems. We work on building robust, scalable, high-performance components to support our distributed training workloads. Our priorities are to maximize the productivity of our researchers and our hardware, with the goal of accelerating progress towards AGI.
About the Role
As a Distributed Systems engineer, you will deliver powerful APIs that orchestrate thousands of computers moving and persisting vast amounts of data. This requires providing easy-to-use, introspectable systems that promote a fast debugging and development cycle, while scaling that experience to our newest supercomputers and maintaining stability and performance throughout.
We’re looking for people who love optimizing an end-to-end system and who understand high-performance I/O well enough to maximize performance both locally and when distributed across our supercomputers. We want someone excited by the rapid pace of responding to the dynamic and evolving needs of our training system architectures.
This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.
In this role, you will:
- Work across our Python and Rust stack
- Profile, optimize, and help design our compute and data capabilities for scale
- Work on deploying our training framework to our latest supercomputers, rapidly responding to the changing shapes and needs of our ML systems
You might thrive in this role if you:
- Have worked on large distributed systems
- Love figuring out how systems work and continuously come up with ideas for how to make them faster while minimizing complexity and maintenance burden
- Have strong software engineering skills and are proficient in Python and Rust, or equivalent languages
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status.