Anyscale Distributed Training Infrastructure Engineer

20 hours ago


San Francisco, California, United States Anyscale Full time

We are seeking a talented Distributed Training Infrastructure Engineer to join our team at Anyscale. This role is ideal for a passionate software engineer with a strong foundation in algorithms, data structures, and system design.

  • In this role, you will develop scalable, fault-tolerant distributed machine learning libraries that power leading ML platforms.
  • You will create an exceptional end-to-end experience for training machine learning models.
  • You will solve complex architectural challenges and turn them into practical solutions.

Qualifications:

  • 5+ years of experience building, scaling, and maintaining software systems in production environments.
  • Strong fundamentals in algorithms, data structures, and system design.
  • Proficiency with machine learning frameworks and libraries (e.g., PyTorch, TensorFlow, XGBoost).
  • Experience designing fault-tolerant distributed systems.

The estimated salary range for this role is $170,112 to $237,000.

We offer a range of benefits, including stock options, healthcare plans, a 401(k) retirement plan, a wellness stipend, an education stipend, paid parental leave, fertility benefits, flexible time off, commute reimbursement, and 100% coverage of in-office meals.



  • San Francisco, California, United States Anyscale Full time

    About Anyscale: We're on a mission to make distributed computing accessible to software developers of all skill levels. Our platform commercializes Ray, an open-source project creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, and Instacart use Ray in their tech stacks to accelerate AI applications. The...


  • San Francisco, California, United States Anyscale Full time

    About Anyscale: We're a leading provider of distributed computing solutions, dedicated to empowering software developers with accessible and scalable tools. Our mission is to democratize distributed computing and make it accessible to developers of all skill levels. We're commercializing Ray, a popular open-source project that's creating an ecosystem of...


  • San Francisco, California, United States Anyscale Full time

    We're seeking a skilled Distributed Systems Architect to join our team at Anyscale. Our mission is to democratize distributed computing and make it accessible to software developers of all skill levels. As a Distributed Systems Architect, you'll play a key role in building the best place to run Ray, a popular open-source project that's creating an ecosystem...


  • San Francisco, California, United States Anyscale Full time

    Are you a skilled Machine Learning Systems Architect looking for a new challenge? We are seeking a talented individual to join our team at Anyscale. In this role, you will drive the development and optimization of Ray's distributed training libraries, focusing on features and performance enhancements for large-scale machine learning workloads. You will build...


  • San Francisco, California, United States Anyscale Full time

    About the Company: Anyscale is a cutting-edge technology company dedicated to democratizing distributed computing and making it accessible to developers of all skill levels. We're a leader in the field, commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. With Anyscale, you'll have the...


  • San Francisco, California, United States Anyscale Full time

    We are seeking a strong leader to grow and activate the Ray community, focusing on AI infrastructure and applications. The successful candidate will develop a strategy for our Developer Advocacy function, design instrumentation, collaborate with users, build a vibrant developer community, and lead in-person efforts. Key qualifications include extensive...


  • San Francisco, California, United States Anyscale Full time

    We are looking for a highly skilled Senior Software Engineer - Distributed Machine Learning to join our team at Anyscale. In this role, you will contribute to and engage with the open-source community, collaborating with ML researchers, engineers, and data scientists to build new scalable machine learning abstractions. You will share your work and expertise...


  • San Francisco, California, United States Anyscale Full time

    About Us: Anyscale is a company that's passionate about democratizing distributed computing. We're committed to making it accessible to software developers of all skill levels. The Job: We're seeking a strong ML Engineer and Researcher to join our team. In this role, you'll collaborate with our Ray Core and Ray Train teams to adapt and optimize Ray for efficient,...


  • San Francisco, California, United States Anyscale Full time

    Role Overview: This is a critical role at Anyscale as it allows us to provide market-leading performance and price point for AI infrastructure. As a Large Scale Inference Specialist, you will help push the boundaries of performance for inference at large scale. This involves iterating quickly with product teams to ship end-to-end solutions for Batch and...


  • San Francisco, California, United States Anyscale Full time

    About Anyscale: We are a pioneering company in democratizing distributed computing, making it accessible to software developers of all skill levels. Our mission is to create an ecosystem of scalable machine learning libraries through the commercialization of Ray. Anyscale's vision is to build the best platform for running Ray, empowering any developer or data...

  • Scalable AI Architect

    3 weeks ago


    San Francisco, California, United States Anyscale Full time

    About Anyscale: Anyscale is a pioneering technology company that empowers software developers to harness the full potential of distributed computing. By commercializing Ray, an open-source project, we're creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, and Cruise trust Ray as a critical...


  • San Francisco, California, United States Anyscale Full time

    About Anyscale: Anyscale is revolutionizing AI governance by harnessing the potential of Generative AI, empowering leaders to adopt and deploy AI responsibly and efficiently. We provide AI platform leaders with critical guardrails (a secure foundation, flexible controls, and seamless integrations), all without compromising developer...


  • San Francisco, California, United States Anyscale Full time

    About Anyscale: We are on a mission to make distributed computing accessible to software developers of all skill levels. Our platform is commercializing Ray, a popular open-source project for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, and Cruise use Ray in their tech stacks to accelerate AI applications. Our goal is to build the...


  • San Francisco, California, United States Anyscale Full time

    About Anyscale: Anyscale is revolutionizing the way companies approach distributed computing. Our mission is to make it accessible to software developers of all skill levels, and we're achieving this through our commercialization of Ray, a popular open-source project. We've partnered with leading companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many...


  • San Francisco, California, United States Anyscale Full time

    About Us: Anyscale is a company on a mission to democratize distributed computing, making it accessible to software developers of all skill levels. We are commercializing Ray, a popular open-source project that's creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, Cruise, and many more have Ray...


  • San Francisco, California, United States Discord Full time

    Real-Time Engineering at Discord: We're a company that's passionate about making it easier and more fun for people to talk and hang out before, during, and after playing games. Our team uses a lot of open source technologies, and contributes back too. This role will involve building and operating large-scale, reliable and performant distributed systems to...


  • San Francisco, California, United States Cloudflare Inc Full time

    About the Team: Our team of Distributed Infrastructure Developers is responsible for building and maintaining the distributed systems that power Cloudflare's global network. We are seeking a talented engineer to join our team and help us tackle some of the most challenging problems in distributed systems. As a member of our team, you will have the opportunity...


  • San Francisco, California, United States Tbwa ChiatDay Inc Full time

    At Skild AI, we're revolutionizing the field of artificial intelligence by creating a robust and adaptable robotic intelligence that can thrive in unpredictable environments. Our team is comprised of individuals with diverse backgrounds and experiences, from recent graduates to domain experts. While relevant industry experience is valuable, we place greater...


  • San Francisco, California, United States Mixpanel Full time

    About Mixpanel: We are a leading product analytics software company, helping businesses answer critical questions about their products. Our event-based tracking solution enables teams to gain insights into user behavior across web and mobile platforms. We serve nearly 7,000 customers worldwide through seven offices globally. The Role: We are seeking an experienced...


  • San Francisco, California, United States Airtable Full time

    Airtable Overview: Airtable is the no-code app platform that empowers people closest to the work to accelerate their most critical business processes. We are looking for experienced engineers who can help improve our infrastructure and build systems with a great developer experience. Airtable's infrastructure is evolving to meet the needs of our fast-growing...