Software Engineer, LLM Platform

2 days ago


San Jose, United States Acceler8 Talent Full time



A leading AI solutions company is seeking an experienced Software Engineer, LLM Platform to join their R&D team in Menlo Park. This is an exciting opportunity to work on cutting-edge large language model (LLM) technologies while contributing to mission-critical platforms used by enterprise customers. If you’re passionate about building scalable, reliable systems and want to be part of a dynamic, innovative team, this role could be a great fit for you.


This organization is focused on providing enterprises with the tools to create their own Expert AI. The company’s platform enables customers to train and deploy custom models on their own data, with an emphasis on enterprise-grade security, flexibility, and minimal hallucination. Their team is made up of engineers and researchers working on highly impactful technologies, and the company is backed by top-tier VCs and tech firms. The Software Engineer, LLM Platform will play a crucial role in the development and maintenance of these next-gen AI systems.


As a Software Engineer, LLM Platform, you’ll be responsible for the design, implementation, and maintenance of LLM platforms running on Kubernetes. You will work on complex distributed systems, debug challenging issues across Kubernetes clusters, and ensure that customer-facing systems operate seamlessly. This is a hands-on, problem-solving role that demands strong technical expertise and a proactive mindset. You will also collaborate with cross-functional teams, including ML engineers, app engineers, and product managers, to deliver high-quality platform features.


What we can offer you:

  • Equity and benefits as part of the total compensation package
  • Opportunity to work with top engineers on cutting-edge AI platforms
  • Collaborative and inclusive team environment
  • Access to state-of-the-art infrastructure and tools
  • A fast-paced, mission-driven work culture focused on innovation and impact


Key Responsibilities:

  • Design, implement, and maintain an LLM platform on Kubernetes, supporting LLM tuning and inference workloads
  • Troubleshoot complex distributed system problems across Kubernetes environments, often without direct access to them
  • Provide quick and effective responses to customer issues, ensuring a high level of satisfaction
  • Manage internal GPU fleet and optimize cluster resources in the data center
  • Collaborate with engineering and product teams to define and implement platform features
  • Write clear, concise documentation to assist customers in using the product efficiently


The ideal candidate will have proficiency in Python (or similar programming languages) and be familiar with Kubernetes, distributed systems, and machine learning workflows. Experience with LLM training, inference, and RAG systems will be advantageous. If you have built or worked on platforms for large-scale AI applications, we would like to hear from you. A strong understanding of open-source tools, GPU management, and CI/CD pipelines will also help you thrive in this role.



  • San Francisco Bay Area, United States Acceler8 Talent Full time

    Software Engineer, LLM Platform: A leading AI solutions company is seeking an experienced Software Engineer, LLM Platform to join their R&D team in Menlo Park. This is an exciting opportunity to work on cutting-edge large language model (LLM) technologies while contributing to mission-critical platforms used by enterprise customers. If you’re passionate...




  • San Francisco, United States Acceler8 Talent Full time

    Introduction: As a Distributed Systems Engineer – LLM Platform, you’ll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity. About the Company: A leader in AI innovation, this company empowers...


  • San Francisco Bay Area, United States Acceler8 Talent Full time

    Introduction: As a Distributed Systems Engineer – LLM Platform, you’ll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity. About the Company: A leader in AI innovation, this company empowers...


  • San Francisco, United States Waveforms Full time

    Job title: Software Engineer, LLM Inference Engine and Product / Member of Technical Staff. Who We Are: WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging and immersive. Role overview: The...


  • San Francisco, California, United States ChipStack Full time

    About the Role: We're looking for a highly skilled Senior Software Engineer to join our team and contribute to the development of our LLM-driven chip design platform. As a key member of our founding team, you'll be responsible for designing and developing high-performance, scalable software systems for LLM-powered chip design workflows. Your Key...


  • San Francisco, United States Rungalileo Full time

    Galileo is a late-stage Series A business, founded in 2021 by Engineering Leaders from Google AI and Uber AI. Backed by tier 1 investors such as Battery Ventures and Walden Catalyst and over $23M in funding, Galileo is poised to build an enduring business centered around AI/ML on unstructured data. The founding team spent years building cutting-edge Machine...

  • Software Engineer

    1 week ago


    San Francisco, United States Alldus Full time

    My client is searching for a talented engineer to work on ML/LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines. We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you...



  • San Francisco, United States ChipStack Full time

    Are you a strong SWE wanting to build impactful generative AI products and be at a fast-growing startup? Read on! About Us: At ChipStack, we're on a mission to revolutionize chip design using the power of AI. Think of us as the future of silicon development, bringing cutting-edge LLMs to a traditionally hardware-focused industry. We're a fast-growing startup...


  • San Francisco, United States Tbwa ChiatDay Inc Full time

    Software is eating the world, but AI is eating software. We live in unprecedented times – AI has the potential to exponentially augment human intelligence. Every person will have a personal tutor, coach, assistant, personal shopper, travel guide, and therapist throughout life. As the world adjusts to this new reality, leading platform companies are...

  • Software Engineer

    2 days ago


    San Francisco, United States Refuel AI Full time

    About Refuel.ai. The Mission: Great companies are built on great data. The most successful companies - think Amazon, Google and Meta - employ thousands of data scientists, and spend billions on infrastructure and human operations to solve this problem today. Refuel's platform enables enterprises to clean, enrich and label their mountains of messy,...

  • Software Engineer

    2 days ago


    San Francisco, United States Refuel.ai, Inc. Full time

    About Refuel.ai. The Mission: Great companies are built on great data. The most successful companies - think Amazon, Google and Meta - employ thousands of data scientists, and spend billions on infrastructure and human operations to solve this problem today. Refuel’s platform enables enterprises to clean, enrich and label their mountains of messy, unstructured...


  • San Francisco, United States Yurts Full time

    Company Overview: At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior Software Engineer for our Platform...


  • San Mateo, United States Snowflake Full time

    Build the future of the AI Data Cloud. Join the Snowflake team. The Snowflake Machine Learning Platform team’s mission is to enable customers to bring their machine learning and deep learning workloads to Snowflake. Our customers want to build powerful models with the ever-increasing data in Snowflake but face several challenges including infrastructure...


  • San Mateo, United States Snowflake Computing Full time

    Build the future of the AI Data Cloud. Join the Snowflake team. The Snowflake Machine Learning Platform team's mission is to enable customers to bring their machine learning and deep learning workloads to Snowflake. Our customers want to build powerful models with the ever-increasing data in Snowflake but face several challenges including infrastructure...


  • San Francisco, United States Tbwa ChiatDay Inc Full time

    At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior Software Engineer for our Platform team who possesses...


  • San Francisco, United States Yurts Full time

    Job Description. Company Overview: At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior...


  • San Jose, United States Triunity Software Full time

    Title: Lead ML Engineer (Remote) W2 Only. Job Description: We are seeking a talented Technical Lead to drive development and adoption of AI Solutions. In this role, you will contribute to the product roadmap, product design, development, and onboarding of users to the platform. Your primary responsibility will be to lead development & adoption of Generative AI...


  • San Francisco, United States Tbwa ChiatDay Inc Full time

    Software Engineer, ML Infrastructure - Evaluation Platform. As a software engineer on the ML Infrastructure team, you will work on developing the platform for orchestrating post-training and model evaluation jobs. At Scale, we are constantly developing new data sources and running experiments to understand their impact on ML models. To support this effort, we...