Software Engineer, LLM Platform
2 days ago
A leading AI solutions company is seeking an experienced Software Engineer, LLM Platform to join their R&D team in Menlo Park. This is an exciting opportunity to work on cutting-edge large language model (LLM) technologies while contributing to mission-critical platforms used by enterprise customers. If you’re passionate about building scalable, reliable systems and want to be part of a dynamic, innovative team, this role could be a great fit for you.
This organization is focused on providing enterprises with the tools to create their own Expert AI. The company’s platform enables customers to train and deploy custom models on their own data, with an emphasis on enterprise-grade security, flexibility, and minimal hallucination. Their team is made up of engineers and researchers working on highly impactful technologies, and the company is backed by top-tier VCs and tech firms. The Software Engineer, LLM Platform will play a crucial role in the development and maintenance of these next-gen AI systems.
As a Software Engineer, LLM Platform, you’ll be responsible for the design, implementation, and maintenance of LLM platforms running on Kubernetes. You will work on complex distributed systems, debug challenging issues across Kubernetes clusters, and ensure that customer-facing systems operate seamlessly. This is a hands-on, problem-solving role that demands strong technical expertise and a proactive mindset. You will also collaborate with cross-functional teams, including ML engineers, app engineers, and product managers, to deliver high-quality platform features.
What we can offer you:
- Equity and benefits as part of the total compensation package
- Opportunity to work with top engineers on cutting-edge AI platforms
- Collaborative and inclusive team environment
- Access to state-of-the-art infrastructure and tools
- A fast-paced, mission-driven work culture focused on innovation and impact
Key Responsibilities:
- Design, implement, and maintain an LLM platform on Kubernetes, supporting LLM tuning and inference workloads
- Troubleshoot complex distributed system problems across Kubernetes environments, often without direct access to those environments
- Provide quick and effective responses to customer issues, ensuring a high level of satisfaction
- Manage the internal GPU fleet and optimize cluster resources in the data center
- Collaborate with engineering and product teams to define and implement platform features
- Write clear, concise documentation to assist customers in using the product efficiently
The ideal candidate will have proficiency in Python (or similar programming languages) and be familiar with Kubernetes, distributed systems, and machine learning workflows. Experience with LLM training, inference, and RAG systems will be advantageous. If you have built or worked on platforms for large-scale AI applications, we would like to hear from you. A strong understanding of open-source tools, GPU management, and CI/CD pipelines will also help you thrive in this role.
-
Software Engineer, LLM Platform
2 days ago
San Francisco Bay Area, United States Acceler8 Talent Full time
Software Engineer, LLM Platform
A leading AI solutions company is seeking an experienced Software Engineer, LLM Platform to join their R&D team in Menlo Park. This is an exciting opportunity to work on cutting-edge large language model (LLM) technologies while contributing to mission-critical platforms used by enterprise customers. If you’re passionate...
-
Distributed Systems Engineer
2 days ago
San Francisco, United States Acceler8 Talent Full time
Introduction: As a Distributed Systems Engineer – LLM Platform, you’ll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity.
About the Company: A leader in AI innovation, this company empowers...
-
Distributed Systems Engineer
2 days ago
San Francisco Bay Area, United States Acceler8 Talent Full time
Introduction: As a Distributed Systems Engineer – LLM Platform, you’ll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity.
About the Company: A leader in AI innovation, this company empowers...
-
San Francisco, United States Waveforms Full time
Job title: Software Engineer, LLM Inference Engine and Product / Member of Technical Staff
Who We Are: WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging and immersive.
Role overview: The...
-
High-Performance Software Developer for LLMs
7 days ago
San Francisco, California, United States ChipStack Full time
About the Role: We're looking for a highly skilled Senior Software Engineer to join our team and contribute to the development of our LLM-driven chip design platform. As a key member of our founding team, you'll be responsible for designing and developing high-performance, scalable software systems for LLM-powered chip design workflows.
Your Key...
-
Senior Software Engineer, Platform
2 days ago
San Francisco, United States Rungalileo Full time
Galileo is a late-stage Series A business, founded in 2021 by Engineering Leaders from Google AI and Uber AI. Backed by tier 1 investors such as Battery Ventures and Walden Catalyst and over $23M in funding, Galileo is poised to build an enduring business centered around AI/ML on unstructured data. The founding team spent years building cutting-edge Machine...
-
Software Engineer
1 week ago
San Francisco, United States Alldus Full time
My client is searching for a talented engineer to work on ML/LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.
We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you...
-
ChipStack | Senior Software Engineer
1 week ago
San Francisco, United States ChipStack Full time
Are you a strong SWE wanting to build impactful generative AI products and be at a fast-growing startup? Read on!
About Us: At ChipStack, we're on a mission to revolutionize chip design using the power of AI. Think of us as the future of silicon development, bringing cutting-edge LLMs to a traditionally hardware-focused industry. We're a fast-growing startup...
-
Software Engineer, Platform Engineering
12 hours ago
San Francisco, United States Tbwa ChiatDay Inc Full time
Software is eating the world, but AI is eating software. We live in unprecedented times – AI has the potential to exponentially augment human intelligence. Every person will have a personal tutor, coach, assistant, personal shopper, travel guide, and therapist throughout life. As the world adjusts to this new reality, leading platform companies are...
-
Software Engineer
2 days ago
San Francisco, United States Refuel AI Full time
About Refuel.ai
The Mission: Great companies are built on great data. The most successful companies - think Amazon, Google and Meta - employ thousands of data scientists, and spend billions on infrastructure and human operations to solve this problem today. Refuel's platform enables enterprises to clean, enrich and label their mountains of messy,...
-
Software Engineer
2 days ago
San Francisco, United States Refuel.ai, Inc. Full time
About Refuel.ai
The Mission: Great companies are built on great data. The most successful companies - think Amazon, Google and Meta - employ thousands of data scientists, and spend billions on infrastructure and human operations to solve this problem today. Refuel’s platform enables enterprises to clean, enrich and label their mountains of messy, unstructured...
-
Sr. Software Engineer
2 days ago
San Francisco, United States Yurts Full time
Company Overview: At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior Software Engineer for our Platform...
-
Senior Software Engineer
1 day ago
San Mateo, United States Snowflake Full time
Build the future of the AI Data Cloud. Join the Snowflake team.
The Snowflake Machine Learning Platform team’s mission is to enable customers to bring their machine learning and deep learning workloads to Snowflake. Our customers want to build powerful models with the ever-increasing data in Snowflake but face several challenges including infrastructure...
-
Senior Software Engineer
1 day ago
San Mateo, United States Snowflake Computing Full time
Build the future of the AI Data Cloud. Join the Snowflake team.
The Snowflake Machine Learning Platform team's mission is to enable customers to bring their machine learning and deep learning workloads to Snowflake. Our customers want to build powerful models with the ever-increasing data in Snowflake but face several challenges including infrastructure...
-
Sr. Software Engineer
2 days ago
San Francisco, United States Tbwa ChiatDay Inc Full time
At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior Software Engineer for our Platform team who possesses...
-
Sr. Software Engineer
2 days ago
San Francisco, United States Yurts Full time
Job Description
Company Overview: At Yurts, we are on a mission to revolutionize the world of artificial intelligence and machine learning. We are passionate about pushing the boundaries of technology to build innovative platforms that empower enterprises to leverage Generative AI (LLMs) successfully. We are seeking an exceptional Senior...
-
Immediately Looking For
2 days ago
San Jose, United States Triunity Software Full time
Title: Lead ML Engineer (Remote), W2 Only
Job Description: We are seeking a talented Technical Lead to drive development and adoption of AI Solutions. In this role you will contribute to the product roadmap, product design, development, and onboarding of users to the platform. Your primary responsibility will be to lead development & adoption of Generative AI...
-
Software Engineer, ML Infrastructure
12 hours ago
San Francisco, United States Tbwa ChiatDay Inc Full time
Software Engineer, ML Infrastructure - Evaluation Platform
As a software engineer on the ML Infrastructure team, you will work on developing the platform for orchestrating post-training and model evaluation jobs. At Scale, we are constantly developing new data sources and running experiments to understand their impact on ML models. To support this effort, we...