Software Engineer, AI Infrastructure
2 weeks ago
Who We Are
WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging, and immersive.
Role overview: The Software Engineer, AI Infrastructure (Training + Inference) will design, build, and optimize the infrastructure that powers our large-scale training and real-time inference pipelines. This role combines expertise in distributed computing, system reliability, and performance optimization. The candidate will collaborate with researchers to build scalable systems that support novel multimodal training, and will maintain uptime to deliver consistent results for real-time applications.
Key Responsibilities
Infrastructure Development: Design and implement infrastructure to support large-scale AI training and real-time inference with a focus on multimodal inputs.
Distributed Computing: Build and maintain distributed systems to ensure scalability, efficient resource allocation, and high throughput.
Training Stability: Monitor and enhance the stability of training workflows by addressing bottlenecks, failures, and inefficiencies in large-scale AI pipelines.
Real-time Inference Optimization: Develop and optimize real-time inference systems to deliver low-latency, high-throughput results across diverse applications.
Uptime & Reliability: Implement tools and processes to maintain high uptime and ensure infrastructure reliability during both training and inference phases.
Performance Tuning: Identify and resolve performance bottlenecks, improving overall system throughput and response times.
Collaboration: Work closely with research and engineering teams to integrate infrastructure with AI workflows, ensuring seamless deployment and operation.
Required Skills & Qualifications
Distributed Systems Expertise: Proven experience in designing and managing distributed systems for large-scale AI training and inference.
Infrastructure for AI: Strong background in building and optimizing infrastructure for real-time AI systems, with a focus on multimodal data (audio + text).
Performance Optimization: Expertise in optimizing resource utilization, improving system throughput, and reducing latency in both training and inference.
Training Stability: Experience in troubleshooting and stabilizing AI training pipelines for high reliability and efficiency.
Technical Proficiency: Strong programming skills (Python preferred), proficiency with PyTorch, and familiarity with cloud platforms (AWS, GCP, Azure).
Minimum Experience
4-5 years of relevant professional experience is required.
-
Software Engineer, AI Infrastructure
4 weeks ago
San Francisco, California, United States Waveforms AI, Inc Full time
Job title: Software Engineer, AI Infrastructure (Training + Inference) / Member of Technical Staff
Who We Are: WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions making them more natural, engaging and immersive. Role...
-
Software Engineer, Infrastructure
3 weeks ago
San Francisco, California, United States Runloop AI, Inc Full time
About Runloop: Runloop is pioneering the next generation of AI-driven software engineering. Our platform empowers developers to build, scale, and optimize AI-powered coding solutions, accelerating the future of software development. We're a small team of former Google and Stripe engineers dedicated to solving the complex challenges of productionizing AI for...
-
Software Engineer, Infrastructure
3 days ago
San Francisco, California, United States Runloop AI, Inc Full time
About Runloop: Runloop is pioneering the next generation of AI-driven software engineering. Our platform empowers developers to build, scale, and optimize AI-powered coding solutions, accelerating the future of software development. We're a small team of former Google and Stripe engineers dedicated to solving the complex challenges of productionizing AI for...
-
Software Engineer, Infrastructure
4 weeks ago
San Francisco, California, United States Runloop AI, Inc Full time
About Runloop: Runloop is pioneering the next generation of AI-driven software engineering. Our platform empowers developers to build, scale, and optimize AI-powered coding solutions, accelerating the future of software development. We're a small team of former Google and Stripe engineers dedicated to solving the complex challenges of productionizing AI for...
-
Software Engineer, Infrastructure
2 weeks ago
San Francisco, California, United States Runloop AI, Inc Full time
About Runloop: Runloop is pioneering the next generation of AI-driven software engineering. Our platform empowers developers to build, scale, and optimize AI-powered coding solutions, accelerating the future of software development. We're a small team of former Google and Stripe engineers dedicated to solving the complex challenges of productionizing AI for...
-
Distributed AI Infrastructure Engineer
7 days ago
San Francisco, California, United States Together AI Full time
About Together AI: We are a research-driven artificial intelligence company. Our mission is to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. Our team has made significant contributions to open-source research, models, and datasets that advance the frontier of AI. We invite you to join our...
-
Staff Software Engineer — Infrastructure
2 weeks ago
San Francisco, California, United States Snorkel AI Full time
Hybrid / San Francisco, CA or Redwood City, CA
We're on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has gone through incredible change between 2016, when Snorkel started as a research project in the Stanford AI Lab, to the generative AI breakthroughs of...
-
Software Engineer, Infrastructure
2 weeks ago
San Francisco, California, United States Skild AI Full time
Company Overview: At Skild AI, we are building the world's first general purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for the widespread deployment of robots within society. Our team consists of individuals...
-
Senior Software Engineer, Infrastructure
6 days ago
San Francisco, California, United States Together AI Full time
As a Senior Infrastructure Software Engineer, you will focus on automating infrastructure installations and decommissions at scale. You will build tools to constantly improve our scale and speed of deployment. You will nurture a passion for an "automate everything" approach that makes systems failure-resistant and ready-to-scale. Your work will enable our...
-
Senior AI Infrastructure Engineer
4 weeks ago
San Francisco, California, United States Together AI Full time
As a Senior AI Infrastructure Engineer, you will be responsible for building the next generation, highly available, global, multi-cloud PaaS platform with open-source technologies to enable and accelerate Together AI's rapid growth. This system spans many diverse environments (Kubernetes, VMs, bare metal compute, and edge deployments) and provides a cohesive...