AI Infrastructure Engineer
2 days ago
Are you passionate about designing and building the robust infrastructure that powers cutting-edge AI solutions? Do you thrive on creating scalable, high-performance systems that support AI workloads, from training machine learning models to deploying real-time inference? If you're excited about building the backbone for the future of AI, then our client has the perfect opportunity for you. We’re looking for an AI Infrastructure Engineer (aka The AI Backbone Builder) to design, deploy, and maintain the infrastructure that powers AI innovation.
As an AI Infrastructure Engineer at our client, you’ll play a critical role in building the platforms that support machine learning and AI development across the organization. You’ll work closely with data scientists, software engineers, and DevOps teams to ensure that AI systems run efficiently, securely, and at scale. Your work will enable fast experimentation, seamless deployments, and the continuous delivery of AI models into production.
Key Responsibilities:
- Design and Build AI Infrastructure: Architect and implement scalable infrastructure that supports AI workloads, including machine learning model training, large-scale data processing, and real-time inference. You’ll design solutions that ensure high availability, fault tolerance, and performance optimization.
- Support AI Model Development and Deployment: Collaborate with data scientists and engineers to build pipelines that automate the end-to-end machine learning lifecycle, from data ingestion to model training, deployment, and monitoring. You’ll ensure smooth integration of AI models into production environments.
- Optimize AI Workloads for Performance: Implement strategies to optimize compute resources for AI workloads, including GPU/TPU provisioning, memory management, and parallel processing. You’ll ensure that infrastructure is optimized for the unique demands of AI and machine learning tasks.
- Cloud and On-Premise Infrastructure Management: Manage cloud-based AI platforms (AWS, GCP, Azure) as well as on-premise infrastructure for AI development. You’ll handle everything from infrastructure as code (IaC) to container orchestration (Docker, Kubernetes), ensuring seamless scalability and automation.
- Automation and Continuous Integration/Deployment (CI/CD): Implement and maintain CI/CD pipelines for machine learning models to enable rapid experimentation, testing, and deployment. You’ll automate workflows and model updates, and monitor the performance of AI systems in production.
- Security and Compliance: Ensure that the AI infrastructure complies with security best practices and regulatory requirements. You’ll implement robust access controls, encryption, and other security measures to protect sensitive data and AI models.
- Monitor and Troubleshoot AI Infrastructure: Continuously monitor the health and performance of AI infrastructure, identifying bottlenecks, reducing latency, and troubleshooting issues. You’ll ensure the reliability of systems, optimizing them as AI demands grow.
Required Skills:
- AI Infrastructure Expertise: Deep experience in designing and building infrastructure that supports AI and machine learning workloads. You’re familiar with both cloud and on-premise infrastructure solutions and know how to optimize them for AI.
- Cloud Platforms and Tools: Strong experience with cloud platforms like AWS, GCP, or Azure, particularly with AI services and infrastructure management. You’re comfortable with tools like SageMaker, AI Platform, or Azure ML, as well as container orchestration with Kubernetes.
- Automation and DevOps: Expertise in automating infrastructure provisioning and model deployment using tools such as Terraform, Ansible, Jenkins, or GitLab CI. You’re skilled at managing CI/CD pipelines for AI model deployment.
- GPU/TPU Optimization: Hands-on experience with GPU/TPU optimization for machine learning and deep learning tasks. You understand how to manage compute resources to maximize efficiency for AI workloads.
- Security and Compliance: Strong understanding of security best practices, including data encryption, access management, and compliance with regulations like GDPR and HIPAA.
Educational Requirements:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field. Equivalent experience in AI infrastructure or DevOps is highly valued.
- Certifications in cloud platforms (AWS, GCP, Azure) or DevOps tools are a plus.
Experience Requirements:
- 3+ years of experience in infrastructure engineering, with a focus on building and maintaining AI or machine learning infrastructure in production environments.
- Proven experience with cloud services, containerization, orchestration tools, and optimizing infrastructure for AI workloads.
- Experience working with data scientists and machine learning engineers to support model development, testing, and deployment.
-
AI Infrastructure Engineer
5 days ago
San Francisco, California, United States Naptha AI Full time About Naptha AI: We are seeking exceptional Software Engineering interns to join Naptha AI and contribute to building the future of AI agent infrastructure. This internship offers hands-on experience working with frontier AI technology, backed by industry veterans and technical leaders through NVIDIA Inception, Google for Startups, and Microsoft for Startups. As...
-
AI Cloud Infrastructure Platform Engineer
1 week ago
San Francisco, California, United States Scale AI Full time Cloud AI Engineer Position at Scale: We are seeking an experienced Cloud AI Engineer to join our team at Scale, a leading provider of AI solutions. As a Cloud AI Engineer, you will play a key role in designing and developing our cloud infrastructure platforms and systems. The ideal candidate will have extensive experience in software development and a deep...
-
San Francisco, California, United States Together AI Full time Are you a skilled DevOps engineer looking to take your career to the next level? Do you have a passion for designing and building automated infrastructure pipelines? We are seeking a talented Senior DevOps Engineer to join our cloud engineering team at Together AI. About the Role: We are hiring a highly experienced Senior DevOps Engineer to lead the...
-
AI Data Infrastructure Engineer
1 week ago
San Francisco, California, United States Magic AI Full time Company Overview: Magic AI is a cutting-edge technology company dedicated to building safe Artificial General Intelligence (AGI) that accelerates humanity's progress on the world's most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than...
-
San Francisco, California, United States Unum AI Full time At Unum AI, we're revolutionizing data infrastructure with our cutting-edge technology. We're seeking a highly skilled AI Infrastructure Engineer to join our team in designing and implementing next-generation database management systems. About the Role: This is an exciting opportunity for a passionate engineer to orchestrate software development and hardware...
-
AI Infrastructure Architect
3 days ago
San Francisco, California, United States Abridge AI Inc. Full time Abridge AI Inc. is a pioneering force in healthcare technology, utilizing artificial intelligence to empower deeper understanding and improve clinical documentation efficiency. Role Overview: We are seeking an exceptional ML Systems Engineer to join our team, responsible for scaling and deploying machine learning models to handle increasing traffic demands and...
-
AI Infrastructure Engineer
4 weeks ago
San Francisco, United States Unreal Gigs Full time Are you passionate about designing and building the robust infrastructure that powers cutting-edge AI solutions? Do you thrive on creating scalable, high-performance systems that support AI workloads, from training machine learning models to deploying real-time inference? If you're excited about building the backbone for the future of AI, then our client has...
-
Advanced AI Infrastructure Engineer
7 days ago
San Francisco, California, United States Together AI Full time About the Role: We are seeking an experienced Systems Research Engineer to join our team at Together AI. As a key member of our research-driven artificial intelligence company, you will play a crucial role in researching and building the next generation AI platform. Company Overview: Together AI is committed to creating open and transparent AI systems that drive...
-
AI Infrastructure Engineer
2 weeks ago
San Francisco, United States ZipRecruiter Full time Job Description: Are you passionate about designing and building the robust infrastructure that powers cutting-edge AI solutions? Do you thrive on creating scalable, high-performance systems that support AI workloads, from training machine learning models to deploying real-time inference? If you're excited about building the backbone for the future of AI, then...
-
AI Infrastructure Engineer
1 week ago
San Francisco, United States Unreal Gigs Full time Are you passionate about designing and building the robust infrastructure that powers cutting-edge AI solutions? Do you thrive on creating scalable, high-performance systems that support AI workloads, from training machine learning models to deploying real-time inference? If you're excited about building the backbone for the future of AI, then our client has...
-
Senior Software Engineer
4 weeks ago
San Francisco, CA, United States Acceler8 Talent Full time Senior Software Engineer (AI Infrastructure / MLOps) Introduction: We are seeking a Senior Software Engineer (AI Infrastructure / MLOps) to join our team. This role offers a unique opportunity to work on cutting-edge MLOps technologies and develop large-scale web applications for data-centric AI. About the Company: Our team comprises MIT PhDs who have worked...
-
Software Engineering Infrastructure Specialist
3 weeks ago
San Francisco, California, United States Abridge AI Inc. Full time Abridge AI Inc. is a trailblazing organization that empowers deeper understanding in healthcare through innovative AI solutions. Our mission-driven approach has led to the development of industry-leading natural language understanding products. Job Overview: We are seeking a highly skilled Software Engineering Infrastructure Specialist to join our growing team...
-
AI/ML Front End Engineer
2 days ago
San Francisco, CA, United States Abridge AI Inc. Full time Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation efficiencies while enabling clinicians to focus on what matters most: their patients. Our enterprise-grade technology transforms patient-clinician conversations...
-
Software Engineer
1 week ago
San Francisco, California, United States Stack AI Full time About Stack AI: We're a fast-growing startup on a mission to democratize access to Large Language Models. Our user-friendly and intuitive No-Code platform integrates the best AI models, common data sources, and SaaS tools. Our Traction is impressive: launched 8 months ago with over 65,000 users and 300+ paying customers, including public companies and...
-
San Francisco, California, United States Perplexity AI Full time AI-Driven Search Solutions: Technical Lead Position. We're looking for an experienced Senior DevOps Engineer to join our team at Perplexity AI. As a key member of our infrastructure team, you'll play a crucial role in shaping the technical direction and implementing scalable solutions for our rapidly growing search platform. Technical Requirements: You will be...
-
AI Engineering Lead
1 week ago
San Francisco, California, United States Avala AI Full time Unlock Your Potential as an AI Engineering Lead. Avala AI is a cutting-edge technology company that empowers communities through dignified digital work. We believe in connecting people to equitable wages, ensuring the highest quality of service for our customers and the highest quality of life for our team. We are seeking an experienced Full Stack Engineer who...
-
AI Infrastructure Engineering Director
1 week ago
San Francisco, California, United States ZipRecruiter Full time Exciting Opportunity at ZipRecruiter. About the Role: We are seeking an exceptional AI Infrastructure Engineering Director to join our team. As a key member of our infrastructure engineering group, you will be responsible for leading the design, development, and optimization of our machine learning infrastructure solutions. The ideal candidate will have a strong...
-
AI Infrastructure Specialist
7 days ago
San Francisco, California, United States Unreal Gigs Full time Design and Build AI Infrastructure: Architect and implement scalable infrastructure that supports AI workloads, including machine learning model training, large-scale data processing, and real-time inference. As an AI Infrastructure Engineer, you'll design solutions that ensure high availability, fault tolerance, and performance optimization.
-
AI Infrastructure Engineering Manager
5 days ago
San Francisco, California, United States Unreal Gigs Full time Job Overview: We are seeking an experienced Cloud and Machine Learning Architect to lead our AI infrastructure engineering initiatives. As a key member of our team, you will design, develop, and optimize scalable and reliable infrastructure solutions to support machine learning workflows.
-
San Francisco, California, United States Abridge AI Full time Unlock the Future of Healthcare with Abridge AI. Abridge AI is a pioneering organization dedicated to revolutionizing medical conversations through AI. We are seeking an experienced Full Stack Software Engineer to join our team and help us build innovative solutions for healthcare professionals. About the Role: This position offers a unique opportunity to design,...