Principal DevOps Engineer
1 day ago
Principal DevOps Engineer - AI/ML
We are partnered with a Series A AI lab that is hiring a Principal DevOps Engineer.
The lab is backed by leading global VCs and AI research leaders (from OpenAI, DeepMind, Meta, and others) and advised by renowned figures in generative and interactive media, computer graphics, and autonomous systems. The founding team brings deep expertise from frontier AI research and large-scale distributed systems, blending academic excellence with proven startup execution.
As a Principal DevOps Engineer, you will work directly with the founders to architect, build, and scale the compute substrate that powers this next generation of AI. You'll design and optimize the inference platform, GPU-based training clusters, and data processing pipelines that drive real-time creativity and discovery. You'll play a key role in scaling systems for both research and production - ensuring low-latency performance, high availability, and efficient utilization across petabyte-scale data and model-serving workloads.
Key Experience Required
- 5+ years of experience in Software / ML Infrastructure Engineering.
- Deep experience with distributed systems and GPU orchestration for high-performance ML workloads.
- Proficiency in Python, Go, or similar, and strong grasp of software engineering best practices.
- Hands-on expertise with Kubernetes, Docker, and IaC (Terraform).
- Experience optimizing model serving and data pipelines for latency and scalability.
- A builder's mindset - you thrive in ambiguity, pick the right tools for the job, and ship.
This is a chance to join a team working at the frontier of real-time AI systems - please apply ASAP for more info.
-
Lead DevOps Engineer
1 day ago
Menlo Park, CA, United States - Strativ Group - Full time
Lead DevOps Engineer - AI and Machine Learning
Join a dynamic and innovative Series A AI Lab, backed by leading investors and advised by luminaries in generative and interactive media. We are seeking a talented Lead DevOps Engineer. This lab, supported by top global VCs and experts from OpenAI, DeepMind, Meta, and more, represents a fusion of advanced AI...
-
Senior DevOps Engineer/SRE
2 weeks ago
Menlo Park, CA, United States - Saxon Global - Full time
- 7+ years in DevOps/SRE/Platform Engineering
- Strong experience with Kubernetes (EKS), Helm, networking, and security
- Expertise in CI/CD pipelines using Harness and Git
- Proficient in AWS services (EKS, EC2, S3, IAM, RDS, etc.)
- Strong scripting skills in Python or Golang
- Experience with Linux systems, performance tuning, and troubleshooting
- Familiarity with...
-
Principal Software Engineer I
3 days ago
Menlo Park, CA, United States - Snowflake Computing - Full time
Snowflake is about empowering enterprises to achieve their full potential - and people too. With a culture that's all in on impact, innovation, and collaboration, Snowflake is the sweet spot for building big, moving fast, and taking technology - and careers - to the next level. There is only one Data Cloud. Snowflake's founders started from scratch and...
-
Principal Software Engineer I
6 days ago
Menlo Park, CA, United States - Snowflake Computing - Full time
Snowflake is about empowering enterprises to achieve their full potential - and people too. With a culture that's all in on impact, innovation, and collaboration, Snowflake is the sweet spot for building big, moving fast, and taking technology - and careers - to the next level. There is only one Data Cloud. Snowflake's founders started from scratch and...
-
Principal Software Engineer I
7 days ago
Menlo Park, CA, United States - Snowflake Computing - Full time
Snowflake is about empowering enterprises to achieve their full potential - and people too. With a culture that's all in on impact, innovation, and collaboration, Snowflake is the sweet spot for building big, moving fast, and taking technology - and careers - to the next level. There is only one Data Cloud. Snowflake's founders started from scratch and...
-
Senior Manager/Principal Engineer
2 weeks ago
Menlo Park, CA, United States - Exponent - Full time
Senior Manager/Principal Engineer - Energy Transition (MS/PhD)
ID: 2025-1961
Location: US-MA-Natick
Practice: Thermal Sciences
Position Type: Full-time
Posted Salary Range: USD $176,000.00 - USD $260,000.00 /Yr.
About Exponent
Exponent is the only premium engineering and scientific consulting firm with the depth and breadth of expertise to solve our clients' most...
-
Software Engineer
2 weeks ago
Menlo Park, CA, United States - Reconstruct - Full time
About the job: Software Engineer - SaaS Platform
How often do you get the chance to make a global impact developing the latest AI inside of the "built world"? Reconstruct's Visual Command Center (VCC) uses AI and Machine Learning inside of computer vision to track the lifecycle of large capital assets like data centers, airports, hospitals, water treatment...