Current jobs related to AI System Infrastructure and MLOps Engineering Manager - Redwood City, California - Promote Project


  • Redwood City, California, United States Snorkel AI Inc. Full time

    Lead the AI Platform Team at Snorkel AI Inc. We're on a mission to democratize AI by building the definitive AI data development platform. Our AI Platform team builds innovative software systems to power the Snorkel Flow platform, including services to train and serve generative AI and machine learning models using novel data-centric techniques, libraries to...


  • Redwood City, California, United States C3 AI Full time

    We are seeking a seasoned software engineer with expertise in machine learning and artificial intelligence to join our Generative AI team at C3 AI. As a member of our team, you will be responsible for developing the infrastructure and tools to improve the state-of-the-art and enable the use of Generative AI technology in our enterprise applications. You will...


  • Redwood City, California, United States Snorkel AI Full time

    Join Snorkel AI as a Principal Software Engineer. We're on a mission to make machine learning practical for everyone, and we're looking for a talented Principal Software Engineer to help us achieve this goal. As a key member of our engineering team, you'll work on designing and building customer-facing software systems for cloud-native applications, leveraging...


  • Redwood City, California, United States C3, Inc. Full time

    Job Summary: C3 AI is seeking an experienced professional to join our AI Solution Architecture team in a post-sales capacity. As an AI Solutions Architect, you will design, develop, and deploy custom and pre-built Enterprise AI applications using the C3 AI Platform. Key Responsibilities: Configure and implement full-stack AI solutions according to functional...


  • Redwood City, California, United States Snorkel AI Full time

    Join Our Team as a Principal Software Engineer. We're on a mission to make machine learning practical for everyone, and we're looking for a talented Principal Software Engineer to help us achieve this goal. As a key member of our team, you'll work across the stack to deliver major new features and infrastructure, improve our practices and culture, and align...


  • Redwood City, California, United States Snorkel AI Full time

    We're on a mission to democratize AI by building the definitive AI data development platform. The AI landscape has undergone significant change since Snorkel AI started as a research project in the Stanford AI Lab. However, one constant remains: the data used to build AI is the key to achieving differentiation, high performance, and production-ready systems. We...


  • Redwood City, California, United States C3 AI Full time

    About the Role: C3 AI is seeking a highly skilled AI Solutions Engineer to join our team. As a key member of our engineering team, you will be responsible for designing, developing, and deploying full-stack AI-driven web applications on the C3 AI Platform. Key Responsibilities: Develop and deploy full-stack, AI-based enterprise applications on the C3 AI...


  • Redwood City, California, United States C3 AI Full time

    Senior Software Engineer, Platform: C3 AI is seeking a highly skilled Senior Software Engineer to join the Platform Engineering department. As a key member of the team, you will design, develop, and maintain various features in a highly scalable and extensible AI/ML platform for large-scale applications. You will work on high-value technologies at the...


  • Redwood City, California, United States C3 AI Full time

    About C3 AI: C3 AI is a leading Enterprise AI software provider that accelerates digital transformation. Our proven C3 AI Platform offers comprehensive services to build enterprise-scale AI applications efficiently and cost-effectively. Job Title: Customer Success Manager. We are seeking a highly qualified Customer Success Manager to join our support team in...


  • Redwood City, California, United States C3 AI Full time

    About the Role: C3 AI is seeking a highly skilled Senior Software Engineer to join our rapidly growing Data org within the Platform Engineering department. As a key member of our team, you will design, develop, and maintain various features in a highly scalable and extensible AI/ML platform for large-scale applications, involving data science, distributed...


  • Redwood City, California, United States C3 AI Full time

    Job Description: C3 AI is seeking a highly skilled Lead QA Automation Engineer to join our team. As a key member of our Engineering, Product Development, QA, Operations, and Customer Services teams, you will be responsible for driving software quality automation for our existing and emerging products. Reporting to the Director of Quality Assurance Engineering,...


  • Redwood City, California, United States C3 AI Full time

    About the Role: C3 AI is seeking a highly skilled AI Solutions Architect to join our team. As an AI Solutions Architect, you will be responsible for collaborating with business and IT leaders at Fortune 500 customers to identify and scope use cases for the C3 AI offering. You will articulate the AI-driven resolution approach and lead high-performing C3 AI...


  • Redwood City, California, United States C3 AI Full time

    About the Role: C3 AI is seeking a Senior Product Designer to join our team and contribute to the design of innovative AI solutions. As a key member of our product design team, you will be responsible for leading the full spectrum of the design journey for new features and enhancements, from ideation to user research, creating wireframes, refining visual...


  • Redwood City, California, United States Snorkel AI Full time

    Product Development Manager: We're on a mission to democratize AI by building the definitive AI data development platform. Our team is passionate about empowering scientists, engineers, financial experts, product creators, and journalists to build custom AI with their data faster than ever before. As a Product Development Manager, you will work...

  • Product Designer

    2 weeks ago


    Redwood City, California, United States Snorkel AI Full time

    About the Role: We're on a mission to make machine learning practical for everyone, and we're looking for a talented Product Designer to join our team. As a Product Designer at Snorkel AI, you'll play a key role in delivering our objectives and creating intelligent, thoughtful, and user-centric experiences across all of our products and...


  • Redwood City, California, United States C3 AI Full time

    About the Role: C3 AI is seeking a highly skilled Customer Success Manager to join our support team in Redwood City, CA. As a key member of our customer success team, you will be responsible for overseeing and addressing our customers' technical needs, providing accurate technical service, and achieving set expectations to ensure customer satisfaction. Key...


  • Redwood City, California, United States Zilliz Full time

    About Zilliz: Zilliz is a fast-growing startup that develops innovative vector database technologies for enterprise-grade AI applications. We're committed to simplifying data management for AI and making vector databases accessible to every organization. As a Database Systems Engineer at Zilliz, you'll be responsible for developing distributed database systems...


  • Redwood City, California, United States C3 AI Full time

    C3 AI is seeking a talented Software Engineer to join our platform team. As a member of this team, you will be responsible for designing, developing, and maintaining the next generation C3 AI Platform. The ideal candidate will have a strong background in computer science, experience with distributed systems, and a passion for finding elegant solutions to...


  • Redwood City, California, United States C3 AI Full time

    C3 AI is a leading Enterprise AI software provider for accelerating digital transformation. Our proven C3 AI Platform offers comprehensive services to build enterprise-scale AI applications efficiently and cost-effectively. The platform supports the value chain in any industry with prebuilt, configurable, high-value AI applications for reliability, fraud...

  • Product Manager

    4 weeks ago


    Redwood City, California, United States Oracle Full time

    Job Description: We are seeking a highly influential technical Product Manager to define and drive the development of advanced product capabilities that leverage Generative AI, RAG, and intelligent agent workflows in the context of Oracle's Fusion Applications. Fusion Applications is a complete suite of SaaS offerings that include Human Capital Management...

AI System Infrastructure and MLOps Engineering Manager

2 months ago


Redwood City, California, United States Promote Project Full time
About the Role

We are seeking a highly skilled and experienced Manager of AI System Infrastructure and MLOps Engineering to join our team at Promote Project. As a key member of our AI/ML and Data Engineering team, you will be responsible for the stability and scalable operations of our leading-edge GPU Cloud Compute Cluster.

This role will involve guiding our AI Systems Infrastructure and MLOps efforts focused on our GPU Cloud Cluster operations, ensuring that our systems are highly utilized, performant, and stable. You will work in collaboration with other members of our AI Engineering team as well as the Science Initiative's AI Research team as they iterate and train their deep learning code, optimizing systems operations and helping to troubleshoot problems encountered by jobs running on the cluster.

Key Responsibilities
  • Build out the MLOps and Systems Infrastructure Engineering team to support large-scale capacity systems and AI training efforts.
  • Drive MLOps processes and System Infrastructure Engineering efforts to ensure our GPU Cloud computing systems are highly utilized and stable.
  • Own the on-call efforts for our GPU Cloud computing systems, building out the MLOps and Systems Infrastructure Engineering alerting and monitoring efforts for our leading-edge Kubernetes-based AI platform.
  • Take responsibility for AI/ML development infrastructure, instrumentation, and telemetry projects that empower our team in supporting our users across the AI/ML lifecycle.
  • Mentor and manage your team in fulfilling their roles to the best of their abilities, providing skill and career coaching to help team members grow along their own career and life paths.
Requirements
  • Hands-on AI/ML Model Training Platform Operations experience in an environment with demanding data and systems platform challenges.
  • MLOps experience working with medium to large-scale GPU clusters in Kubernetes, HPC environments, or large-scale Cloud-based ML deployments (Kubernetes preferred).
  • BS, MS, or PhD degree in Computer Science or a related technical discipline or equivalent experience.
  • 2+ years of experience managing MLOps teams.
  • 7+ years of relevant coding and systems experience.
  • 7+ years of systems Architecture and Design experience, with a broad range of experience across Data, AI/ML, Core Infrastructure, and Security Engineering.
  • Strong understanding of scaling containerized applications on Kubernetes or Mesos, including a solid understanding of AI/ML training with containers using secure AMIs and continuous deployment systems that integrate with Kubernetes or Mesos (Kubernetes preferred).
  • Proficiency with Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, and experience with On-Prem and Colocation Service hosting environments.
  • Solid coding ability with a systems language such as Rust, C/C++, C#, Go, Java, or Scala.
  • Extensive experience with a scripting language such as Python, PHP, or Ruby (Python preferred).
  • Working knowledge of NVIDIA CUDA and AI/ML custom libraries.
  • Knowledge of Linux systems optimization and administration.
  • Understanding of Data Engineering, Data Governance, Data Infrastructure, and AI/ML execution platforms.
  • PyTorch, Keras, or TensorFlow experience is a strong plus.
What We Offer

We offer a competitive salary range of $214,000 - $321,000, depending on location and experience. We also provide a comprehensive benefits package, including 100% match on employee 401(k) contributions, annual funding for employees, and relocation support for employees who need assistance moving to the Bay Area.

We believe that the strongest teams and best thinking are defined by the diversity of voices at the table. We are committed to fair treatment and equal access to opportunity for all CZI team members and to maintaining a workplace where everyone feels welcomed, respected, supported, and valued.