Lead ML Operations Engineer

4 days ago


San Francisco, United States ZipRecruiter Full time

Job Description

Company Overview: Welcome to the forefront of machine learning operations (MLOps)! Our company is dedicated to leveraging the power of machine learning to drive innovation and transform industries. We're committed to developing cutting-edge ML solutions that deliver real-world impact and value to our customers. Join us and lead our team in shaping the future of MLOps.

Position Overview: As the Lead ML Operations Engineer, you'll be responsible for leading our MLOps efforts and driving the design, implementation, and optimization of infrastructure and processes for deploying, monitoring, and managing machine learning models at scale. You'll lead a team of talented engineers and collaborate closely with data scientists, software engineers, and DevOps teams to streamline the machine learning lifecycle and ensure reliable and efficient model operations. If you're a seasoned engineer with a passion for machine learning and a track record of designing and implementing MLOps solutions, we want you on our team.

Key Responsibilities:

  1. Technical Leadership: Lead and mentor a team of ML Operations Engineers, providing guidance, direction, and support in driving MLOps innovation and execution.
  2. Infrastructure Design: Design and implement scalable and reliable infrastructure for deploying and serving machine learning models, leveraging cloud platforms and containerization technologies.
  3. Model Deployment: Develop automated pipelines for deploying machine learning models into production environments, ensuring consistency, reliability, and reproducibility.
  4. Monitoring and Alerting: Implement monitoring and alerting systems to track model performance, data drift, and other metrics, enabling proactive detection and mitigation of issues.
  5. Model Versioning and Management: Establish version control and management processes for machine learning models, enabling easy tracking, rollback, and experimentation.
  6. Continuous Integration/Continuous Deployment (CI/CD): Implement CI/CD pipelines for automating model training, testing, and deployment, reducing time to market and improving agility.
  7. Scalability and Efficiency: Optimize the performance and scalability of machine learning infrastructure, leveraging techniques such as distributed computing, parallelization, and resource management.
  8. Security and Compliance: Ensure machine learning systems comply with security and privacy standards, implementing access controls, encryption, and other security measures as needed.
  9. Documentation and Best Practices: Document MLOps processes, best practices, and standards, providing guidance and training to data scientists and engineers.
  10. Collaboration: Collaborate with cross-functional teams, including data scientists, software engineers, and DevOps teams, to streamline the machine learning lifecycle and drive continuous improvement.
  11. Research and Innovation: Stay informed about the latest advancements in MLOps tools and technologies, exploring innovative approaches and techniques to enhance machine learning operations.

Qualifications:

  • Bachelor's degree or higher in Computer Science, Engineering, Mathematics, or related field.
  • 7+ years of experience in software engineering, DevOps, or related roles, with a focus on building and maintaining infrastructure for machine learning operations.
  • Leadership experience, with a demonstrated ability to lead and mentor a team of engineers.
  • Strong understanding of machine learning concepts and techniques, with experience working with data science teams and machine learning models.
  • Proficiency in programming languages such as Python, Java, or Scala, and experience with cloud platforms such as AWS, Azure, or Google Cloud Platform.
  • Experience with containerization technologies such as Docker and orchestration tools such as Kubernetes.
  • Familiarity with machine learning frameworks and libraries such as TensorFlow, PyTorch, scikit-learn, or MLflow.
  • Experience with CI/CD pipelines, version control systems, and automation tools such as Jenkins, GitLab, or CircleCI.
  • Strong problem-solving skills and analytical thinking, with the ability to troubleshoot complex issues and optimize system performance.
  • Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams and communicate technical concepts to non-technical stakeholders.

Benefits:

  • Competitive salary: The industry standard salary for Lead ML Operations Engineers typically ranges from $150,000 to $250,000 per year, depending on experience and qualifications.
  • Comprehensive health, dental, and vision insurance plans.
  • Flexible work hours and remote work options.
  • Generous vacation and paid time off.
  • Professional development opportunities, including access to training programs, conferences, and workshops.
  • State-of-the-art technology environment with access to cutting-edge tools and resources.
  • Vibrant and inclusive company culture with opportunities for growth and advancement.
  • Exciting projects with real-world impact at the forefront of MLOps innovation.

Join Us: Ready to lead the charge in MLOps innovation? Apply now to join our team and shape the future of machine learning operations!

  • San Francisco, United States Unreal Gigs Full time

    Company Overview: Welcome to the forefront of machine learning operations (MLOps)! Our company is dedicated to leveraging the power of machine learning to drive innovation and transform industries. We're committed to developing cutting-edge ML solutions that deliver real-world impact and value to our customers. Join us and lead our team in shaping the future...


  • San Francisco, United States RemoteWorker CA Full time

    Company Overview: Welcome to the forefront of machine learning operations! At our company, we're driving the next wave of AI revolution through cutting-edge ML operations technologies. Our mission is to develop scalable and reliable ML systems that empower businesses and revolutionize industries. Join us and be part of a dynamic team committed to pushing the...

  • ML Engineer

    3 weeks ago


    San Francisco, United States LOG10 LLC Full time

    About Log10 Inc: Log10 is addressing the challenges around reliability and consistency of LLM-powered applications via a platform that provides AI-powered evaluations, fine-tuning and debugging tools. We are currently a team of 8 having previously worked in AI and infra roles at companies such as Intel, MosaicML, Adobe, Docker, PostEra, Starburst and Second...


  • San Francisco, United States Unreal Gigs Full time

    Company Overview: Welcome to the forefront of artificial intelligence and machine learning innovation! Our company is dedicated to leveraging the power of data science to drive transformative change and solve complex problems across industries. We're committed to developing cutting-edge AI and ML solutions that push the boundaries of what's possible. Join us...

  • AI/ML Tech Lead

    2 weeks ago


    San Francisco, United States Programmers.io Full time

    Programmers.io is currently looking for an AI/ML Tech Lead. Hybrid role in San Francisco, California (1-2 Days Onsite/Week). Contract to Hire. Responsibilities: Design, develop, and maintain an AI/ML platform that is accurate, secure, and fast. Lead workstreams within a lean and agile environment from requirements gathering to creating a plan of actionable tasks for...

  • Senior ML Engineer

    1 week ago


    San Francisco, United States Kodif Full time

    A Silicon Valley-based startup - KODIF - is seeking a talented and experienced Lead Machine Learning Engineer to join our team. As a key member of our engineering leadership, you'll play a crucial role in shaping our AI-driven customer experience solutions. About us: Kodif is a B2B seed-stage, venture-backed startup. We're empowering CX (customer experience)...


  • San Francisco, United States Strava Full time

    About This Role: Strava is the leading digital community for active people with more than 125 million athletes, in more than 190 countries. The platform offers a holistic view of your active lifestyle, no matter where you live, which sport you love and/or what device you use. Everyone belongs on Strava when they are pursuing an active life. As the Director of...


  • San Francisco, United States Relyance AI Full time

    As Relyance AI's Senior Software Engineer, ML, you will strategize, drive, and execute on the initiatives in NLP for information extraction from legal documents, ML/NLP for information extraction from code and general ML in code analysis, as well as overall AI backend initiatives. You will partner with cross-functional stakeholders to design and build...


  • San Francisco, California, United States Unity Technologies Full time

    About the Role: We're seeking a skilled Senior Data and ML Infrastructure Engineer to join our team at Unity. As a key member of our Data & ML Platform team, you will design and optimize large-scale data platforms and machine learning infrastructure systems for efficiency, reliability, and cost-effectiveness. Key Responsibilities: Design and optimize large-scale...


  • San Francisco, United States Delphina Full time

    About Delphina: Today’s Data Scientists are in pain - spending their time manually wrangling data, building models through slow trial and error, taking on painstaking rewrites for deployment, and dealing with countless other frustrating bottlenecks. And the tools they are using for much of this work – e.g. Jupyter notebooks and Pandas – are over a...


  • San Francisco, United States Tbwa ChiatDay Inc Full time

    Navier AI is building engineering software for physics-driven design. We are starting by making Computational Fluid Dynamics simulations that are 1000x faster than current solutions. We're leveraging cutting-edge physics machine learning to accelerate traditionally compute-intensive simulations. Why? Because Engineers deserve better tools to drive...


  • San Antonio, United States Stellar IT Solutions LLC Full time

    Role: ML Engineer with Azure. Location: San Antonio, TX, 4 days per week. Contract Length: 12 months with possible extension. Job Description: Position Summary: We are seeking a highly skilled Azure Cloud Engineer with a strong background in Azure Cloud Infrastructure, Microsoft Intelligent Data Platform (ADLS, Azure Synapse), and Azure Machine Learning...

  • AI / ML Engineer

    4 weeks ago


    San Francisco, United States Seven Seven Software Full time

    AI / ML (Artificial Intelligence, Machine Learning) Engineer. 1. Experience in engineering and deploying Generative AI models, specifically focusing on Retrieval-Augmented Generation (RAG) systems and multi-agent workflows. 2. Strong software engineering foundation in developing and implementing state-of-the-art generative techniques and designing advanced...


  • San Francisco, United States Dealpath Full time

    Job Description: Dealpath is looking for an experienced Principal AI/ML Engineer to join our growing team, delivering best-in-class solutions for the Commercial Real Estate industry. This is an opportunity to join a team of innovators and play a critical role by utilizing your NLP expertise to define, explore, build and deliver state-of-the-art...

  • Staff ML Engineer

    2 weeks ago


    San Francisco, United States PennyJar Capital Full time

    Assured is on a mission to modernize insurance. Claims processing (i.e. should we pay this claim?), while often overlooked, is the foundation of the entire industry. It’s currently highly manual, involving phone calls, faxes, and gut instinct, costing tens of billions of dollars a year. We can do better. At Assured, we provide large insurers the software...


  • San Francisco, United States ats.rippling.com- ATS Full time

    Bronco is an applied AI lab helping chipmakers keep Moore’s law going. Our mission is to build AI silicon engineers that can automate chip design and verification from initial spec to final tape-out. We are starting with the first AI Design Verification Engineer to help close the verification gap, which is where the bottleneck currently is and where we...


  • San Francisco, United States Abridge AI Inc. Full time

    Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversations, improving clinical documentation efficiencies while enabling clinicians to focus on what matters most: their patients. Our enterprise-grade technology transforms patient-clinician conversations into...