Machine Learning Engineer/Technical Lead

2 weeks ago


Sunnyvale, United States FedML, Inc. Full time

Responsibilities

  • Participate in the development of the MLOps/AIOps machine learning platform and open source communities
  • Take responsibility for foundational research and product development, and continuously improve R&D efficiency
  • Own feature development and algorithm optimization for the platform, improving user experience and usability through cutting-edge or mature technologies
  • Participate in or lead design reviews with peers and stakeholders to decide amongst available technologies
  • Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency)
  • Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback

Minimum Qualifications

  • Bachelor’s degree or equivalent practical experience in computer science or related areas.
  • 2 years of experience with software development in one or more programming languages (Python, Java, JavaScript, C/C++), or 1 year of experience with an advanced degree
  • 2 years of experience with data structures or algorithms in either an academic or industry setting
  • Strong written and verbal communication skills in English

Preferred Qualifications

  • Proficient in Python and familiar with typical deep learning frameworks (TensorFlow/PyTorch) and models such as CNN, Transformer, GBDT, LR, etc.
  • Experience in developing MLOps features, including workflow orchestration, model training, model serving, monitoring/observability, versioning of data, code, model, data pipeline, logging, etc.
  • Familiar with communication backends (MPI, NCCL, RPC, MQTT), GPU CUDA, and other core modules of deep learning frameworks; candidates who have contributed to specific modules of well-known deep learning frameworks are preferred
  • Experience with federated learning and with distributed training of large-scale models is preferred
  • Ability to combine the platform and the open source library to improve end-to-end deep learning training efficiency through task scheduling, elastic disaster recovery, performance optimization, and other measures, involving K8S/KubeFlow, network optimization, and distributed training

About the Job

FedML, Inc. (https://fedml.ai) empowers our clients to build and scale any machine learning or artificial intelligence model anywhere. That includes the latest foundation models as well as more traditional ML models. Our products cover both training and serving, delivered through a low-code UI on our MLOps and LLMOps platform. We also offer a federated machine learning solution for cross-silo training in data-privacy-sensitive applications.

Our earliest products power federated machine learning missions for clients in several industries, where data privacy, low-latency serving, and low cost of data storage are important to the client. Our easy-to-use FedML MLOps solution enables data science and machine learning engineering teams to work seamlessly together to deploy and manage their models on production machines. Our federated learning and serving solutions support siloed edge devices, smartphones, and IoT.

Our next generation of solutions includes geo-distributed machine learning and serving that continues our tradition of delivering easy-to-use, simple, low-cost, and enterprise-grade MLOps solutions. Our MLOps and evolving LLMOps platform will always empower experimentation, observability, evaluation, governance, and collaboration for our clients’ AI and ML training and serving needs, as well as other general computing needs.

FedML supports vertical solutions across a broad range of industries (healthcare, finance, insurance, automotive, advertising, smart cities, IoT, etc.) and applications (computer vision, natural language processing, data mining, and time-series forecasting). Its core technology is backed by more than 3 years of cutting-edge research by its co-founders, who are recognized leaders in the federated machine learning community.

FedML's researchers, software engineers, and product teams are busy developing the next-generation FedML platform for machine learning and artificial intelligence. We're looking to grow our team with skilled professionals who bring fresh ideas from all areas, including machine learning and its applications, computer vision, natural language processing, large-scale system design, distributed and cloud computing systems, MLOps, security/privacy, mobile/IoT systems, and networking. We’re an early-stage startup, so you will work on projects that are critical to our customers' needs and our business. If you love to learn and love to convert ideas into real, scalable machine learning infrastructure products and applications, FedML may be a great place for you.

Location

Our HQ is in Sunnyvale, California. We prefer someone local who can be at our office regularly. Hybrid is OK.

How to apply

If you are interested, please apply via the link.


