AIML - Senior Data Engineer, Machine Learning Platform Technologies

1 week ago


Seattle, WA, United States Apple Full time

Weekly Hours: 40

Role Number: 200627055-3337

Summary

Join us in building the machine learning platform that enables teams at Apple to build Apple Intelligence and many other intelligent experiences across hardware, software and service products.

As a Machine Learning Data Platform Engineer, you'll design and build the scalable dataset management platform that enables teams across Apple to discover, curate, version, share, process, and consume ML datasets with enterprise-grade compliance and governance.

We're looking for an engineer with deep expertise in big data infrastructure and a passion for building platforms that make ML practitioners more productive. You'll work at the intersection of large-scale data systems, ML workflows, and data governance.

Description

In this role, you'll be architecting and building Apple's next-generation ML dataset management platform. This platform enables ML teams across the company to efficiently manage the full lifecycle of datasets, from initial curation and annotation through versioning, model training and evaluation, sharing, and compliance.

You'll design scalable infrastructure that supports dataset operations at massive scale while maintaining strong governance guarantees. Your work will include building data lineage tracking systems, implementing automated compliance workflows, creating intuitive APIs and SDKs for dataset access, and ensuring seamless integration with ML training and evaluation pipelines.

You'll collaborate with teams building customer-facing ML features across iOS, macOS, and other Apple platforms, as well as compute infrastructure teams and ML framework owners. Your platform work directly enables the ML innovations that millions of customers experience daily. This role offers the opportunity to have broad impact across Apple's ML initiatives and to shape how thousands of ML practitioners build the intelligent experiences our customers love.

Minimum Qualifications

  • Bachelor's degree in Computer Science or a related field, or equivalent practical experience

  • 10+ years building and scaling data infrastructure for petabyte-scale ML workloads with high reliability

  • Deep expertise in modern data technologies (Apache Iceberg, Spark, S3, distributed systems), data modeling, schema evolution, and efficient storage formats (Parquet, Arrow, ORC)

  • Experience building data pipelines that handle diverse ML data types: structured/tabular data, unstructured media (images, video, audio), embeddings, and multimodal datasets

  • Proven track record building dataset management systems including versioning, metadata management, discovery, and integration with production ML training pipelines

  • Experience designing data governance frameworks including lineage tracking, access control, retention policies, and compliance workflows

  • Experience with cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes)

  • Strong cross-functional collaboration skills to understand diverse stakeholder needs and articulate technical decisions across ML engineering, data science, legal, and product teams

Preferred Qualifications

  • Hands-on experience curating or managing datasets for production ML models

  • Experience with data cataloging systems, metadata platforms, MLOps tools, or ML training frameworks

  • Knowledge of privacy-preserving technologies and data quality/validation frameworks

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).



  • Seattle, WA, United States GSK LLP Full time

    The Onyx Research Data Tech organization is GSK's Research data ecosystem which has the capability to bring together, analyze, and power the exploration of data at scale. We partner with scientists across GSK to define and understand their challenges and develop tailored solutions that meet their needs. The goal is to ensure scientists have the right data...


  • Seattle, WA, United States GlaxoSmithKline Full time

    Site Name: Seattle Sixth Ave, Cambridge 300 Technology Square Posted Date: Dec 10 2025 The Onyx Research Data Tech organization is GSK's Research data ecosystem which has the capability to bring together, analyze, and power the exploration of data at scale. We partner with scientists across GSK to define and understand their challenges and develop tailored...


  • Seattle, WA, United States Apple Full time

    Weekly Hours: 40 Role Number: 200629484-3337 Summary Join us in building the ML platform that enables teams at Apple to build Apple Intelligence and many other intelligent experiences across hardware, software and service products! We are looking for a front-end facing full stack software engineer who’s passionate about ML tooling. The ideal candidate has...


  • Seattle, WA, United States Apple Full time

    Role Number: 200619905-3337 Summary Shape the future of search at Apple. The Search Information Intelligence organization is at the forefront of innovation in algorithmic search, machine learning, natural language processing, and artificial intelligence—transforming how hundreds of millions of users discover and interact with information. Our mission: to...


  • Seattle, WA, United States Apple Full time

    Role Number: 200631663-3337 Summary As part of Apple's AI and Machine Learning org, we encourage and create groundbreaking technology for large-scale ML systems, computer vision, natural language processing, and multi-modal understanding and generation. The Data and Machine Learning Innovation (DMLI) team is looking for a passionate Machine Learning Engineer...


  • Seattle, WA, United States Apple Full time

    Weekly Hours: 40 Role Number: 200629489-3337 Summary We’re building the foundation for intelligent, adaptive AI systems from multi-agent platforms and RAG pipelines to advanced evaluation and reasoning frameworks. We’re looking for a Senior Applied ML Engineer to design, build, and scale machine learning systems that power next-generation AI...


  • Seattle, WA, United States Apple Full time

    Role Number: 200643329-3337 Summary We are looking for a Senior Data Scientist to define how we measure the success and reliability of Apple Intelligence and Siri software. This role connects feature delivery, on-device telemetry, and data insights to help teams understand real user experience and make better engineering decisions using trusted metrics. You...


  • Seattle, WA, United States Highspot Full time

    About Highspot Highspot is pioneering the category that is fundamentally changing the way companies increase sales productivity. On a mission to transform the way millions of people work with sales enablement, Highspot is committed to building breakthrough software with a spark of magic. We believe a great place to work is about more than the work - it's...