Lead Data Engineer
4 days ago
Join our dynamic team as a Lead Data Engineer where you will play a pivotal role in designing and optimizing robust, scalable data pipelines that empower our global data and analytics efforts. If you are passionate about leveraging cutting-edge technologies like Apache Spark, Airflow, and Python to create high-performance data processing systems, we want to hear from you.
In this exciting opportunity, you'll ensure data quality, reliability, and lineage across Mastercard's expansive data ecosystem. Your expertise will directly contribute to impactful, data-driven solutions at an enterprise scale.
Key Responsibilities:
- Design and optimize Spark-based ETL pipelines for large-scale data processing.
- Develop and manage Airflow Directed Acyclic Graphs (DAGs) for efficient scheduling, orchestration, and checkpointing.
- Employ partitioning and shuffling strategies to enhance Spark performance.
- Guarantee data lineage, quality, and traceability across various systems.
- Create Python scripts for data transformation, aggregation, and validation tasks.
- Execute and fine-tune Spark jobs utilizing spark-submit for optimal performance.
- Utilize DataFrames for complex joins and aggregations to derive analytical insights.
- Automate multi-step data processes through shell scripting and variable management techniques.
- Work collaboratively with data, DevOps, and analytics teams to deliver scalable solutions.
Qualifications:
- Bachelor's degree in Computer Science, Data Engineering, or a related field, or equivalent practical experience.
- Minimum of 7 years of experience in data engineering or big data development.
- Strong expertise in Apache Spark architecture, optimization, and job configuration.
- Proficient in creating and managing Airflow DAGs, including authoring, scheduling, checkpointing, and monitoring.
- Skilled in data shuffling, partitioning strategies, and performance tuning within distributed systems.
- Expertise in Python programming, focusing on data structures and algorithmic problem-solving.
- Hands-on experience with Spark DataFrames and performing PySpark transformations including joins, aggregations, and filters.
- Proficient in shell scripting, particularly in managing and passing variables between scripts.
- Experienced with spark-submit for deployment and performance tuning.
- Solid understanding of ETL design principles, workflow automation, and distributed data systems.
- Excellent debugging and problem-solving skills in large-scale environments.
- Experience with AWS Glue, EMR, Databricks, or similar Spark platforms.
- Knowledge of data lineage and data quality frameworks such as Apache Atlas.
- Familiarity with CI/CD pipelines, Docker/Kubernetes, and data governance tools.
-
Founding Data Engineer
7 days ago
Sonoma, CA, United States Strativ Group Full time
Our client is an elite applied AI research and product lab focused on building AI-native systems for finance and deploying cutting-edge models into real production environments. Their work navigates the critical interface of data, research, and high-stakes financial decision-making. As a Founding Data Engineer, you will take charge of the data platform that...
-
Senior Data Engineer
7 days ago
Sonoma, CA, United States Sigmaways Full time
If you're hands-on with modern data platforms, cloud tech, and big data tools, and you like building solutions that are secure, repeatable, and fast, this role is for you. As a Senior Data Engineer, you will design, build, and maintain scalable data pipelines that transform raw information into actionable insights. The ideal candidate will have strong...
-
Lead Engineer for Spacecraft Avionics
4 days ago
Sonoma, CA, United States EVONA Full time
Lead Engineer for Spacecraft Avionics. Join a groundbreaking space company that is redefining orbital mobility solutions as a Spacecraft Avionics and Software Lead. This unique opportunity invites you to architect, build, and validate a comprehensive flight system from the ground up, working on advanced propulsion-enabled platforms designed for responsive...
-
Lead Machine Learning Engineer
7 days ago
Sonoma, CA, United States Strativ Group Full time
Lead Machine Learning Engineer - Agentic Systems. Join a cutting-edge AI research company dedicated to developing advanced AI systems that can navigate complex, real-world engineering challenges. As a well-funded organization executing large-scale projects, we're at the forefront of creating AI systems that reason, plan, and act reliably in physical...
-
Lead Machine Learning Engineer
7 days ago
Sonoma, CA, United States Attis Full time
Lead Machine Learning Engineer - Generative AI for the Physical World. Overview: A rare opportunity has emerged for a visionary Lead Machine Learning Engineer to build the core intelligence for a stealth-mode, well-funded AI company. This foundational leadership role is for someone passionate about teaching machines to understand and engineer the physical world,...
-
AWS Data Architect
7 days ago
Sonoma, CA, United States Fractal, Inc. Full time
Fractal is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets; an ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get...
-
Staff Data Scientist
5 days ago
Sonoma, CA, United States Harnham Full time
Staff Data Scientist - Sales Analytics. Location: San Francisco (Hybrid). Salary: $200-250k base + RSUs. This fast-growing Series E AI SaaS company is redefining how modern engineering teams build and deploy applications. We're looking for a Staff Data Scientist to drive Sales and Go-to-Market (GTM) analytics, applying advanced modeling and experimentation to...
-
Sonoma, CA, United States Harrison Clarke Full time
A rapidly evolving and highly technical AI company is seeking a Lead Research Engineer to join an innovative and high-performing team focused on developing the next generation of agentic LLM systems specifically for software engineering. This is your chance to be at the cutting edge of AI, contributing to the design and evaluation of models that comprehend,...