Senior Data Engineer

4 weeks ago


Washington, United States Sparibis Full time
Location: 100% Remote

Years' Experience: 10+ years

Education: Bachelor's in IT related field

Work Authorization: Must show that applicant is legally permitted to work in the United States.

Clearance: Applicants must be able to meet the requirements to obtain a Public Trust security clearance. NOTE: United States Citizenship is required to be eligible to obtain this security clearance.

Key Skills:
  • 10+ years of IT experience focusing on enterprise data architecture and management
  • Experience with Databricks required
  • 8+ years of experience in Conceptual/Logical/Physical Data Modeling & expertise in Relational and Dimensional Data Modeling
  • Experience with Great Expectations or other data quality validation frameworks
  • Experience with ETL and ELT tools such as SSIS, Pentaho, and/or Data Migration Services
  • Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization)
  • Experience with AWS environment, CI/CD pipelines, and Python (Python 3)
Responsibilities
  • Plan, create, and maintain data architectures, ensuring alignment with business requirements
  • Obtain data, formulate dataset processes, and store optimized data
  • Identify problems and inefficiencies and apply solutions
  • Determine tasks where manual participation can be eliminated with automation.
  • Identify and optimize data bottlenecks, leveraging automation where possible
  • Create and manage data lifecycle policies (retention, backups/restore, etc.)
  • In-depth knowledge of creating, maintaining, and managing ETL/ELT pipelines
  • Create, maintain, and manage data transformations
  • Maintain/update documentation
  • Create, maintain, and manage data pipeline schedules
  • Monitor data pipelines
  • Create, maintain, and manage data quality gates (Great Expectations) to ensure high data quality
  • Support AI/ML teams with optimizing feature engineering code
  • Expertise in Spark/Python/Databricks, Data Lake and SQL
  • Create, maintain, and manage Spark Structured Streaming jobs, including using the newer Delta Live Tables and/or DBT
  • Research existing data in the data lake to determine best sources for data
  • Create, manage, and maintain ksqlDB and Kafka Streams queries/code
  • Data driven testing for data quality
  • Maintain and update Python-based data processing scripts executed on AWS Lambdas
  • Write unit tests for all Spark, Python data processing, and Lambda code
  • Maintain PCIS Reporting Database data lake with optimizations and maintenance (performance tuning, etc.)
  • Streamline data processing, including formalizing concepts of how to handle late data, defining windows, and how window definitions impact data freshness
Qualifications
  • 10+ years of IT experience focusing on enterprise data architecture and management
  • Experience in Conceptual/Logical/Physical Data Modeling & expertise in Relational and Dimensional Data Modeling
  • Experience with Databricks, Structured Streaming, Delta Lake concepts, and Delta Live Tables required
    • Additional experience with Spark, Spark SQL, Spark DataFrames and DataSets, and PySpark
    • Data Lake concepts such as time travel and schema evolution and optimization
    • Structured Streaming and Delta Live Tables with Databricks a bonus
  • Experience leading and architecting enterprise-wide initiatives, specifically system integration, data migration, transformation, data warehouse build, data mart build, and data lake implementation/support
    • Advanced level understanding of streaming data pipelines and how they differ from batch systems
    • Formalize concepts of how to handle late data, defining windows, and data freshness
    • Advanced understanding of ETL and ELT and ETL/ELT tools such as SSIS, Pentaho, Data Migration Service, etc.
    • Understanding of concepts and implementation strategies for different incremental data loads such as tumbling window, sliding window, high watermark, etc.
    • Familiarity and/or expertise with Great Expectations or other data quality/data validation frameworks a bonus
    • Understanding of streaming data pipelines and batch systems
    • Familiarity with concepts such as late data, defining windows, and how window definitions impact data freshness
  • Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization)
    • Indexing and partitioning strategy experience
  • Debug, troubleshoot, design and implement solutions to complex technical issues
  • Experience with large-scale, high-performance enterprise big data application deployment and solution
  • Understanding how to create DAGs to define workflows
  • Familiarity with CI/CD pipelines, containerization, and pipeline orchestration tools such as Airflow, Prefect, etc., a bonus but not required
  • Architecture experience in AWS environment a bonus
    • Familiarity working with Kinesis and/or Lambda a bonus, specifically how to push and pull data, how to use AWS tools to view data in Kinesis streams, and how to process massive data at scale
    • Experience with Docker, Jenkins, and CloudWatch
    • Ability to write and maintain Jenkinsfiles for supporting CI/CD pipelines
    • Experience working with AWS Lambdas for configuration and optimization
    • Experience working with DynamoDB to query and write data
    • Experience with S3
  • Knowledge of Python (Python 3 desired) for CI/CD pipelines a bonus
    • Familiarity with Pytest and Unittest a bonus
  • Experience working with JSON and defining JSON Schemas a bonus
  • Experience setting up and managing Confluent/Kafka topics and ensuring performance using Kafka a bonus
    • Familiarity with Schema Registry, message formats such as Avro, ORC, etc.
    • Understanding how to manage ksqlDB SQL files and migrations and Kafka Streams
  • Ability to thrive in a team-based environment
  • Experience briefing the benefits and constraints of technology solutions to technology partners, stakeholders, team members, and senior management


About Sparibis

Sparibis LLC is a professional solutions firm that clients rely on to access the best talent to drive their business success.

Sparibis is an equal opportunity employer that values diversity at all levels. All individuals, regardless of personal characteristics, are encouraged to apply.