Senior PySpark Engineer

4 weeks ago


Plano, United States Ascentt Full time

Job Summary: We are seeking an experienced Senior PySpark Engineer to join our team of data professionals. In this role, you will collaborate closely with Data Scientists to prepare and transform large-scale datasets using PySpark, the Python API for Apache Spark. You will play a crucial role in enabling effective data analysis and modeling by ensuring the availability of high-quality, feature-rich datasets.


Responsibilities

1. Data Preparation and Transformation:

Leverage PySpark to efficiently process and transform large-scale datasets from various sources.

Develop robust and scalable PySpark code to handle data cleaning, munging, and feature engineering tasks.

Collaborate with Data Scientists to understand their data requirements and translate them into efficient PySpark workflows.

2. Feature Engineering:

Work closely with Data Scientists to identify and implement relevant feature engineering techniques.

Employ advanced feature engineering methods, such as one-hot encoding, scaling, binning, and feature creation/selection, to enhance the predictive power of machine learning models.

Stay up to date with the latest feature engineering techniques and industry best practices.

3. Distributed Computing:

Leverage Apache Spark's distributed computing capabilities to process and analyze large-scale datasets efficiently.

Optimize PySpark code for performance, scalability, and fault tolerance.

Implement and maintain data pipelines using PySpark to automate data preparation and transformation processes.

4. Code Quality and Documentation:

Write clean, maintainable, and well-documented PySpark code following best practices and coding standards.

Collaborate with team members through code reviews and knowledge-sharing sessions.

Contribute to the development and maintenance of PySpark-related documentation and best practices within the organization.

5. Continuous Learning and Improvement:

Stay current with the latest developments in PySpark, Apache Spark, and related big data technologies.

Actively participate in professional development opportunities, such as attending conferences, workshops, or online training.

Identify areas for improvement in existing data preparation and transformation processes and propose solutions.

Qualifications

Bachelor's or Master's degree in Computer Science, Data Science, or a related field.

Proven experience as a Senior PySpark Engineer or in a similar role, with at least 5 years of experience working with PySpark and Apache Spark.

Strong proficiency in Python programming and experience with PySpark APIs and libraries.

Solid understanding of distributed computing principles and experience working with large-scale datasets.

Familiarity with feature engineering techniques and their application in machine learning pipelines.

Experience in developing and maintaining data pipelines and workflows using PySpark.

Excellent problem-solving, analytical, and critical thinking skills.

Strong communication and collaboration skills to work effectively with Data Scientists and cross-functional teams.

Passion for staying up to date with the latest developments in big data and data engineering.


