
Azure Databricks

2 months ago


Iselin, United States Diverse Lynx Full time

• Develop a deep understanding of the data sources, implement data standards, and maintain data quality and master data management.
• Expert in building Databricks notebooks to extract data from source systems such as DB2 and Teradata and to perform data cleansing, data wrangling, and ETL processing before loading into Azure SQL DB.
• Expert in building ephemeral notebooks in Databricks (wrapper, driver, and config) for processing data and back-feeding it to DB2 using a multiprocessing thread pool.
• Expert in developing JSON scripts for deploying pipelines in Azure Data Factory (ADF) that process the data.
• Expert in using Databricks with Azure Data Factory (ADF) to compute large volumes of data.
• Performed ETL operations in Azure Databricks by connecting to different relational database source systems using JDBC connectors (see the PySpark sketch after this list).
• Developed Python scripts to perform file validations in Databricks and automated the process using ADF.
• Analyzed SQL scripts and redesigned them using PySpark SQL for faster performance.
• Worked on reading and writing multiple data formats such as JSON, Parquet, and Delta from various sources using PySpark.
• Developed an automated process in Azure that ingests data daily from a web service and loads it into Azure SQL DB.
• Expert in optimizing PySpark jobs to run on different clusters for faster data processing.
• Developed Spark applications in Python (PySpark) in a distributed environment to load large numbers of CSV files with differing schemas into PySpark DataFrames and reload them into Azure SQL DB tables.
• Analyzed data where it lives by mounting Azure Data Lake and Blob storage to Databricks (see the mount example after this list).
• Used Logic Apps to take decision-based actions in the workflow and developed custom alerts using Azure Data Factory, SQL DB, and Logic Apps.
• Developed Databricks ETL pipelines using notebooks, Spark DataFrames, Spark SQL, and Python scripting.
• Developed Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
• Good knowledge of and exposure to the Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, driver node, worker nodes, stages, executors, and tasks.
• Involved in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory tuning.
• Expert in understanding the current production state of an application and determining the impact of a new implementation on existing business processes.
• Involved in migration of data from on-premises servers to cloud databases (Azure Synapse Analytics (DW) and Azure SQL DB).
• Hands-on experience setting up Azure infrastructure such as storage accounts, integration runtimes, service principal IDs, and app registrations to enable scalable and optimized support for business users' analytical requirements in Azure.
• Expert in ingesting streaming …

Digital : Databricks 10 & Above
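For illustration only, below is a minimal PySpark sketch of the JDBC extract, transform, and load pattern described in the bullets above. It assumes a Databricks notebook (where spark and dbutils are provided by the runtime) and the relevant JDBC drivers on the cluster; all hostnames, table names, and secret scope/key names are placeholders, not values from this posting.

from pyspark.sql import functions as F

# Extract: read a source table over JDBC (e.g. DB2); URL, table, and secrets are placeholders.
source_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:db2://source-host:50000/SRCDB")
    .option("dbtable", "SCHEMA.CUSTOMER_TXN")
    .option("user", dbutils.secrets.get("etl-scope", "db2-user"))
    .option("password", dbutils.secrets.get("etl-scope", "db2-password"))
    .load()
)

# Transform: basic cleansing and wrangling before loading downstream.
clean_df = (
    source_df
    .dropDuplicates(["TXN_ID"])
    .filter(F.col("AMOUNT").isNotNull())
    .withColumn("LOAD_DATE", F.current_date())
)

# Persist an intermediate copy in Delta format on a mounted data lake path.
clean_df.write.format("delta").mode("overwrite").save("/mnt/datalake/curated/customer_txn")

# Load: append the curated data to an Azure SQL DB table over JDBC.
(
    clean_df.write.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.customer_txn")
    .option("user", dbutils.secrets.get("etl-scope", "sql-user"))
    .option("password", dbutils.secrets.get("etl-scope", "sql-password"))
    .mode("append")
    .save()
)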
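Similarly, the data lake mounting mentioned above is typically done with a service principal created through an app registration. A hedged sketch, assuming an ADLS Gen2 account and placeholder tenant, storage account, and secret names:

# OAuth configuration for an app-registration service principal (all values are placeholders).
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": dbutils.secrets.get("etl-scope", "sp-client-id"),
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("etl-scope", "sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount the container once; notebooks can then read the data in place from /mnt/datalake.
dbutils.fs.mount(
    source="abfss://raw@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)

Once mounted, the Delta path used in the previous sketch (/mnt/datalake/...) resolves directly against the lake.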

Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.


We have other current jobs related to this field that you can find below

  • Data Engineer

    1 day ago


    Iselin, United States HAN IT Staffing Full time

    Role: Data Engineer Location: New Jersey (Hybrid 2 days onsite) Summary: We are seeking a highly skilled and experienced Data Engineer to join our team. The ideal candidate will have a strong background in data processing and transformation using Databricks, proficiency in SQL and Python, experience with Azure Data Factory, and a solid understanding...

  • Data Engineer

    3 weeks ago


    Iselin, United States Diverse Lynx Full time

JD: Role name: Engineer. Role Description: 1. Developing and designing solutions from detailed design specifications. 2. Playing an active role in defining standards for coding, system design, and architecture. 3. Revising, refactoring, updating, and debugging code. 4. Customer interaction. 5. Must have a strong technical background and hands-on coding experience in Azure Data Factory. Azure...


  • Iselin, United States Hexaware Technologies Full time

    Data Warehousing Specialist II (HEX461; multiple positions; full-time). Hexaware seeks Data Warehousing Specialists II to work in Iselin, NJ and various unanticipated locations throughout the US to design and implement scalable data solutions. Research and develop technical patterns in Data and Analytics practices in cloud native platforms and tools. Develop...



  • Iselin, United States Synechron Full time

    About the job. Summary: We are looking to hire a Sr. Snowflake Data Warehouse Specialist with a financial background who will play a crucial role in leveraging Snowflake, a cloud-based data warehousing platform, to manage and analyze financial data. Primary Responsibilities: Able to provide architecture, design, implementation and operationalization of large-scale...