Big Data Developer

3 weeks ago


Houston, United States Diverse Lynx Full time
Title: Big Data Developer
Location: Houston, TX (Onsite)
Type: Contract (C2C or W2)

Job Description:

As a Big Data Developer within our technology team, you'll be instrumental in building, maintaining, and optimizing big data platforms and solutions. Leveraging the Cloudera suite, you will craft robust, scalable data processing pipelines and contribute to the development of our data-centric applications using cutting-edge technologies.

Key Responsibilities:
  • Design and develop big data applications using Cloudera suite components such as Kafka, HDFS, HBase, Kudu, ZooKeeper, Hive, and Impala.
  • Write robust, efficient, and maintainable code in Java/Scala/Python, with a strong emphasis on Spark and Flink for real-time data processing.
  • Develop and maintain data pipelines using Apache NiFi, ensuring seamless data collection, ingestion, and distribution.
  • Use the Spring Boot and Flask frameworks to create microservices that interact with big data systems.
  • Manage data workflows with Apache Oozie and orchestrate complex data processes with Apache Airflow.
  • Implement solutions for data security and governance using Atlas, Ranger, Ranger KMS, and KTS within Cloudera ecosystems.
  • Work on cloud-native Big Data technologies, integrating solutions with cloud services for enhanced scalability and performance.
  • Script and automate routine tasks within Linux environments to enhance development and deployment processes.
  • Optimize data retrieval with advanced HQL queries and tune performance for Hive and Impala databases.
  • Employ Kubernetes container orchestration for deployment, scaling, and operations of application containers across clusters of hosts.
  • Ensure the development of high-quality applications by writing test cases and maintaining a continuous integration and deployment pipeline.
  • Contribute to architectural decisions and create documentation outlining design and technical specifications.
  • Take a proactive approach to troubleshooting and resolving issues in the production environment.
  • Engage with cross-functional teams to translate business requirements into technical implementations.
  • Stay current with industry trends and evaluate new technologies for adoption into existing or new data infrastructure components.
  • Run data engineering jobs on GPUs using Spark, and tune jobs to use distributed GPUs.
  • Design and implement security measures in the application development process, leveraging Kerberos authentication to secure communication within the cluster. Work closely with data administrators to ensure that all applications comply with established security protocols and access control measures. Develop scripts and automation tools to streamline the security aspects of the big data applications lifecycle.
  • Implement and maintain encryption standards for securing data in transit using TLS to prevent eavesdropping, tampering, and forgery. Architect solutions that encrypt data at rest with industry-standard encryption algorithms to safeguard sensitive information, and manage encryption keys with a focus on maintaining performance while enforcing data security. Regularly review and audit the codebase for compliance with data protection regulations and organizational policies.

Technical Qualifications:
  • Advanced programming skills in Java/Scala/Python with a focus on Spark and Flink for large-scale data processing.
  • Strong understanding of the Cloudera suite, including in-depth knowledge of data management and processing services.
  • Proficient in building applications with Spring Boot and Flask, with experience in creating RESTful services.
  • Experience with scripting and automation in a Linux environment, along with expertise in shell scripting.
  • In-depth knowledge of SQL, HQL, and the ability to perform query optimization on big data sets.
  • Proficient in Kubernetes, including deploying applications in OpenShift or other Kubernetes environments.
  • Familiarity with Neo4j or similar graph database technologies, and the ability to integrate them into big data solutions.
  • Experience with Cloudera Data Services such as Cloudera Data Engineering, Cloudera Data Warehouse, and Cloudera Machine Learning is highly desirable.
  • Experience with RAPIDS and GPU-aware scheduling.
  • Knowledge of data modeling, data access, and data storage techniques for big data environments.
  • Proficient in integrating Kerberos authentication within big data applications for secure access and communication with big data services. Capable of scripting and automation to manage Kerberos ticket lifecycles, renewals, and troubleshooting common Kerberos issues in a development environment.
  • Skilled in implementing and managing security protocols for data in transit, including setting up and configuring TLS for secure data transfer within big data solutions. Knowledgeable in encryption standards and tools for encrypting data at rest, with an understanding of key management systems and cryptographic practices to ensure data privacy and regulatory compliance.

Education:
  • Bachelor's degree in Computer Science, Information Technology, or a related field.

Certifications:
  • Cloudera Certified Professional (CCP) or any relevant Big Data certification is preferred.

Soft Skills:
  • Strong analytical and problem-solving skills.
  • Excellent verbal and written communication abilities.
  • Collaborative team player with an ability to work in dynamic environments.
  • Self-motivated with a keen interest in technology and continuous learning.


Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.