Data Engineer

2 weeks ago


Austin, United States Tech Mahindra Full time

Data Engineer (Day 1 onsite)

Austin, TX

Full time



Must-have skills


Python

Pyspark

SQL

Data Engineering

Big Data


Job Description


We're seeking a Data Engineer to take the lead in implementing and scaling data collection, storage, processing, and filtering for fine-tuning large language models (LLMs) within Conversational Engineering. These data pipelines are crucial for powering our cutting-edge research, safety systems, and product development. If you're passionate about working with data and are eager to create solutions that directly impact the advancement of LLMs, we'd love to hear from you. This role provides the exciting opportunity to collaborate closely with applied ML engineers, software engineers, and data scientists who create our AI systems today.

In this role, you will:


• Design, build, and manage scalable data pipelines for collecting, storing, processing, and filtering large volumes of text data for fine-tuning LLMs.

• Develop and optimize data storage architectures to handle the massive scale of data required for training state-of-the-art language models.

• Implement efficient data preprocessing, cleaning, and feature extraction techniques to ensure high-quality data for model training.

• Collaborate with machine learning engineers and researchers to understand their data requirements and provide tailored solutions for LLM fine-tuning.

• Design and implement robust and fault-tolerant systems for data ingestion, processing, and delivery.

• Optimize data pipelines for performance, scalability, and cost-efficiency, leveraging distributed computing frameworks and cloud platforms.

• Ensure the security, privacy, and compliance of data according to industry best practices and regulatory requirements.

You might thrive in this role if you:

• Have 7+ years of experience as a data engineer, with a strong background in designing and building large-scale data pipelines.

• Possess deep expertise in distributed computing frameworks such as Apache Spark, Hadoop, or Flink, and have hands-on experience optimizing data processing at scale.

• Are proficient in programming languages commonly used in data engineering, such as Python, and have a solid understanding of data structures and algorithms.

• Have extensive experience with cloud platforms like AWS, Google Cloud, or Azure for data storage, processing, and management.

• Are well-versed in various data storage technologies, including distributed file systems (e.g., HDFS, S3), databases (e.g., Cassandra, HBase), and data warehouses (e.g., Redshift, BigQuery).

• Have hands-on experience with ETL orchestration tools such as Apache Airflow, Dagster, or Prefect for managing complex data workflows.

• Possess knowledge of natural language processing (NLP) techniques and have worked with text data preprocessing, normalization, and feature extraction.

• Are passionate about staying up-to-date with the latest advancements in data engineering and NLP, and are eager to apply innovative techniques to solve challenging problems.

• Have strong problem-solving skills, are detail-oriented, and can effectively communicate technical concepts to both technical and non-technical stakeholders.



Tech Mahindra is an Equal Employment Opportunity employer. We promote and support a diverse workforce at all levels of the company. All qualified applicants will receive consideration for employment without regard to race, religion, color, sex, age, national origin or disability. All applicants will be evaluated solely on the basis of their ability, competence, and performance of the essential functions of their positions with or without reasonable accommodations. Reasonable accommodations also are available in the hiring process for applicants with disabilities. Candidates can request a reasonable accommodation by contacting the company ADA Coordinator at ADA_Accomodations@TechMahindra.com.


