Tech Mahindra
Data Engineer
2 weeks ago
Data Engineer (Day 1 onsite)
Austin, TX
Full-time
Must-have skills
Python
PySpark
SQL
Data Engineering
Big Data
Job Description
We're seeking a Data Engineer to take the lead in implementing and scaling data
collection, storage, processing, and filtering for fine-tuning large language models (LLMs) within
Conversational Engineering. These data pipelines are crucial for powering our cutting-edge
research, safety systems, and product development. If you're passionate about working with
data and are eager to create solutions that directly impact the advancement of LLMs, we'd love
to hear from you. This role provides the exciting opportunity to collaborate closely with applied
ML engineers, software engineers, and data scientists who build our AI systems.
In this role, you will:
• Design, build, and manage scalable data pipelines for collecting, storing, processing, and
filtering large volumes of text data for fine-tuning LLMs.
• Develop and optimize data storage architectures to handle the massive scale of data
required for training state-of-the-art language models.
• Implement efficient data preprocessing, cleaning, and feature extraction techniques to
ensure high-quality data for model training.
• Collaborate with machine learning engineers and researchers to understand their data
requirements and provide tailored solutions for LLM fine-tuning.
• Design and implement robust and fault-tolerant systems for data ingestion, processing,
and delivery.
• Optimize data pipelines for performance, scalability, and cost-efficiency, leveraging
distributed computing frameworks and cloud platforms.
• Ensure the security, privacy, and compliance of data according to industry best practices
and regulatory requirements.
You might thrive in this role if you:
• Have 7+ years of experience as a data engineer, with a strong background in designing
and building large-scale data pipelines.
• Possess deep expertise in distributed computing frameworks such as Apache Spark,
Hadoop, or Flink, and have hands-on experience optimizing data processing at scale.
• Are proficient in programming languages commonly used in data engineering, such as
Python, and have a solid understanding of data structures and algorithms.
• Have extensive experience with cloud platforms like AWS, Google Cloud, or Azure for
data storage, processing, and management.
• Are well-versed in various data storage technologies, including distributed file systems
(e.g., HDFS, S3), databases (e.g., Cassandra, HBase), and data warehouses (e.g.,
Redshift, BigQuery).
• Have hands-on experience with ETL orchestration tools such as Apache Airflow, Dagster,
or Prefect for managing complex data workflows.
• Possess knowledge of natural language processing (NLP) techniques and have worked
with text data preprocessing, normalization, and feature extraction.
• Are passionate about staying up to date with the latest advancements in data
engineering and NLP, and are eager to apply innovative techniques to solve challenging
problems.
• Have strong problem-solving skills, are detail-oriented, and can effectively communicate
technical concepts to both technical and non-technical stakeholders.
Tech Mahindra is an Equal Employment Opportunity employer. We promote and support a diverse workforce at all levels of the company. All qualified applicants will receive consideration for employment without regard to race, religion, color, sex, age, national origin or disability. All applicants will be evaluated solely on the basis of their ability, competence, and performance of the essential functions of their positions with or without reasonable accommodations. Reasonable accommodations are also available in the hiring process for applicants with disabilities. Candidates can request a reasonable accommodation by contacting the company ADA Coordinator at ADA_Accomodations@TechMahindra.com.