Data Linguist

4 weeks ago


Luling, Texas, United States | Amazon Development Center U.S., Inc. | Full time

About the Role

Amazon Development Center U.S., Inc. is seeking a skilled Data Linguist to join our team. As a Data Linguist, you will play a critical role in delivering high-quality training data to ensure the best performance of our machine learning systems.

Key Responsibilities

  • Build a thorough understanding of data collection and annotation guidelines and various annotation tools.
  • Annotate, generate, and QA data, identifying linguistic categories in accordance with detailed annotation guidelines.
  • Use generative AI to facilitate workflows or automate repetitive tasks.
  • Monitor AI outputs for biases or ethical issues and adjust inputs to mitigate these risks.
  • Perform annotation-related tasks, including data generation, collection, and quality assurance.
  • Collaborate with other ML Data Linguists to resolve data ambiguities and annotation disagreements.
  • Dive deep into the data to perform qualitative error trend analysis and devise action plans to improve data quality.
  • Provide feedback to Language Engineers and Scientists on tool improvements and annotation processes.
  • Dive deep into issues and implement solutions independently.
  • Contribute to process improvements to reduce handling time and improve resource output.
  • Develop various language artifacts crucial for model development, such as datasets for training and evaluation.
  • Support and consult in pre-screening interviews for Data Associates.
  • Collaborate with LEs, scientists, and Ops Manager to innovate processes, tracker automations, and workflows.
  • Assist LEs in communication with vendors to provide detailed feedback to annotators.

About the Team

The Bedrock team is a group of data linguists who primarily support the training of different models on the AWS generative AI platform.

We work with different model types, such as text-to-text, text-to-image, text-to-speech, and others, generating data for ML model training as well as for toxic content evaluation and categorization.

The aspects of ML development that the Bedrock team works with include Responsible AI, Reinforcement Learning from Human Feedback, Supervised Fine-Tuning, and Human Content Evaluation.

Requirements

  • Bachelor's degree in Linguistics, Philosophy, Cognitive Science, a foreign language, or Literature.
  • Fluent or native proficiency in at least one of the following languages: Finnish, Swedish, Norwegian, Polish, or Vietnamese.
  • Ability to identify linguistic ambiguity and other inaccuracies in linguistic data, identify basic parts of speech, and produce reports on analyzed data.
  • Experience with natural language data labeling, data annotation, linguistic annotation, or other forms of data markup, as well as experience leading a team of peers.
  • Knowledge of different domains such as Finance, Health Care, and/or Insurance.
  • Ability to generate innovative and diverse inputs to explore various aspects of an AI model's capabilities.
  • Familiarity with JSON, YAML, XML, or other forms of text markup.
  • Ability to navigate a Unix terminal and use common command line tools.
  • Knowledge of Python, Java, or any other scripting language.
  • Strong organizational and leadership skills, with close attention to detail.
  • Ability to communicate well with, and actively listen to, other data associates on a team.
  • Ability to deliver high-quality results under tight deadlines.
  • Comfortable working in a fast-paced, collaborative work environment.
  • Willingness to support several projects at one time, and to accept re-prioritization as necessary.

Preferred Qualifications

  • Master's degree in a relevant field, such as Linguistics, Communications, a foreign language, computational linguistics, or other language or data-related disciplines.
  • Proficiency in another foreign language.
  • Familiarity with common text processing tools.
  • Passion for language, linguistics, human language technology, and AI.
  • Ability to work across different operating systems (Windows, macOS, or Linux).
  • Strong understanding of NLP concepts and techniques.

Amazon is an Equal Opportunity Employer

Amazon is committed to a diverse and inclusive workplace. We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

Los Angeles County Applicants

Job duties for this position include: working safely and cooperatively with other employees, supervisors, and staff; adhering to standards of excellence despite stressful conditions; communicating effectively and respectfully with employees, supervisors, and staff to ensure exceptional customer service; and following all federal, state, and local laws and Company policies.

Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position.

These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress, work safely and respectfully with others, exhibit trustworthiness and professionalism, and safeguard business operations and the Company's reputation.

Pursuant to the Los Angeles County Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Compensation

The base pay for this position ranges from $40,400/year in our lowest geographic market up to $86,500/year in our highest geographic market.

Pay is based on a number of factors, including market location, and may vary depending on job-related knowledge, skills, and experience.

Amazon is a total compensation company.

Depending on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits.