Data Scientist 3

2 weeks ago


Louisville, KY, United States Hire Talent Full time
Role Overview:
We are seeking a Senior Data Scientist to build and deploy LLM-based capabilities for working with large, diverse datasets and documents relevant to growth analytics & bid strategy. This role emphasizes ingestion, document processing, information extraction, and retrieval methods to support analytics use cases in production. Experience with modern LLM tooling and Databricks is required; hands-on experience with advanced reasoning models & agentic/orchestration frameworks is a plus.

Key Responsibilities:
Architect, build, and refine retrieval-grounded LLM systems, including basic and advanced RAG patterns, to deliver grounded, verifiable answers and insights.
Design robust pipelines for ingestion, transformation, and normalization of public and internal data, including ETL, incremental processing, and data quality checks.
Build and maintain document processing workflows across PDFs, HTML, and scanned content, including OCR, layout-aware parsing, table extraction, metadata enrichment, and document versioning.
Develop information extraction pipelines using LLM methods and best practices, including schema design, structured outputs, validation, error handling, and accuracy evaluation.
Own the retrieval stack end-to-end, including chunking strategies, embeddings, indexing, hybrid retrieval, reranking, filtering, and relevance tuning across a vector database or search platform.
Implement web data acquisition where needed, including scraping, change detection, source quality checks, and operational safeguards like retries and rate limiting.
Establish evaluation and monitoring practices for retrieval and extraction quality, including golden datasets, regression testing, groundedness checks, and production observability.
Collaborate with subject matter experts to translate business needs into practical retrieval and extraction workflows and measurable success criteria.
Communicate complex findings, tradeoffs, and recommendations to technical and business stakeholders, supporting data-driven forecasting and strategy.
Ensure compliance with data governance and security standards when handling sensitive data and deploying systems to production environments.
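
As a rough illustration of two retrieval-stack items named above (chunking strategies and hybrid retrieval), the sketch below splits text into overlapping word windows and merges several ranked result lists with reciprocal rank fusion. The function names, the window sizes, and the choice of reciprocal rank fusion are assumptions made for illustration only; the posting does not prescribe any particular method.

```python
from collections import defaultdict


def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids, e.g. one keyword and one vector ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    print(chunk_words("a b c d e f g", size=3, overlap=1))
    keyword_hits = ["doc1", "doc2", "doc3"]   # hypothetical keyword ranking
    vector_hits = ["doc2", "doc3", "doc1"]    # hypothetical embedding ranking
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

In practice the two input rankings would come from a search platform and a vector index; here they are hard-coded placeholders.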

Qualifications:
Advanced degree in Computer Science, Data Science, Statistics, Engineering, or a related quantitative field.
Minimum of 4 years of experience in data science or applied ML/NLP, with a focus on NLP & GenAI.
Proficiency in Python and SQL, with strong engineering practices for maintainable, testable pipelines.
Strong experience with Databricks for data processing and pipeline development, including Spark and common lakehouse patterns.
Demonstrated experience building retrieval-grounded LLM systems and/or LLM-based information extraction for real-world use cases.
Experience with document ingestion and parsing, including OCR and handling messy, semi-structured content such as PDFs, tables, forms, and web pages.
Familiarity with vector databases and retrieval concepts, including indexing, embeddings, hybrid retrieval, reranking, and performance and cost tuning.
Strong understanding of best practices for reasoning models and techniques that improve reliability and reduce hallucinations, including grounding and attribution.
Excellent communication skills, with a track record of partnering with stakeholders and turning ambiguous requests into adopted solutions.

Libraries and Tools:
Proficiency with LLM and orchestration libraries such as: openai, google-genai, langgraph, langchain.
Experience with supporting tooling commonly used in production LLM systems, for example: pydantic for schema validation, tenacity for retries, beautifulsoup4 for HTML data extraction, and standard Python data tooling such as pandas and numpy.
Experience with retrieval and vector tooling, such as: FAISS, Elasticsearch or OpenSearch, and vector database platforms (for example Pinecone, Weaviate, Milvus, Chroma).
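
The schema-validation and retry tooling mentioned above might be combined along the following lines. This is a minimal sketch using only the Python standard library as stand-ins: a dataclass with hand-written checks takes the place of a pydantic model, and a small backoff loop takes the place of tenacity. The `BidRecord` schema, its field names, and the `call_model` callable are hypothetical and do not come from the posting.

```python
import json
import time
from dataclasses import dataclass


@dataclass
class BidRecord:
    """Hypothetical extraction schema; field names are illustrative."""
    vendor: str
    amount: float


def parse_bid(raw_json):
    """Validate a model's JSON output, raising ValueError on bad data."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    vendor = data.get("vendor")
    amount = data.get("amount")
    if not isinstance(vendor, str) or not vendor.strip():
        raise ValueError("vendor must be a non-empty string")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        raise ValueError("amount must be a non-negative number")
    return BidRecord(vendor=vendor.strip(), amount=float(amount))


def extract_with_retries(call_model, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `call_model()` until its output validates, backing off exponentially."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return parse_bid(call_model())
        except ValueError as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"extraction failed after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    # Stand-in for an LLM call: the first output is malformed, the second is valid.
    outputs = iter(["not json", '{"vendor": "Acme", "amount": 1200}'])
    print(extract_with_retries(lambda: next(outputs), base_delay=0.01))
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays; a production version would more likely rely on the named libraries than on this hand-rolled logic.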

Preferred Qualifications:
Exposure to agentic patterns and tool-calling for workflow automation.
Experience working in regulated environments and implementing governance controls such as access control, auditability, and retention.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.
  • Data Scientist

    2 weeks ago


    Louisville, KY, United States NTT DATA Group Corporation Full time

    At NTT DATA, we know that with the right people on board, anything is possible. The quality, integrity, and commitment of our employees have been key factors in our company's growth and market presence. By hiring the best people and helping them grow both professionally and personally, we ensure a bright future for NTT DATA and for the people who work here....

  • Data Scientist

    1 week ago


    Louisville, KY, United States NTT DATA Group Corporation Full time

    At NTT DATA, we know that with the right people on board, anything is possible. The quality, integrity, and commitment of our employees have been key factors in our company's growth and market presence. By hiring the best people and helping them grow both professionally and personally, we ensure a bright future for NTT DATA and for the people who work here....


  • Louisville, KY, United States Ford Motor Company Full time

    We are seeking a highly analytical and results-oriented Data Scientist with 3 years of experience specializing in leveraging IIoT data from manufacturing environments (Rockwell PLCs/VFDs, ControlLogix) to drive operational improvements and develop intelligent solutions. Proficient in the end-to-end data science lifecycle, from data ingestion (MQTT) and...


  • Louisville, KY, United States Ford Motor Company Full time

    We are seeking a highly analytical and results-oriented Data Scientist who can leverage IIoT data from manufacturing environments to drive operational improvements and develop intelligent solutions. Proficient in the end-to-end data science lifecycle, from data ingestion (MQTT) and cloud-based processing (GCP) to advanced ML modeling (Python, SQL) and...


  • Louisville, KY, United States Purple Drive Full time

    Role: Azure Data Bricks Developer Location: LOUISVILLE, KY Key Responsibilities: Design, develop, and maintain scalable data pipelines using Azure Databricks and Apache Spark Integrate data from various sources including Azure Data Lake, Azure Blob Storage, SQL Server, and external APIs Implement ETL/ELT processes to support data ingestion,...


  • Louisville, KY, United States Relativity Full time

    Posting Type Hybrid Job Overview At Relativity, we're building a world-class Applied Science team to push the boundaries of intelligent systems in the legal domain. We're looking for a Staff Applied Scientist to join our team. Agentic AI (systems that perceive, think, and act) is not a far-off vision for us. It's already embedded in how Relativity aiR makes...


  • Louisville, KY, United States ChannelAdvisor Full time

    Senior Data Engineer Rithum is the world's most trusted commerce network, accelerating how brands, suppliers, and retailers work together to deliver seamless e-commerce experiences. We provide an unmatched platform for brands and retailers, enabling them to accelerate growth, optimise operations across channels, scale product offerings and enhance margins....

