Data Scientist

2 weeks ago


Santa Clara, United States Net2Source Inc. Full time


Location - Santa Clara, CA (Hybrid)

Contract / Full Time


This is a data scientist role focused on transforming large, complex (hundreds of terabytes), multi-dimensional data, e.g. tabular (relational) data and unstructured data such as images, videos, audio files, and various other file formats. The key responsibility is to curate high-quality training data for large language model training.


Responsibilities:

  • Curate high-quality datasets and synthesize training data where needed to improve model capabilities.
  • Champion modeling, EDA, transformation, modernization, and curation of high-quality training data for GPT-4 and GPT-4 Vision.
  • Provide data curation leadership on tabular and unstructured (images, video, log files, etc.) data.
  • Create data definitions and data lineage to support effective, high-accuracy LLM training.
  • Help build and test prompts that yield high-quality insights.
  • Train and fine-tune language models using frameworks like PyTorch and TensorFlow.
  • Rigorously test models to evaluate accuracy, bias, toxicity, and other attributes using statistical analysis.
  • Monitor metrics and logs from LLMs in deployment to proactively identify any degraded performance or anomalies.
  • Diagnose root causes when models err or behave unexpectedly using techniques like saliency maps, heatmap visualizations and interactive debugging.
  • Improve model robustness by analyzing model behavior and identifying failure modes. Recommend data augmentation, training modifications, etc.
  • Perform model surgery by carefully editing model weights and architectures to fix incorrect or unsafe behavior while maintaining performance.
  • Run A/B experiments to measure the impact of model tweaks and fixes on performance, accuracy, toxicity, bias, etc.
  • Continuously inspect models for signs of concept drift or staleness and recommend retraining cadence.
  • Document LLM version changes, experiments, and incident response postmortems.
  • Stay updated on the latest techniques from research and industry conferences for responsible and reliable deployment of LLMs.

Requirements:

  • 8+ years of experience training, deploying, and monitoring natural language models
  • Strong statistical skills and large-scale data manipulation capabilities
  • Proficiency in Azure Machine Learning Studio and in deploying models in Azure Cloud environments
  • Deep knowledge of Azure SQL and vector databases
  • Proficiency in Python, PyTorch, TensorFlow, NLP libraries and other ML tools
  • Knowledge of responsible AI principles around transparency, fairness and accountability
  • Knowledge of the AutoGen and LangChain/LlamaIndex frameworks

