Research Engineer, Trust and Safety

1 week ago


San Francisco, California, United States Anthropic Full time
About Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems that benefit society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Role

We are seeking a Research Engineer to help design and build safety and oversight algorithms for our AI models and products. As a Trust and Safety Research Engineer, you will design and train ML models, informed by research progress, that detect harmful user and model behaviors and help safeguard societal well-being.

Key Responsibilities
  • Design, iterate on, and build ML models to detect unwanted or anomalous behaviors from both users and LLMs
  • Work with T&S ML engineers to review and iterate on experiment ideas, co-authoring success criteria and production deployment roadmaps
  • Partner with cross-functional T&S Policy and Enforcement teams to understand emerging and sustained abuse patterns in user prompts and behaviors, incorporating insights into T&S research datasets
  • Surface abuse patterns to sibling research teams, collaborating to harden Anthropic's LLMs at the pre-training and post-training stages
  • Stay current with state-of-the-art research in AI and machine learning, proposing ways to apply these advancements to T&S systems
Requirements
  • 4+ years of experience in research engineering or applied research scientist positions, preferably with a focus on trust and safety
  • Significant Python programming experience and machine learning expertise
  • Proficiency in building trustworthy and safe AI technology
  • Strong communication skills and ability to explain complex technical concepts to non-technical stakeholders
  • Care about societal impacts and long-term implications of work, with a results-oriented approach
Preferred Qualifications
  • Experience fine-tuning large language models with supervised learning or reinforcement learning
  • Experience with machine learning frameworks like scikit-learn, TensorFlow, or PyTorch
  • Experience authoring research papers in machine learning, NLP, or AI alignment, or similar industry experience
  • Experience developing evaluations for language models
Compensation and Benefits

Anthropic's compensation package consists of three elements: salary, equity, and benefits. We aim for these elements collectively to be highly competitive with market rates.

Equity will be a major component of the total compensation for eligible roles. We offer a range of benefits, including comprehensive health, dental, and vision insurance, a 401(k) plan with 4% matching, 22 weeks of paid parental leave, unlimited PTO, and more.

Employees based outside of the UK or US will receive a different benefits package.

How We're Different

We believe that the highest-impact AI research will be big science. At Anthropic, we work as a single cohesive team on just a few large-scale research efforts. We value impact, meaning progress toward our long-term goals of steerable, trustworthy AI, over work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science.

We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.



  • San Francisco, California, United States OpenAI Full time

    Job Title: Software Developer - Trust and Safety. Company: OpenAI. Join OpenAI's Applied AI team, where our mission is to responsibly implement cutting-edge AI technologies. We are seeking a skilled software developer to enhance our trust and safety initiatives. About the Position: As a software developer on our team, you will be instrumental in designing and...


  • San Francisco, California, United States Center for AI Safety Full time

    About the Role: We are seeking a highly skilled Research Engineer to join our team at the Center for AI Safety. As a Research Engineer, you will play a key role in advancing the field of AI safety through technical and conceptual research, collaboration with academics and industry researchers, and promotion of safety within the broader machine learning...


  • San Francisco, California, United States Indiana Biosciences Research Institute Full time

    Job Title: Research Engineer, Horizons. Job Summary: We are seeking a highly skilled Research Engineer to join our team at the Indiana Biosciences Research Institute. As a Research Engineer, you will play a key role in advancing the capabilities and safety of large language models through fundamental AI research in reinforcement learning. Key...

  • Software Engineer

    1 week ago


    San Jose, California, United States TikTok Full time

    About TikTok: TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to users worldwide. Our mission is to empower creators and communities to thrive on our platform. Job Summary: We are seeking a highly skilled Full Stack Software Engineer to join our Trust and Safety team. As a key member of our platform, you will be...


  • San Francisco, California, United States Center for AI Safety Full time

    About the Role: We are seeking a highly motivated and skilled Research Engineer Intern to join our team at the Center for AI Safety. As a research engineer intern, you will work closely with our researchers on projects related to AI safety, machine learning, and artificial intelligence. Key Responsibilities: Conduct research and experiments in AI safety and...


  • San Francisco, California, United States Lionheart Ventures Company Defunct Full time

    About the Role: We are seeking a highly skilled Research Engineer to join our team at Lionheart Ventures Company Defunct. As a key member of our Trust and Safety team, you will play a critical role in designing and building safety and oversight algorithms for our AI models and products. Your primary responsibility will be to work on designing, iterating, and...


  • San Francisco, California, United States DoorDash Full time

    About the Role: DoorDash is seeking an experienced Engineering Manager to lead our Safety and Security team. This dedicated group is responsible for providing a comprehensive catalog of Trust and Safety measures, focusing on safety incident prevention, incident response and support, and beyond. We believe that innovations on Trust and Safety measures lead to...


  • San Francisco, California, United States Trust A-tech Full time

    We are seeking a Program Manager to spearhead initiatives primarily aimed at synchronizing technical strategies and facilitating execution between our internal engineering divisions and key cloud partners. Our AI models are available on both our proprietary platforms and API, as well as on major cloud services. It is essential to ensure effective...


  • San Francisco, California, United States Anthropic Full time

    About Anthropic: At Anthropic, our goal is to develop AI systems that are dependable, understandable, and controllable. We aim for our AI to be secure and advantageous for both our users and society at large. Our team comprises a rapidly expanding group of dedicated researchers, engineers, policy specialists, and business leaders collaborating to create...


  • San Francisco, California, United States Anthropic Full time

    About Anthropic: At Anthropic, our mission is to develop AI systems that are reliable, interpretable, and controllable. We aim for our AI technologies to be safe and advantageous for both our users and society at large. Our rapidly expanding team consists of dedicated researchers, engineers, policy specialists, and business leaders collaborating to create...

  • Research Scientist

    2 weeks ago


    San Francisco, California, United States AI Safety, Inc Full time

    About the Role: We are seeking a highly motivated and skilled Research Engineer Intern to join our team at AI Safety, Inc. As a Research Engineer Intern, you will play a key role in advancing our mission to ensure the safety of future artificial intelligence systems. Key Responsibilities: Conduct research in machine learning safety and contribute to the...

  • Software Engineer

    4 days ago


    San Jose, California, United States TikTok Full time

    About TikTok: TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy to our users. With a global presence and a diverse team, we strive to create a platform that is both entertaining and safe. Job Summary: We are seeking a highly skilled Full Stack Software Engineer to join our Trust and Safety team. As a...

  • Research Scientist

    6 days ago


    San Francisco, California, United States AI Safety, Inc Full time

    About the Role: We are seeking a highly motivated and talented Research Engineer Intern to join our team at the Center for AI Safety. As a research engineer intern, you will work closely with our researchers on projects in fields such as Trojans, Adversarial Robustness, Power Aversion, Machine Ethics, and Out-of-Distribution Detection. Key Responsibilities: Plan...


  • San Francisco, California, United States OpenAI Full time

    About the Team: The Safety Systems team is dedicated to various safety initiatives aimed at ensuring our advanced models can be responsibly deployed in real-world applications for the benefit of society. This team is pivotal to OpenAI's mission of developing and implementing safe AGI, reinforcing our commitment to AI safety while promoting a culture of trust...


  • San Francisco, California, United States Strava Full time

    About This Role: Strava is the leading subscription platform at the center of connected fitness, with over 100 million community members in more than 190 countries. We're seeking a Senior Technical Program Manager for our Trust & Safety Team to lead cross-functional programs and initiatives that span Product Management, Design, Research, Engineering, Legal,...

  • Research Scientist

    8 hours ago


    San Francisco, California, United States Center for AI Safety Full time

    About the Role: We are seeking a highly skilled Research Engineer to join our team at the Center for AI Safety. As a Research Engineer, you will play a key role in advancing the field of AI safety through technical and conceptual research. Key Responsibilities: Pursue a variety of research projects in fields such as Power Aversion, Trojans, Machine Ethics, and...


  • San Jose, California, United States TikTok Full time

    About the Role: We are seeking a highly skilled Full Stack Software Engineer to join our Trust and Safety Platform team at TikTok. As a key member of our team, you will be responsible for designing and developing highly scalable, efficient, and robust trust and safety platforms and tools. Key Responsibilities: Design and develop web-based application pages...


  • San Francisco, California, United States Russell Tobin Full time

    Position Overview. Contract Duration: 8+ Months. Location: Remote. Compensation: $50/hr W2 - $54/hr W2. About the Opportunity: We are looking for a Product Lead to spearhead our Trust & Safety division. In this role, you will be responsible for upholding the integrity of user-generated content, safeguarding against fraudulent activities, and ensuring the security of...


  • San Francisco, California, United States RI Research Instruments GmbH Full time

    About the Role: We are seeking a highly skilled Research Prompt Engineer to join our team at RI Research Instruments GmbH. As a Research Prompt Engineer, you will play a crucial role in developing and refining the prompts that instruct our large language models to deliver high-quality outputs. Key Responsibilities: Develop new and innovative prompting strategies...


  • San Francisco, California, United States OpenAI Full time

    About the Team: The Safety Systems team at OpenAI is at the forefront of developing and deploying safe artificial general intelligence (AGI), driving our commitment to AI safety and fostering a culture of trust and transparency. The Safety Reasoning Research team is poised at the intersection of short-term pragmatic projects and long-term fundamental research,...