Large Scale Data Infrastructure Specialist

7 days ago


San Francisco, California, United States Anthropic Full time
About Anthropic

We are a leading AI research organization dedicated to developing safe, ethical, and powerful artificial intelligence. Our mission is to ensure that transformative AI systems are aligned with human interests.

Our Pretraining team is seeking a highly skilled Research Engineer to join our efforts in developing the next generation of large language models. In this role, you will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems.

Key Responsibilities
  • Design and implement high-performance data processing infrastructure for large language model training, ensuring scalability and reliability.
  • Develop and maintain core processing primitives, such as tokenization, deduplication, and chunking, with a focus on scalability and efficiency.
  • Build robust systems for data quality assurance and validation at scale, ensuring high-quality data for model training.
  • Implement comprehensive monitoring systems for data processing infrastructure, providing real-time insights into system performance.
  • Create and optimize distributed computing systems for processing web-scale datasets, leveraging cloud computing platforms and distributed systems architecture.
  • Collaborate with research teams to implement novel data processing architectures, driving innovation in AI research.
  • Build and maintain documentation for infrastructure components and systems, ensuring transparency and reproducibility.
Qualifications
  • Strong software engineering skills, with experience in building distributed systems and expertise in Python and distributed computing frameworks.
  • Deep understanding of cloud computing platforms and distributed systems architecture, with experience in designing and implementing high-throughput, fault-tolerant systems.
  • Strong background in performance optimization and system scaling, with excellent problem-solving skills and attention to detail.
  • Excellent communication skills and ability to work in a collaborative environment, with strong interpersonal skills and adaptability.
Preferred Experience
  • Advanced degree (MS or PhD) in Computer Science or related field, with a strong background in distributed systems and parallel computing.
  • Experience with language model training infrastructure, including tokenization algorithms and techniques, and high-throughput, fault-tolerant system design.
  • Deep knowledge of monitoring and observability practices, with experience in infrastructure-as-code and configuration management.
  • Background in MLOps or ML infrastructure, with a strong understanding of machine learning research and its infrastructure requirements.
What You'll Enjoy About This Role
  • The opportunity to work at the forefront of AI research, contributing to the development of safe and ethical AI systems.
  • The chance to collaborate with a talented team of researchers, engineers, and policy experts, driving innovation in AI research.
  • The flexibility to work on complex technical challenges, driving solutions and delivering results independently.
  • The opportunity to learn about machine learning research and its infrastructure requirements, with room for professional growth and development.
Salary: $175,000 - $225,000 per year

  • San Francisco, California, United States Unity Full time

    Company Overview: Unity is the world's leading platform of tools for creators to build and grow real-time games, apps, and experiences across multiple platforms. With over 69% of the top 1,000 mobile games made with Unity, as of the fourth quarter of 2023, the company has established itself as a key player in the industry. In 2023, Made with Unity applications...


  • San Francisco, California, United States Databricks Full time

    At Databricks, we are passionate about helping data teams solve complex problems. Our Platform Engineering team provides the core infrastructure that powers our platform. We are rebuilding our platform from the ground up to meet growing needs. We are seeking a Senior Staff Software Engineer to join this team and spearhead our strategy, lead high-stakes...


  • San Francisco, California, United States Databricks Full time

    Databricks is passionate about empowering data teams to tackle the world's most complex problems. We are looking for a Technical Lead to spearhead our infrastructure development efforts, driving strategic initiatives and shaping the future of our platform. As a key member of our Platform Engineering team, you will have a profound impact on the success of our...


  • San Francisco, California, United States Unity Technologies Full time

    At Unity, we're at the forefront of innovation in real-time 3D technologies. Our platform empowers creators to build and grow exceptional experiences across multiple platforms. As a member of our Data & ML Platform team, you'll be working on designing and optimizing large-scale data platforms and machine learning infrastructure systems for efficiency,...


  • San Francisco, California, United States OpenAI Full time

    We are seeking an experienced Senior Software Engineer to lead our Data Acquisition team. The ideal candidate will have a strong background in large-scale distributed systems and data processing. The successful candidate will own and lead engineering projects in the area of data acquisition, including web crawling, data ingestion, and search. They will...


  • San Francisco, California, United States Tbwa ChiatDay Inc Full time

    As a seasoned Machine Learning Engineer, you will play a pivotal role in building the next generation of answer engines at Perplexity. Our cutting-edge technology empowers users to find information in new and more effective ways, serving millions of users worldwide. In pursuit of this ambitious mission, we are seeking talented engineers to join our dynamic...


  • San Francisco, California, United States OpenAI Full time

    About OpenAI: OpenAI is a pioneering AI research and deployment company dedicated to harnessing the power of artificial intelligence for the betterment of humanity. Our mission is to create safe and beneficial AI systems that elevate human capabilities. We are committed to fostering an inclusive and diverse work environment, where talented individuals from...


  • San Francisco, California, United States Tbwa ChiatDay Inc Full time

    Company Overview: Tbwa Chiat/Day Inc is a leading community of communities, built on shared interests and trust. It's home to the most open and authentic conversations on the internet, with 100,000+ active communities and approximately 82M+ daily active unique visitors. We're now looking for a talented individual to accelerate our data-centric culture and...


  • San Francisco, California, United States Unity Technologies Full time

    We are seeking a talented Senior Data and ML Infrastructure Engineer to join our team at Unity Technologies. This role is responsible for designing and optimizing large-scale data platforms and machine learning infrastructure systems for efficiency, reliability, and cost-effectiveness. Job Overview: Unity is the world's leading platform of tools for creators to...


  • San Francisco, California, United States Unity Technologies Full time

    About the Role: We're seeking a skilled Senior Data and ML Infrastructure Engineer to join our team at Unity. As a key member of our Data & ML Platform team, you will design and optimize large-scale data platforms and machine learning infrastructure systems for efficiency, reliability, and cost-effectiveness. Key Responsibilities: Design and optimize large-scale...


  • San Francisco, California, United States Genmo Full time

    At Genmo, we're pushing the boundaries of video generation and Artificial General Intelligence (AGI). As an experienced Senior/Staff AI Infra Engineer, you'll play a key role in designing and scaling our petabyte-scale data infrastructure. Role Overview: We're seeking someone with strong technical expertise to create robust, scalable systems that manage...


  • San Francisco, California, United States OpenAI Full time

    About the Role: The Applied Data Platform team is responsible for designing, building, and operating the foundational data infrastructure that enables products and teams at OpenAI. Key Responsibilities: Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure while...


  • San Francisco, California, United States Pinterest Full time

    Principal Software Engineer, Ads Infrastructure. About Us: Pinterest is one of the fastest growing online advertising platforms. Our continued success depends on rapidly scaling our core revenue-generating systems. We are looking for a Principal Software Engineer to design and build the next-gen version of key infra components in our monetization ecosystem. This...


  • San Diego, California, United States Apple Full time

    About Apple: Apple is a technology leader that designs, manufactures, and markets consumer electronics, computer software, and online services worldwide. Our innovative products and services are integral to the daily lives of people around the world. Salary: $125,000 - $182,000 per year. Job Description: We are seeking an experienced Senior Data...


  • San Francisco, California, United States DoorDash USA Full time

    DoorDash is a leading food delivery and logistics company, and our Data Engineering team plays a crucial role in building database solutions to support various use cases. As a Staff Software Engineer, Data, you will be responsible for architecting and scaling our data reliability, infrastructure, automation, and tools to meet growing business needs. About the...


  • San Francisco, California, United States Abridge Full time

    We are a growing team of practitioners, scientists, and engineers working together to empower people and make care more understandable. Abridge is at the forefront of leveraging AI to transform the healthcare industry. Our generative AI-powered products are revolutionizing the practice of medicine, and we're seeking a highly motivated Data Infrastructure...


  • San Francisco, California, United States Magic AI Full time

    Job Overview: Magic AI is a cutting-edge technology company focused on building safe Artificial General Intelligence (AGI) to accelerate humanity's progress on the world's most important problems. Salary Range: $100K - $550K per year, with equity and benefits included in total compensation. About the Role: We're seeking a highly skilled AI Data Engineer to join our...


  • San Francisco, California, United States Unreal Gigs Full time

    Unlock the Full Potential of Your Data: As a Chief Data Infrastructure Strategist at Unreal Gigs, you will play a pivotal role in designing and implementing scalable data frameworks that support business analytics, data science, and operational reporting. With 5+ years of experience in data architecture or a related field, you will leverage your expertise to...


  • San Francisco, California, United States Unreal Gigs Full time

    Design and Build AI Infrastructure: Architect and implement scalable infrastructure that supports AI workloads, including machine learning model training, large-scale data processing, and real-time inference. As an AI Infrastructure Engineer, you'll design solutions that ensure high availability, fault tolerance, and performance optimization.


  • San Francisco, California, United States Scale AI, Inc. Full time

    Scale AI, Inc. We are seeking a highly skilled Machine Learning Engineer to join our team at Scale AI, Inc. as a key member of our research and development efforts in large language models. This is an exciting opportunity to work with the industry's leading foundation model labs, design next generation data pipelines, and contribute technically and...