Current jobs related to Senior Reliability Engineer for Recommendation Systems - San Jose, California - TikTok


  • San Jose, California, United States Selby Jennings Full time

    About the Role: Selby Jennings is partnering with a global tech company to build out their SRE teams. The company is looking for a Senior SRE Engineer to lead their Recommendation Infrastructure team, working closely with engineers in the US and Asia. Key Responsibilities: Design and implement large-scale distributed systems for high reliability and...


  • San Jose, California, United States Tik Tok Full time

    Recommendation Infrastructure Team: Building the Future of TikTok's Recommendation System. At TikTok, we're committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. We're...


  • San Jose, California, United States TikTok Full time

    About the Role: As a Software Engineer on our Recommendation Architecture Team, you will be responsible for building and maintaining high-performance online services for our recommendation system. Your expertise will help provide the most stable and best experience for our TikTok users. Key Responsibilities: Design and develop high-performance computing...


  • San Jose, California, United States Tik Tok Full time

    About the Role: We are seeking a highly skilled Backend Software Engineer to join our Recommendation Systems team at TikTok. As a key member of our team, you will be responsible for designing and implementing a storage solution for offline data in our recommendation system, which caters to over a billion users. Your primary objectives will be to ensure system...


  • San Jose, California, United States Tik Tok Full time

    Unlock Your Potential as a Senior Site Reliability Engineer at TikTok. TikTok is the leading destination for short-form mobile video, and we're seeking a talented Senior Site Reliability Engineer to join our monetization technology team. About the Role: We're building and running large-scale, globally distributed, fault-tolerant ads systems, and we need someone...


  • San Jose, California, United States Netpace Full time

    Senior Reliability Engineer. The annual base pay range for this position is $95,913 - $144,599. Our salary ranges are determined by role, level, qualifications and work location. About the Role: As a Senior Reliability Engineer at Netpace, you will be responsible for evaluating the reliability of IC products, packages, and process technology to ensure suitability...


  • San Jose, California, United States HireIO Inc Full time

    Senior Site Reliability Engineer. Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at HireIO Inc. As a key member of our Site Reliability Engineering team, you will be responsible for designing, implementing, and operating large-scale, distributed systems that are highly available,...


  • San Jose, California, United States Power Integrations Full time

    Job Title: Senior Reliability Engineer. Power Integrations is seeking a highly skilled Senior Reliability Engineer to join our team. As a key member of our reliability engineering team, you will be responsible for evaluating the reliability of IC products, packages, and process technology. Key Responsibilities: Perform reliability evaluation of IC products,...


  • San Jose, California, United States MILLENNIUMSOFT Full time

    Job Title: Senior Reliability Engineer. Job Summary: We are seeking a highly skilled Senior Reliability Engineer to join our team at MillenniumSoft. As a key member of our Reliability Engineering team, you will play a critical role in the development of new life sciences technology, ensuring our products meet the highest standards of reliability and quality. Key...


  • San Jose, California, United States F5 Full time

    Job Summary: F5 is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will play a pivotal role in ensuring the reliability and scalability of our distributed cloud product. Key Responsibilities: Design and implement automation solutions to reduce toil and improve operational efficiency. Participate in...


  • San Jose, California, United States Employvision Inc. Full time

    Reliability Engineer. Job Summary: At Employvision Inc., we are seeking a highly skilled Senior Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for evaluating IC products, packages, and process technologies for reliability, ensuring compliance with industry standards and suitability for end-use...


  • San Jose, California, United States Tik Tok Full time

    Senior Site Reliability Engineer, Global E-Commerce. TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy to our users. As a Senior Site Reliability Engineer on our Global E-Commerce team, you will play a critical role in ensuring the reliability and scalability of our platform. Responsibilities: Be...


  • San Jose, California, United States Tik Tok Full time

    Job Title: Senior Site Reliability Engineer, Global E-Commerce. TikTok is a leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. As a Senior Site Reliability Engineer on our Global E-Commerce team, you will play a critical role in ensuring the reliability and scalability of our e-commerce...


  • San Jose, California, United States HireIO Inc Full time

    Job Description: At HireIO Inc, we're seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our Site Reliability Engineering (SRE) team, you will be responsible for designing, implementing, and operating large-scale, distributed systems. Responsibilities: Design and implement software platforms and monitor frameworks for...


  • San Diego, California, United States Insight Global Full time

    Job Summary: We are seeking a highly skilled Senior Systems Reliability Engineer to join our team at Insight Global. As a key member of our systems engineering team, you will be responsible for managing the verification and traceability of product requirements and specifications for new hardware and software products. Key Responsibilities: Manage the...


  • San Jose, California, United States Tik Tok Full time

    Senior Site Reliability Engineer, Global E-commerce. TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy to our users. As a Senior Site Reliability Engineer on our Global E-commerce team, you will play a critical role in ensuring the reliability and scalability of our e-commerce platform. Key...


  • San Jose, California, United States Power Integrations Full time

    Job Description: As a Senior Reliability Engineer at Power Integrations, you will be responsible for evaluating the reliability of IC products, packages, and process technology to ensure suitability for end applications and conformance to industry standards. You will perform device-level failure analysis to gain a deep understanding of IC device failures,...


  • San Jose, California, United States Tik Tok Full time

    About the Role: We are seeking a highly skilled Machine Learning Engineer to join our E-commerce team. As a key member of our team, you will be responsible for designing and developing large-scale recommendation algorithms for our e-commerce platform. Your primary focus will be on building and optimizing our recommendation system to provide the best possible...


  • San Jose, California, United States Power Integrations Full time

    Job Description: As a Senior Reliability Engineer at Power Integrations, you will be responsible for evaluating the reliability of IC products, packages, and process technology to ensure suitability for end applications and conformance to industry standards. Your expertise in device-level failure analysis will enable you to gain a deep understanding of IC...


  • San Jose, California, United States Tik Tok Full time

    Site Reliability Engineer. Job Summary: TikTok is seeking a skilled Site Reliability Engineer to join our monetization technology team. As a key member of our team, you will be responsible for designing, building, and operating large-scale, globally distributed ads systems. Key Responsibilities: Design and implement high-availability...

Senior Reliability Engineer for Recommendation Systems

2 months ago


San Jose, California, United States TikTok Full time

About TikTok
TikTok stands as the premier platform for short-form mobile video, dedicated to inspiring creativity and spreading joy. With global headquarters in major cities, we foster a vibrant community across various locations.

Why Work with Us
At TikTok, creativity is at the heart of our mission. Our platform is designed to nurture imaginative minds, and this ethos extends to our teams. Together, we strive to bring joy and inspire creativity, facing challenges as opportunities for learning, innovation, and growth.

About the Recommendation Infrastructure Team
This team is pivotal in constructing and refining the architecture of our recommendation systems, ensuring a seamless and reliable experience for users. Site Reliability Engineers (SREs) within this team are tasked with maintaining system uptime and developing automated systems and pipelines.

Key Responsibilities

  • Enhance the entire lifecycle of recommendation systems, from design consulting to deployment and operational refinement.
  • Develop tools and software to bolster service reliability and scalability, automate operations, and enhance research and development efficiency.
  • Ensure the availability of large-scale services across global data centers.
  • Oversee cloud resource utilization and ensure that service level agreements (SLAs) are met for large clusters.
  • Monitor and assess availability, latency, and overall service health.
  • Implement sustainable incident response strategies and conduct thorough postmortems.

Qualifications

  • A Bachelor's degree or higher in Computer Science or a related discipline.
  • A minimum of 2 years of experience in SRE roles focused on large-scale system deployment with an emphasis on reliability and scalability.
  • Proficiency in Linux system operations and networking.
  • Experience with programming languages such as Python, Perl, Go, or C/C++.
  • Skill in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Familiarity with popular CI/CD processes and environments.
  • Strong communication skills coupled with a sense of ownership and initiative.

Commitment to Diversity
TikTok is dedicated to fostering an inclusive environment where every employee's skills, experiences, and perspectives are valued. We aim to reflect the diverse communities we engage with through our platform.

Accommodations
We are committed to providing reasonable accommodations during the recruitment process for candidates who need them because of a disability or another legally protected reason.