Site Reliability Engineer

2 weeks ago


San Jose, California, United States TikTok Full-time

About Us

TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to users worldwide.

Our global offices in Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo foster a collaborative environment where imagination thrives.

Our Mission

We aim to create an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

Our platform connects people from diverse backgrounds, and so does our workplace.

Job Description

We're seeking a Site Reliability Engineer to join our monetization technology team, responsible for building and running large-scale, globally distributed, fault-tolerant ads systems.

The ideal candidate will ensure high availability, scalability, and operability of services, measuring and monitoring availability, latency, and overall service health.

Key Responsibilities:

  1. Engage in and improve the whole lifecycle of ads systems, from system design consulting through launch reviews, deployment, operation, and refinement.
  2. Build and maintain high availability for services deployed across multiple data centers globally.
  3. Deliver tools/software to improve the reliability, scalability, and operability of services.
  4. Measure and monitor availability, latency, and overall service health.
  5. Practice sustainable incident response and postmortems.
  6. Participate in on-call rotations across continents.

Requirements

Minimum Qualifications:

  1. Bachelor's degree in Computer Science or similar technical field of study, or equivalent practical experience.
  2. Programming experience in at least one of the following languages: C, C++, Java, Python, Perl, or Go.
  3. Expertise in Unix/Linux operating systems and IP networking.
  4. Experience troubleshooting application issues or supporting production operations.
  5. Experience in automating routine tasks.
  6. Effective communication skills and a sense of ownership and drive.

Preferred Qualifications:

  1. SRE experience with ads or recommendation systems.
  2. Experience designing, analyzing, and troubleshooting large-scale distributed systems.

Why Join Us

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.

We're passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws.


