Site Reliability Engineer

3 weeks ago


San Jose, California, United States TikTok Full time
About Us

TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to users worldwide. Our global offices foster a collaborative environment where diverse perspectives and ideas thrive.

Job Description

We are seeking a talented Site Reliability Engineer to join our e-commerce team. As a key member of our global SRE team, you will be responsible for ensuring the reliability and scalability of our e-commerce platform.

Responsibilities:
  • Be part of our global SRE on-call rotation and respond to Tier-1 online incidents.
  • Ensure that our mission-critical e-commerce platform and supporting infrastructure meet their service levels.
  • Define service-level indicators and develop DevOps/SRE standards to improve uptime, latency, and system health.
  • Collaborate with engineering and product teams to ensure key stability and maintainability requirements are met.
  • Design strategies for risk detection and mitigation, disaster recovery, and release management.
  • Implement automation for infrastructure-as-code, scalability, and service resiliency.
  • Implement best practices around incident management and post-mortems.
Requirements:
  • Bachelor's or higher degree in Computer Science or a related field.
  • 5+ years of experience developing, provisioning, or maintaining production-grade, large-scale distributed systems.
  • High level of proficiency in Linux OS internals, networking, microservices, databases, and caches in cloud-native environments.
  • Demonstrable familiarity with programming or scripting languages (e.g., Go, Python, Bash, or C++).
  • Demonstrable experience in DevOps and SRE methodologies.
Preferred Qualifications:
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Systematic problem-solving approach and effective communication skills.

TikTok is committed to creating an inclusive environment where employees are valued for their skills, experiences, and unique perspectives. We celebrate our diverse voices and strive to reflect the many communities we reach. If you need assistance or a reasonable accommodation, please reach out to us.



  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud services. You will work closely with our development team to design, deploy, and optimize our cloud services,...


  • San Jose, California, United States Diverse Lynx Full time

    Job Title: Site Reliability Engineer. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our data pipelines. Key Responsibilities: debugging data pipelines; monitoring alerts and troubleshooting...


  • San Jose, California, United States X (formerly Twitter) Full time

    About the Role: We're seeking a highly skilled Site Reliability Engineer to join our Command Center Team at X (formerly Twitter). As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our services, working closely with cross-functional teams to drive significant impact across all areas of the...


  • San Jose, California, United States Syntricate Technologies Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly...


  • San Jose, California, United States Diverse Lynx Full time

    Job Title: Site Reliability Engineer. Job Summary: Diverse Lynx LLC is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement automation scripts using shell,...


  • San Jose, California, United States NetApp Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at NetApp. As a Site Reliability Engineer, you will be responsible for managing, supporting, and maintaining a reliable environment for our site to ensure the stability and security of multiple open-source systems/platforms. Key Responsibilities: Building and supporting a...


  • San Jose, California, United States Syntricate Technologies Full time

    Job Title: Site Reliability Engineer. At Syntricate Technologies, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly...


  • San Jose, California, United States Adobe Full time

    About the Role: We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based services. Responsibilities: Design and implement scalable and reliable systems to support our cloud-based services. Collaborate...


  • San Jose, California, United States ApTask Full time

    About ApTask: ApTask is a leading global provider of workforce solutions and talent acquisition services, dedicated to shaping the future of work. As an African American-owned and Veteran-certified company, ApTask offers a comprehensive suite of services, including staffing and recruitment solutions, managed services, IT consulting, and project management. With...


  • San Jose, California, United States Adobe Full time

    About the Role: We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a key member of our Cloud Engineering team, you will play a critical role in designing, deploying, and optimizing our cloud services. Key Responsibilities: Develop software and tools to improve the reliability and performance of our cloud services. Collaborate...


  • San Jose, California, United States VDart Full time

    Job Title: Lead Site Reliability Engineer. Job Summary: VDart is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our software systems. Key Responsibilities: Design and implement automation scripts to improve operational...


  • San Jose, California, United States Trianz Full time

    About Trianz: Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security. Our Vision: We believe that companies around the world face three challenges in their digital transformation journeys...


  • San Jose, California, United States TikTok Full time

    Job Title: Site Reliability Engineer, Cloud Native Platform. TikTok is seeking a highly skilled Site Reliability Engineer to join our Cloud Native Platform team. As a key member of our infrastructure team, you will be responsible for building and maintaining our globally distributed edge platform. Key Responsibilities: Deploy and administer Kubernetes clusters...


  • San Jose, California, United States NetApp Full time

    Job Summary: As a Site Reliability Engineer at NetApp, you will be responsible for ensuring the stability and security of our open-source systems and platforms. This role requires a strong understanding of software development, operations, and system administration. Key Responsibilities: Design and develop technical tools to debug problems in the deployment of...


  • San Jose, California, United States Akraya Full time

    Job Summary: We are seeking a skilled Site Reliability Engineer to join our team. The ideal candidate will have expertise in system monitoring, infrastructure management, and automation, with a keen interest in enhancing system reliability. Key Responsibilities: System Monitoring and Incident Response: Ensure system health and responsiveness to incidents with...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer. At Adobe, we're looking for a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services. Key Responsibilities: Design, develop, and deploy cloud-based services and...


  • San Jose, California, United States HireIO Inc Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at HireIO Inc. As a Site Reliability Engineer, you will be responsible for designing and developing solutions to automate the technical operations of large-scale systems, working closely with teams to improve stability from a Software Development Lifecycle...


  • San Jose, California, United States YO HR CONSULTANCY Full time

    Job Title: Site Reliability Engineer. Job Type: Full-time. Location: RTP/NC and San Jose CA. Job Description: Must-Have Skills: strong knowledge of Kubernetes and Linux; experience with container orchestration frameworks; good understanding of distributed computing and storage; proficiency in scripting languages such as Python and Shell; knowledge of Jenkins and...


  • San Jose, California, United States F5 Full time

    Job Summary: F5 is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will play a pivotal role in ensuring the reliability and scalability of our distributed cloud product. Key Responsibilities: Design and implement automation solutions to reduce toil and improve operational efficiency. Participate in...


  • San Jose, California, United States VDart Full time

    Job Title: Lead Site Reliability Engineer Location: San Jose, CA (2 Days Hybrid) Duration: 6+ months Job Description: Experience Desired: 14+ Years. Responsibilities: We are seeking a highly skilled and dynamic Site Reliability Engineer to join our team. In this role, you will be responsible for maintaining and improving the reliability, performance,...