Site Reliability Engineer, Cloud Native Platform

3 weeks ago


San Jose, California, United States TikTok Full time
Job Title: Site Reliability Engineer, Cloud Native Platform

TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to users worldwide. Our mission is to connect people across the globe, and our infrastructure team is seeking experienced site reliability engineers to build a globally distributed edge platform for provisioning and deploying edge services.

Key Responsibilities:
  • Deploy and administer Kubernetes clusters on-premises and in the cloud (AWS, GCP, etc.).
  • Collaborate with software engineers to build an enterprise-level edge computing platform (PaaS) with cutting-edge Cloud Native Computing Foundation (CNCF) technologies.
  • Design, develop, automate, and continuously improve platform services and pipelines, such as monitoring, alerting, logging, tracing, and CI/CD.
  • Improve Kubernetes system efficiency and debug issues related to networking, storage, scheduling, etc.
  • Collaborate with open-source communities to advance Kubernetes and edge computing technologies.
Qualifications:
  • Master's degree (or Bachelor's degree with 3+ years of experience) in Computer Engineering, Computer Science, or related fields.
  • 1+ years of experience in Kubernetes administration.
  • 3+ years of experience in Unix/Linux systems from kernel to shell and beyond.
  • Experience with Kubernetes CNI deployment and troubleshooting, including (but not limited to) the following CNIs: Cilium, Kube-Router, Calico, Flannel.
  • Experience in designing, analyzing, and building automation tools for large-scale, complex systems.
Inclusivity Commitment:

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. We are passionate about this and hope you are too.

The base salary range for this position in the selected city is $ annually. Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location.

Our company benefits are designed to convey company culture and values, to create an efficient and inspiring work environment, and to support our employees in giving their best in both work and life.


