Senior Cloud Reliability Engineer

4 hours ago


San Jose, California, United States TikTok Full time
Job Title: Senior Site Reliability Engineer, Global E-Commerce

TikTok is a leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. As a Senior Site Reliability Engineer on our Global E-Commerce team, you will play a critical role in ensuring the reliability and scalability of our e-commerce platform.

Responsibilities:
  • Be part of our global SRE on-call rotation and be responsible for Tier-1 online incident response and DevOps support.
  • Be responsible for service levels of mission-critical, revenue-generating e-commerce platforms as well as all supporting infrastructure and services.
  • Define service level indicators and data-driven objectives, and develop DevOps/SRE standards, processes, and methodologies to uphold and improve the uptime, latency, and system health of a core global e-commerce production platform.
  • Collaborate cross-team with engineering and product to ensure that key stability and maintainability requirements, such as capacity planning and launch reviews, are performed to enable transparent service delivery to customers.
  • Design strategies for risk detection and mitigation, disaster recovery & simulation, release management, cost optimization, engineering quality, etc.
  • Build automation geared toward infrastructure-as-code, scalability, and service resiliency.
  • Implement best practices for incident management and post-mortems while participating in the on-call rotation.
Qualifications:
  • Bachelor's or higher degree in Computer Science, similar technical field of study, or equivalent practical experience.
  • 5+ years of experience developing, provisioning, or maintaining production-grade, large-scale distributed systems.
  • High level of proficiency in Linux OS internals, networking, microservices, databases, caches, etc. in cloud-native environments.
  • Demonstrable familiarity with programming or scripting languages (Go, Python, Bash, C++, etc.).
  • Demonstrable experience in the development and implementation of DevOps and SRE methodologies.
Why Join Us:

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at.



  • San Jose, California, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that serves thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, our mission is to make the cloud a safe and secure place for businesses to operate. As the operator of the world's largest security cloud, we accelerate digital transformation for...


  • San Jose, California, United States Zscaler Full time

    We are seeking an experienced Cloud Reliability Engineer to join our CRE team at Zscaler. Key Responsibilities: Troubleshooting and identifying the root cause of cloud reliability issues. Developing solutions and observability tools focusing on early detection and prevention for cloud and customer...

  • Senior Cloud Engineer

    3 weeks ago


    San Francisco, California, United States Google Cloud - Minnesota Full time

    About the Role: We are seeking a highly skilled Senior Cloud Engineer to join our team at Google Cloud - Minnesota. As a key member of our Technical Infrastructure team, you will play a critical role in designing, building, and operating large-scale, distributed systems that power our cloud services. Responsibilities: Service Lifecycle Management: Engage in the...


  • San Francisco, California, United States AutoRABIT Holding, Inc. Full time

    About AutoRABIT Holding, Inc.: AutoRABIT Holding, Inc. is a leading provider of a Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare. Our solutions enable developers to automate their daily tasks to be more productive and increase the release velocity for their development team, while meeting...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    About AutoRABIT Holding Inc.: AutoRABIT Holding Inc. is a leading provider of a Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare. Our solutions enable developers to automate their daily tasks, increasing productivity and release velocity while meeting stringent security, compliance, and privacy...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    About the Role: AutoRABIT is seeking a highly skilled Senior Site Reliability/DevSecOps Engineer to join our team. As a key member of our cloud operations team, you will be responsible for designing, implementing, and maintaining scalable, resilient, and secure infrastructure using AWS. Key Responsibilities: Develop and manage infrastructure as code using...


  • San Jose, California, United States TikTok Full time

    Job Title: Cloud Site Reliability Engineer. We are seeking a highly skilled Cloud Site Reliability Engineer to join our team at TikTok. As a Cloud Site Reliability Engineer, you will be responsible for building, expanding, and operating ByteDance's global infrastructure, including large-scale systems in public and private clouds, data centers, and content...


  • San Francisco, California, United States GlossGenius Full time

    About GlossGenius: GlossGenius is a pioneering company that empowers entrepreneurs to succeed by providing a comprehensive ecosystem of business management tools. Our platform enables small business owners to focus on their core activities, rather than administrative tasks, by offering a range of innovative solutions including booking and scheduling,...


  • San Francisco, California, United States AutoRABIT Holding, Inc. Full time

    About AutoRABIT Holding, Inc.: AutoRABIT Holding, Inc. is a leading provider of cloud-based DevSecOps solutions for regulated industries. Our mission is to empower developers to automate their daily tasks, increase productivity, and meet stringent security, compliance, and privacy regulations. About the Role: We are seeking a highly experienced Senior Cloud...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer. At Adobe, we're looking for a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services. Key Responsibilities: Design, develop, and deploy cloud-based services and...


  • San Jose, California, United States F5 Full time

    About F5: F5 is a leading provider of cloud and security solutions, empowering organizations to create, secure, and run applications that enhance the digital experience. Job Summary: We are seeking an exceptional Senior Site Reliability Engineer to join our SRE team for the F5 Distributed Cloud Product. As a key member of our team, you will play a pivotal role in...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    About AutoRABIT: AutoRABIT is a leading provider of a Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare. Our solutions enable developers to automate their daily tasks, increasing productivity and release velocity while meeting stringent security, compliance, and privacy regulations. About the Role: We...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    About AutoRABIT: AutoRABIT is a leading provider of a Salesforce DevSecOps platform for regulated industries, including financial institutions, insurance, and healthcare. Our solutions enable developers to automate daily tasks, increasing productivity and release velocity while meeting stringent security, compliance, and privacy regulations. About the Role: We are...


  • San Jose, California, United States F5 Full time

    Job Summary: F5 is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will play a pivotal role in ensuring the reliability and scalability of our distributed cloud product. Key Responsibilities: Design and implement automation solutions to reduce toil and improve operational efficiency. Participate in...


  • San Francisco, California, United States Crusoe Full time

    About Crusoe Energy: Crusoe Energy is a pioneering company that's revolutionizing the way we approach energy consumption. Our mission is to unlock value in stranded energy resources through the power of computation. Job Summary: We're seeking a highly skilled Senior/Staff Site Reliability Engineer to join our team. As a key member of our cloud infrastructure...


  • San Jose, California, United States Cisco Full time

    About the Role: Cisco is seeking a highly skilled Site Reliability Engineer to join our Cloud Security team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based security solutions. Key Responsibilities: Design and implement scalable and highly available cloud-based security...


  • San Francisco, California, United States Springshot Full time

    About the Role: We're seeking a seasoned Senior Site Reliability Engineer to join our team at Springshot. As a key member of our crew, you'll play a vital role in maintaining the reliability and performance of our SaaS platform. With a strong portfolio of global aviation customers and a passion for innovation, we're continuously pushing the boundaries of what's...


  • San Jose, California, United States F5 Full time

    About the Role: We are seeking an exceptional Senior Site Reliability Engineer to join our SRE team for the groundbreaking F5 Distributed Cloud Product. As a key member of our team, you will play a pivotal role in ensuring the reliability, scalability, and security of our cloud-based infrastructure. Key Responsibilities: Design and implement automation solutions...


  • San Jose, California, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that protects thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, Zscaler's mission is to make the cloud a safe place to do business and provide a seamless experience for enterprise users. As the operator of the world's largest security cloud, Zscaler...


  • San Jose, California, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that provides a comprehensive security platform to protect enterprises from cyber threats. With a mission to make the cloud a safe place to do business, Zscaler has built a reputation as a trusted partner for organizations around the world. Job Summary: We are seeking an experienced Cloud Reliability...