Senior Site Reliability Engineer

1 day ago


San Jose, United States TikTok Full time
Responsibilities

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.
At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.

TikTok is one of the fastest-growing apps in the world, and we're seeking Site Reliability Engineers (SREs) to join our monetization technology team. The monetization technology team builds and runs large-scale, globally distributed, fault-tolerant ads systems. SREs keep these systems up and running with the highest level of availability, ensuring our users have the best experience possible.

What You'll Do

1. Engage in and improve the whole lifecycle of Ads systems - from system design consulting through to launch reviews, deployment, operation and refinement.
2. Build and maintain the availability of services deployed across multiple data centers globally.
3. Deliver tools/software to improve the reliability, scalability and operability of services.
4. Measure and monitor availability, latency and overall service health.
5. Practice sustainable incident response and postmortems.
6. Participate in on-call rotations across continents.

Qualifications

Minimum Qualifications

1. Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience.
2. 5+ years of programming experience in at least one of the following languages: C, C++, Java, Python, Perl, or Go.
3. Expertise in Unix/Linux operating systems and IP networking.
4. Experience in problem solving, debugging application issues, or production operations.
5. Experience in automating routine tasks.
6. Effective communication skills and a sense of ownership and drive.

Preferred Qualifications

1. Experience in SRE of Ads/recommendation systems.
2. Experience designing, analyzing and troubleshooting large-scale distributed systems.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at tt.accommodations@tiktok.com.
