Site Reliability Engineer
4 weeks ago
About the Role
TikTok is seeking an experienced Site Reliability Engineer to join our US Data Security team. As a key member of our Video Platform team, you will be responsible for ensuring the reliability and performance of our video system, which serves billions of users worldwide.
Key Responsibilities
- Oversee the overall reliability of TikTok's video system, including video publishing and distribution.
- Perform lifecycle management of production systems, including change management, service deployment, operations, and emergency response.
- Monitor the system and respond to incidents to maintain the system's service level agreement (SLA); review and follow up on all production incidents.
- Perform capacity management of compute, storage, and network bandwidth resources to ensure system stability and reduce infrastructure costs.
- Provide strong support during big events to ensure the system can handle a large volume of Internet traffic.
- Build tools, automations, visualizations, and monitors to facilitate the operation and optimization of the global infrastructure.
Requirements
- Bachelor's degree in Computer Science or a related technical background involving software/system engineering, or equivalent working experience.
- 2+ years of SRE or DevOps experience in large-scale online services.
- Programming experience with at least one of the following languages: C, C++, Java, Python, C#, or Go.
Preferred Qualifications
- Extensive knowledge of networking, operating systems, database systems, and container technology.
- Good understanding of every aspect of microservice architecture, and hands-on experience troubleshooting large-scale distributed systems.
- Hands-on experience in common open-source systems such as Linux, MySQL, MongoDB, Redis, and ELK.
- Experience building solutions with AWS, Google Cloud, Azure, or other cloud services is a plus.
- Passionate and self-motivated, with good teamwork skills.
About TikTok
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.
We are passionate about this and hope you are too. If you are passionate about ensuring software reliability, love problem-solving, and are prepared for exciting challenges, we would like you to join our team.
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States Sogeti Full time. Job Title: Site Reliability Engineer. About the Role: We are seeking an experienced Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using Azure or...
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States Oracle Full time. About the Role: Oracle is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, develop, and deploy software to improve the availability, scalability, and efficiency of...
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States Oracle Full time. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Oracle. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure. You will work closely with our development teams to design, implement, and operate large-scale distributed...
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States HireIO Inc Full time. Job Summary: At HireIO Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and reliability of our Ads systems. This includes designing, analyzing, and troubleshooting large-scale distributed systems, as well as developing tools and...
-
Senior Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States Diverse Lynx Full time. Job Title: Sr. Site Reliability Engineer. Location: Remote. Duration: 12+ Months contract. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the availability, reliability, and performance of our applications and services. You will work...
-
Site Reliability Engineer
1 month ago
Seattle, Washington, United States Tik Tok Full time. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Data Platform Team at TikTok. As a key member of our team, you will be responsible for designing, building, and operating large-scale, massively distributed services and infrastructures. Key Responsibilities: Design and implement reliable, scalable, and robust big data systems...
-
Site Reliability Engineer
2 months ago
Seattle, Washington, United States Tik Tok Full time. About the Role: This is a Site Reliability Engineer position, focusing on the data pipeline reliability for the Video Platform team in USDS. Data SREs monitor data and keep production batch and real-time processing jobs up and running with the highest level of availability, ensuring our users have the freshest, complete, and correct data...
-
Site Reliability Engineer
1 month ago
Seattle, Washington, United States Hireio, Inc. Full time. Job Overview: Hireio, Inc. is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our Ads systems team, you will be responsible for ensuring the reliability, scalability, and operability of our services. Key Responsibilities: Design and implement scalable and reliable systems architecture. Collaborate with cross-functional teams...
-
Site Reliability Engineer III
4 weeks ago
Seattle, Washington, United States F5 Networks Full time. Job Summary: F5 Networks is seeking a highly skilled Site Reliability Engineer III to join our team. As a Site Reliability Engineer III, you will be responsible for ensuring the reliability, availability, and scalability of critical systems and SaaS platforms. Key Responsibilities: Apply modern engineering principles and practices to operational functions and...
-
Site Reliability Engineering Lead
4 weeks ago
Seattle, Washington, United States DAT Freight Solutions Full time. About DAT Freight Solutions: DAT Freight Solutions is a leading provider of transportation management software and services. We are seeking a highly skilled Site Reliability Engineering Lead to join our team. The successful candidate will be responsible for leading major technical initiatives and mentoring engineers to enhance their skills. They will work...
-
Site Reliability Engineer Manager, Foundation
4 weeks ago
Seattle, Washington, United States Qualtrics Full time. We are looking for a Site Reliability Engineer Manager to lead our Gov1 environment in the Foundation Product Unit. This person will be responsible for managing a team of US-based Support Engineers who will support Gov1 activities for non-US teams in the Foundation org. The ideal candidate will have experience in site reliability engineering, team management,...
-
Site Reliability Engineer Manager, Foundation
4 weeks ago
Seattle, Washington, United States Qualtrics Full time. About the Role: We are seeking a highly skilled Site Reliability Engineer Manager to lead our SRE team in the Foundation Product Unit. As a key member of our team, you will be responsible for ensuring the reliability and scalability of our Gov1 environment. As a Site Reliability Engineer Manager, you will be responsible for leading a team of SREs, collaborating...
-
Senior Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States F5 Networks Full time. About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at F5 Networks. As a key member of our engineering team, you will be responsible for ensuring the reliability and performance of our systems. Key Responsibilities: Design and implement scalable and efficient system architectures. Develop and maintain monitoring and...
-
Site Reliability Engineering Lead
4 weeks ago
Seattle, Washington, United States DAT Solutions Full time. About DAT Solutions: We are a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year, by deploying a suite of software solutions to millions of customers every day - customers who depend on us for the most relevant data...
-
Site Reliability Engineering Manager
4 weeks ago
Seattle, Washington, United States Apple Full time. Role Overview: As a Site Reliability Engineering Manager at Apple, you will be responsible for leading a team that provides the platform for mission-critical cloud systems to maintain constant uptime, scale seamlessly, and allow for new applications and services to flourish. Key Responsibilities: Establish SRE practices for a private cloud service to accelerate...
-
Site Reliability Engineering Lead
1 month ago
Seattle, Washington, United States DAT Solutions Full time. About DAT Solutions: As a leading employer of choice, DAT Solutions is a next-generation SaaS technology company that has been at the forefront of innovation in transportation supply chain logistics for decades. We continue to transform the industry by deploying a suite of software solutions to millions of customers every day, providing them with the most...
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States ApTask Full time. The Client is a leading global IT services and consulting company, providing a wide range of services to clients in various industries, including banking, financial services, retail, manufacturing, healthcare, and more. The company places a strong emphasis on employee training and development, and is known for its commitment to innovation and investment in...
-
Site Reliability Engineer
4 weeks ago
Seattle, Washington, United States Oracle Full time. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Oracle. As a key member of our cloud infrastructure team, you will be responsible for designing, building, and maintaining large-scale distributed systems that provide a seamless experience for our customers. Key Responsibilities: Design and implement sophisticated...
-
Site Reliability Engineer III
4 weeks ago
Seattle, Washington, United States F5 Full time. Job Summary: At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation. Everything we do...
-
Site Reliability Engineering Leader
4 weeks ago
Seattle, Washington, United States Apple Full time. Job Summary: The Apple Services Engineering team is seeking a highly skilled Site Reliability Engineering Leader to lead our security-focused SRE team. As a Site Reliability Engineering Leader, you will be responsible for designing, engineering, and running systems and infrastructure that ensure the highest quality Apple Services experience for our customers....