Site Reliability Engineer, Data Infrastructure
2 weeks ago
We are seeking a highly skilled Site Reliability Engineer to join our Compute Platform team at TikTok. As a key member of our team, you will be responsible for ensuring the reliability and performance of our Big Data services and products.
Responsibilities
- Ensure the reliability of all of TikTok's major data warehouse products, services, and query engines, such as ClickHouse, Spark, Presto, and Doris.
- Uphold Service Level Agreements (SLAs) and ensure that all service level objectives (SLOs) for ByteDance's Data Platform services are met.
- Continuously analyze service performance and reliability patterns to identify potential performance bottlenecks and implement proactive measures to prevent service disruptions.
- Lead efforts to troubleshoot and resolve service incidents and to conduct postmortems, coordinating with cross-functional teams to manage and mitigate service-impacting events.
- Automate infrastructure provisioning, scaling, and management processes to reduce manual interventions and improve service quality.
- Collaborate with product and development teams to integrate reliability and performance considerations into the software lifecycle.
Qualifications
- Bachelor's Degree or above in Computer Science, Engineering, or a related field.
- In-depth understanding of Linux, computer networking, and databases.
- Proficient in common SRE/DevOps open-source toolsets, system monitoring tools, and container orchestration platforms like Kubernetes.
- Experience or familiarity with open-source or commercial technologies such as ClickHouse, Hadoop, Doris, Spark, Presto, and Kubernetes.
- Strong coding skills in at least one scripting or programming language, including but not limited to Python, Shell, Java, and Go.
- Excellent problem-solving skills and the ability to think critically under pressure.
- Strong written and verbal communication skills and a customer-first mindset.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.
We are passionate about this and hope you are too. TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws.
-
Site Reliability Engineer, Data Infrastructure
3 weeks ago
San Jose, California, United States Tik Tok Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Compute Platform team at TikTok. As a key member of our team, you will be responsible for ensuring the reliability and performance of our Big Data services and products. Responsibilities: Design and implement proactive measures to prevent service disruptions and ensure high...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Tik Tok Full time
About the Role: TikTok is seeking a highly skilled Site Reliability Engineer to join our Compute Platform SRE team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our major data warehouse products, services, and query engines. Responsibilities: Ensure the reliability of all TikTok's major data warehouse...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Adobe Systems Inc Full time
Transforming Digital Experiences: At Adobe, we're passionate about empowering people to create beautiful and powerful digital experiences. We're on a mission to hire the very best and create exceptional employee experiences where everyone is respected and has access to equal opportunity. The Opportunity: We...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Splunk Full time
About Splunk: Splunk is a leading provider of cloud-based data analytics and monitoring solutions. Our mission is to make machine data accessible, usable, and valuable to everyone. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our Cloud TechOps team. As a Site Reliability Engineer, you will be responsible for ensuring the...
-
Site Reliability Engineer, Data Platform
1 week ago
San Jose, California, United States Tik Tok Full time
Job Title: Site Reliability Engineer, Data Platform. TikTok is a leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. Our platform is built to help imaginations thrive, and we're looking for a Site Reliability Engineer to join our Data Platform team. Responsibilities: Ensure the reliability of all TikTok's...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Cisco Full time
About the Role: Cisco is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. You will work closely with our development teams to identify and resolve issues, and collaborate with other teams to...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Trianz Full time
About Trianz: Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security. Our Vision: We believe that companies around the world face three challenges in their digital transformation journeys...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Altius Technologies Inc Full time
Job Description: At Altius Technologies Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for creating and supporting automation scripts for infrastructure deployments, validations, and monitoring to improve operational tasks. Key Responsibilities: Design and implement...
-
Cloud Site Reliability Engineer
2 weeks ago
San Jose, California, United States Tik Tok Full time
Job Title: Cloud Site Reliability Engineer. We are seeking a highly skilled Cloud Site Reliability Engineer to join our team at TikTok. As a Cloud Site Reliability Engineer, you will be responsible for building, expanding, and operating ByteDance's global infrastructures, including large-scale systems in public and private clouds, data centers, and content...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly...
-
Site Reliability Engineer
3 hours ago
San Jose, California, United States Trianz Full time
About Trianz: Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security. Our Vision: We believe that companies around the world face three challenges in their digital transformation journeys...
-
Infrastructure Reliability Engineer
1 month ago
San Jose, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we are driven by a vision to inspire global innovation and redefine technological possibilities. Our legacy as problem solvers has empowered us to achieve remarkable feats, including contributions to monumental projects like the moon landing. As a trusted partner to leading organizations worldwide, we enhance...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement automation scripts using shell,...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Altius Technologies, Inc. Full time
Job Title: Site Reliability Engineer. Altius Technologies, Inc. is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that support our business applications. Key Responsibilities: Design and implement automation...
-
Senior Site Reliability Engineer
3 days ago
San Jose, California, United States Tik Tok Full time
Job Title: Senior Site Reliability Engineer. At TikTok, we're committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. As a Senior Site Reliability Engineer, you'll play a critical role in shaping the future of...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement automation scripts using...
-
Infrastructure Reliability Engineer
1 month ago
San Jose, California, United States Western Digital Full time
Job Overview
Company Overview: At Western Digital, we are dedicated to driving global innovation and redefining the limits of technology, transforming what was once deemed impossible into reality. As a company rooted in problem-solving, we empower individuals to achieve remarkable feats through advanced technology. Our innovations have played a pivotal role in...
-
Site Reliability Engineer
4 weeks ago
San Diego, California, United States Apple Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Data Analytics team at Apple. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our data analytics applications and infrastructure. Key Responsibilities: Design, develop, and maintain complex data infrastructure at the...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Tik Tok Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our dynamic team at TikTok. As a pioneer in innovation, our data infrastructure SRE team seamlessly merges software development and infrastructure operations to design, build, and manage large-scale, highly distributed systems. Key Responsibilities: Participate in and enhance the...
-
Senior Site Reliability Engineer
2 weeks ago
San Jose, California, United States Hireio, Inc. Full time
About the Role: Hireio, Inc. is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our data infrastructure team, you will be responsible for designing, building, and managing large-scale, highly distributed systems. Our team is a pioneer in innovation, seamlessly merging software development and infrastructure...