Site Reliability Engineer
4 weeks ago
TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to users worldwide.
Our global offices in Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo foster a collaborative environment where imagination thrives.
Our Mission
We aim to create an inclusive space where employees are valued for their skills, experiences, and unique perspectives.
Our platform connects people from diverse backgrounds, and so does our workplace.
Job Description
We're seeking a Site Reliability Engineer to join our monetization technology team, responsible for building and running large-scale, globally distributed, fault-tolerant ads systems.
The ideal candidate will ensure high availability, scalability, and operability of services, measuring and monitoring availability, latency, and overall service health.
Key Responsibilities:
- Engage in and improve the whole lifecycle of Ads systems — from system design consulting through to launch reviews, deployment, operation, and refinement.
- Ensure the availability of services deployed across multiple data centers globally.
- Deliver tools/software to improve the reliability, scalability, and operability of services.
- Measure and monitor availability, latency, and overall service health.
- Practice sustainable incident response and postmortems.
- Participate in on-call rotations across continents.
Minimum Qualifications:
- Bachelor's degree in Computer Science or similar technical field of study, or equivalent practical experience.
- Programming experience in at least one of the following languages: C, C++, Java, Python, Perl, or Go.
- Expertise in Unix/Linux operating systems and IP networking.
- Experience in problem-solving, application issues, or production operations.
- Experience in automating routine tasks.
- Effective communication skills and a sense of ownership and drive.
Preferred Qualifications:
- Experience in SRE of Ads/recommendation systems.
- Experience designing, analyzing, and troubleshooting large-scale distributed systems.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.
We're passionate about this and hope you are too.
TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws.
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Adobe Full time
Job Title: Site Reliability Engineer
About the Role:
We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud services. You will work closely with our development team to design, deploy, and optimize our cloud services,...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer
Job Summary:
We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our data pipelines.
Key Responsibilities:
- Debugging data pipelines
- Monitoring alerts and troubleshooting...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer
Job Summary:
Diverse Lynx LLC is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
- Design and implement automation scripts using shell,...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States NetApp Full time
Job Summary
As a Site Reliability Engineer at NetApp, you will be responsible for ensuring the stability and security of our open-source systems and platforms. This role requires a strong understanding of software development, operations, and system administration.
Key Responsibilities
- Design and develop technical tools to debug problems in the deployment of...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Akraya Full time
Job Summary:
We are seeking a skilled Site Reliability Engineer to join our team. The ideal candidate will have expertise in system monitoring, infrastructure management, and automation, with a keen interest in enhancing system reliability.
Key Responsibilities:
- System Monitoring and Incident Response: Ensure system health and responsiveness to incidents with...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States HireIO Inc Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at HireIO Inc. As a Site Reliability Engineer, you will be responsible for designing and developing solutions to automate the technical operations of large-scale systems, working closely with teams to improve stability from a Software Development Lifecycle...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States YO HR CONSULTANCY Full time
Job Title: Site Reliability Engineer
Job Type: Full-time
Location: RTP/NC and San Jose CA
Job Description:
Must-Have Skills:
- Strong knowledge of Kubernetes and Linux
- Experience with container orchestration frameworks
- Good understanding of distributed computing and storage
- Proficiency in scripting languages such as Python and Shell
- Knowledge of Jenkins and...
-
Lead Site Reliability Engineer
4 weeks ago
San Jose, California, United States VDart Full time
Job Title: Lead Site Reliability Engineer
Location: San Jose, CA (2 Days Hybrid)
Duration: 6+ months
Job Description:
Experience Desired: 14+ Years.
Responsibilities:
We are seeking a highly skilled and dynamic Site Reliability Engineer to join our team. In this role, you will be responsible for maintaining and improving the reliability, performance,...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States TikTok Full time
About Us
TikTok is a global leader in short-form mobile video, inspiring creativity and bringing joy to users worldwide. Our mission is to empower creators and communities to thrive in a vibrant, inclusive space.
Job Summary
We're seeking a skilled Site Reliability Engineer to join our dynamic team, driving innovation and excellence in our cloud infrastructure....
-
Reliability Engineer
4 weeks ago
San Jose, California, United States NetApp Full time
Job Summary
As a Site Reliability Engineer, you will be responsible for ensuring the stability and security of multiple open-source systems and platforms that are run or operated in our environment.
Key Responsibilities
- Building and maintaining a reliable site environment to meet the development and maintenance requirements of open-source systems and...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States TikTok Full time
About Team Site Reliability Engineering at TikTok
TikTok's mission is to inspire creativity and bring joy. Our platform is built to help imaginations thrive, and our Site Reliability Engineering team plays a crucial role in making this happen.
Responsibilities
- Design and implement software platforms and monitor frameworks for efficient, automated, and...
-
Senior Site Reliability Engineer
4 weeks ago
San Jose, California, United States Hireio, Inc. Full time
Job Overview
We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Hireio, Inc. The ideal candidate will have a strong background in software development, systems engineering, and cloud infrastructure. They will be responsible for designing, implementing, and maintaining large-scale, distributed systems that are highly available,...
-
Site Reliability Engineer
4 weeks ago
San Francisco, California, United States Unreal Gigs Full time
Job Title: Site Reliability Engineer
At Unreal Gigs, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the high availability, scalability, and performance of our complex distributed systems.
Key Responsibilities:
- Design and implement monitoring, logging, and alerting...
-
Site Reliability Engineer SRE
4 weeks ago
San Jose, California, United States Tata Consultancy Services Full time
Key Responsibilities:
As a Site Reliability Engineer SRE at Tata Consultancy Services, you will be responsible for ensuring the reliability and scalability of our cloud infrastructure. This includes administration experience in Tableau, debugging data pipelines, and monitoring alerts using Grafana, Azure Metrics, Kusto Functions, and LogicApps. You will also...
-
Site Reliability Engineer, Compute Platform
4 weeks ago
San Jose, California, United States TikTok Full time
Job Summary
TikTok is seeking a highly skilled Site Reliability Engineer to join our Compute Platform team. As a key member of our team, you will be responsible for ensuring the reliability and performance of our Big Data services and products.
Responsibilities:
- Design and implement proactive measures to prevent service disruptions and ensure high availability...
-
Site Reliability Engineer Graduate
4 weeks ago
San Jose, California, United States ByteDance Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our Applied Machine Learning team at ByteDance. As a key member of our team, you will be responsible for designing and maintaining large-scale systems, ensuring the highest level of availability for our machine learning services, and creating highly automated systems and...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Tata Consultancy Services Full time
Job Title: Site Reliability Engineer
At Tata Consultancy Services, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our data pipelines and cloud infrastructure.
Key Responsibilities:
- Design and implement data pipelines using Tableau and Azure...
-
Site Reliability Performance Engineer
4 weeks ago
San Jose, California, United States Altius Technologies, Inc. Full time
Job Summary
At Altius Technologies, Inc., we are seeking a highly skilled Site Reliability Performance Engineer to join our team. As a key member of our infrastructure team, you will be responsible for creating and supporting automation scripts for infrastructure deployments, validations, and monitoring. Your expertise in Ansible, Python, and monitoring tools...
-
Site Reliability Engineer, AI Platform Training
3 weeks ago
San Jose, California, United States Adobe Full time
Job Title: Site Reliability Engineer, AI Platform Training
Job Summary:
We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and security of our AI Platform.
About the Role:
- Identify and implement methodologies and solutions to...