Site Reliability Engineer
2 weeks ago
TikTok is a leading destination for short-form mobile video, and we're committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace.
We're passionate about inspiring creativity and bringing joy, and we're looking for a Site Reliability Engineer to join our team. As a Site Reliability Engineer, you'll play a critical role in ensuring the reliability and scalability of our cloud infrastructure.
Responsibilities
- Work with infrastructure, product, and platform engineering teams on operating and deploying software platforms, capacity planning, and launch reviews throughout the whole lifecycle of services.
- Maintain sustainable reliability and scalability of software systems by improving automation to measure and monitor availability, latency, and overall system health.
- Consistently evolve systems by pushing for changes that improve system reliability and release velocity.
- Practice sustainable incident response and postmortems.
Qualifications
- BS degree in Computer Science, Computer Engineering, Electrical Engineering, or relevant majors with 2+ years of working experience.
- Programming, debugging, and optimization skills in general-purpose programming languages, including Go, Python, C/C++, Rust, or Java.
- Experience in working with Unix/Linux systems from kernel to shell and beyond.
- Experience in analyzing and debugging production issues at scale.
- Experience and understanding of infrastructure-as-code concepts, approaches, methods, and tooling.
- Hands-on experience with large cloud providers, such as AWS, Azure, or GCP.
- Experience coding infrastructure with tools such as Kubernetes, Terraform, Ansible, Puppet, Chef, or SaltStack.
- Experience securing infrastructure in a distributed system with automation, or practicing chaos engineering.
- Experience with two or more of the following areas: web application development, Unix/Linux environments, distributed and parallel systems, developing large software systems, mobile application development, and/or security software development.
-
Site Reliability Engineer
2 weeks ago
Washington, Washington, D.C., United States MetroStar Corporation Full time. Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at MetroStar Corporation. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our systems. Key Responsibilities: Monitor and analyze platform and containerized applications to...
-
Site Reliability Engineer
2 weeks ago
Washington, Washington, D.C., United States MetroStar Systems Full time. Job Title: Site Reliability Engineer. At MetroStar Systems, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Monitor and analyze system performance to identify areas...
-
Site Reliability Engineer
5 days ago
Washington, Washington, D.C., United States Veterans Enterprise Technology Solutions Full time. Job Title: Site Reliability Engineer. Overview: Veterans Enterprise Technology Solutions is seeking a highly skilled Site Reliability Engineer to join our team. This role will be responsible for ensuring the reliability and performance of our cloud-based infrastructure. The ideal candidate will have a strong understanding of SRE principles and experience with...
-
Site Reliability Engineer
1 week ago
Washington, Washington, D.C., United States Varada Consulting, LLC Full time. Job Title: Site Reliability Engineer. Varada Consulting, LLC is seeking a highly skilled and experienced Site Reliability Engineer to join our team. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications through automation, monitoring, and infrastructure improvements. Key...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States MetroStar Systems Full time. Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at MetroStar Systems. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our systems. Key Responsibilities: Monitor and analyze platform and containerized applications to identify...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States Alldus Full time. Site Reliability Engineer. Alldus is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems. Key Responsibilities: Perform root cause analysis to identify and resolve system or application issues in a timely and...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States TikTok Full time. About the Role: TikTok is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our software systems. Responsibilities: Work with infrastructure, product, and platform engineering teams to operate and deploy software platforms, capacity planning,...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States CloudFit Software Full time. Job Title: Site Reliability Engineer. CloudFit Software is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the quality, performance, and reliability of our CloudFit Managed Applications and Services systems. Key Responsibilities: Collaborate with cross-functional teams...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States Cinder LLC Full time. About Cinder LLC: Cinder LLC provides a cutting-edge investigation platform to protect the internet. Our software helps Trust and Safety teams at the world's most influential companies innovate and adapt quickly to emerging threats. Job Title: Site Reliability Engineer. We're seeking an experienced Site Reliability Engineer to lead the development and deployment...
-
Site Reliability Engineer II
2 weeks ago
Washington, Washington, D.C., United States Microsoft Full time. Job Title: Site Reliability Engineer II. Microsoft is seeking a highly skilled Site Reliability Engineer II to join our team. As a Site Reliability Engineer II, you will be responsible for designing, developing, and delivering software engineering solutions to serve and protect O365 government clouds. Key Responsibilities: Design, develop, and deploy software...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States Palantir Technologies Full time. Job Summary: Palantir Technologies is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications. Key Responsibilities: Collaborate with cross-functional teams...
-
Site Reliability Engineer
2 weeks ago
Washington, Washington, D.C., United States Palantir Technologies Full time. About the Role: We're looking for a skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Maintain the availability of cloud and physical Linux servers that power...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States MetroStar Systems Full time. Transforming Government Services with Reliability and Performance. As a Site Reliability Engineer at MetroStar Systems, you will play a pivotal role in driving improvements in observability, performance, and reliability across high-level government platforms. Your expertise will be instrumental in making a lasting impact. Key Responsibilities: Monitor and...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States MetroStar Corporation Full time. MetroStar Corporation is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our organization, you will play a critical role in driving improvements in observability, performance, and reliability across our systems. Key Responsibilities: Monitor and analyze platform and containerized applications to identify...
-
Site Reliability Engineer
4 weeks ago
Washington, Washington, D.C., United States MetroStar Systems Full time. Transforming Government Services with Reliability and Performance. As a Site Reliability Engineer at MetroStar Systems, you will play a pivotal role in driving improvements in observability, performance, and reliability across high-level government platforms. Your expertise will be instrumental in making a lasting impact. Key Responsibilities: Monitor and...
-
Director of Site Reliability Engineering
2 weeks ago
Washington, Washington, D.C., United States DataRobot Full time. Job Title: Director of Site Reliability Engineering. Job Summary: DataRobot is seeking a highly skilled and experienced Director of Site Reliability Engineering to lead our SRE team. As a key member of our engineering organization, you will be responsible for ensuring the reliability, scalability, and performance of our platform. Key Responsibilities: ...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Oracle Full time. Job Description: Oracle Health Applications & Infrastructure (OHAI) is seeking a highly skilled Site Reliability Engineer to join its OHAI Platform & Production Engineering organization. This is a unique opportunity to work on a net new line of business, constructed with an entrepreneurial spirit that promotes an energetic and creative environment. As a Site...
-
Principal Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States Kansas Action for Children Full time. Transforming System Reliability. We're seeking a seasoned Principal Site Reliability Engineer to spearhead the improvement of system reliability and resilience at T-Mobile USA, Inc. in Overland Park, Kansas, United States. About the Role: As a key member of our team, you'll apply your expertise to minimize manual effort and prevent operational incidents. Your...
-
Site Reliability Engineer
3 weeks ago
Washington, Washington, D.C., United States Palantir Technologies Full time. About the Role: We're seeking a skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in building, operating, and maintaining high-performance, scalable, and reliable services for our production infrastructure. Key Responsibilities: Maintain the availability of cloud and physical...
-
Site Reliability Engineer
2 days ago
Washington, Washington, D.C., United States Palantir Technologies Full time. About the Role: We are seeking a skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications. Key Responsibilities: Collaborate with cross-functional teams to design, implement, and maintain...