Site Reliability Engineer

1 month ago


Seattle, Washington, United States TikTok Full time
About the Role

This is a Site Reliability Engineer position focused on data pipeline reliability for the Video Platform team in USDS (U.S. Data Security).

Data SREs monitor data and keep production batch and real-time processing jobs running with the highest level of availability, ensuring our users have the freshest, most complete, and most accurate data possible.

Responsibilities
  • Manage day-to-day operations of data services and real-time/batch data pipelines, including Service Level Agreement (SLA) management, pipeline deployment, performance tuning, and troubleshooting
  • Proactively monitor and troubleshoot data pipelines and systems for performance issues, errors, or anomalies
  • Create tools, build alarms and dashboards, and drive internal process improvements and automation to monitor and improve data engineering operations
  • Improve system reliability, efficiency, and velocity by scaling and optimizing both resources and data processing workflows, refactoring code or implementing new solutions where needed
  • Develop and deploy new reliable and scalable data pipelines and infrastructure components as required by business needs
  • Work closely with data engineering and various vertical teams within the Video Architecture platform
Qualifications
  • Minimum Qualifications
    • Bachelor's degree in Computer Science or a related technical field involving software/system engineering, or equivalent working experience
    • Good programming experience with SQL and at least one of the following languages: Java, Python, Go, or Scala
    • Experience in data engineering, with a focus on data systems reliability, scalability, and performance
  • Preferred Qualifications
    • Solid experience with big data technologies (e.g., Hadoop, Spark, Flink, YARN) and databases (SQL, NoSQL)
    • Knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi)
    • Demonstrated independent thinking and troubleshooting skills in large-scale distributed systems
    • Good communication and coordination skills
    • Experience in building data solutions with AWS, Azure, and other cloud services is a plus
About Us

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace.

We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws.

If you need assistance or a reasonable accommodation, please reach out to us.



  • Seattle, Washington, United States Sogeti Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking an experienced Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using Azure or...


  • Seattle, Washington, United States HireIO Inc Full time

    Job Title: Site Reliability Engineer. HireIO Inc is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and performance of our distributed systems. Key Responsibilities: Design and implement scalable and reliable systems. Collaborate with cross-functional...


  • Seattle, Washington, United States Oracle Full time

    About the Role: Oracle is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, develop, and deploy software to improve the availability, scalability, and efficiency of...


  • Seattle, Washington, United States Sogeti Full time

    Site Reliability Engineer. Job Summary: We are seeking an experienced Site Reliability Engineer to join our team. As a key member of our operations team, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and reliable cloud...


  • Seattle, Washington, United States Apple Full time

    Job Title: Site Reliability Engineer. At Apple, we're looking for a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the security, reliability, and scalability of our systems and infrastructure. About the Role: We are seeking a talented and motivated individual to join our dynamic...


  • Seattle, Washington, United States Oracle Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Oracle. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure. You will work closely with our development teams to design, implement, and operate large-scale distributed...


  • Seattle, Washington, United States TikTok Full time

    About TikTok U.S. Data Security: TikTok U.S. Data Security is a subsidiary of TikTok in the U.S., dedicated to protecting user data and ensuring the security of our platform. Responsibilities: We are seeking a highly motivated and experienced Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the...


  • Seattle, Washington, United States Apple Full time

    Job Title: Site Reliability Engineer. At Apple, we're looking for a skilled Site Reliability Engineer to join our Object Storage SRE team. As a Site Reliability Engineer, you'll play a critical role in ensuring the reliability, scalability, and performance of our cloud storage systems. About the Role: We're seeking a seasoned software and systems engineer with a...


  • Seattle, Washington, United States Sogeti Full time

    Site Reliability Engineer. We are seeking an experienced Site Reliability Engineer to join our team at Sogeti. As a key member of our operations team, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using Azure or...


  • Seattle, Washington, United States HireIO Inc Full time

    Job Summary: At HireIO Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and reliability of our Ads systems. This includes designing, analyzing, and troubleshooting large-scale distributed systems, as well as developing tools and...


  • Seattle, Washington, United States Capgemini Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our software systems and infrastructure. Key Responsibilities: Develop, maintain, and configure cloud observability systems (e.g.,...


  • Seattle, Washington, United States Diverse Lynx Full time

    Job Title: Sr. Site Reliability Engineer. Location: Remote. Duration: 12+ months contract. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the availability, reliability, and performance of our applications and services. You will work...


  • Seattle, Washington, United States SingleStore Full time

    Senior Site Reliability Engineer. At SingleStore, we're seeking a seasoned Senior Site Reliability Engineer to drive our Kubernetes product strategy and help shape the future of our managed service. Key Responsibilities: Design and build elastic Kubernetes clusters across on-prem, AWS, Azure, and Google Cloud environments. Develop and maintain production container...


  • Seattle, Washington, United States Apple Full time

    Site Reliability Engineering Manager. At Apple, we're looking for a skilled Site Reliability Engineering Manager to join our team. As a Site Reliability Engineering Manager, you will be responsible for leading a team that provides the platform for mission-critical cloud systems to maintain constant uptime, scale seamlessly, and allow for new applications and...


  • Seattle, Washington, United States TikTok Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Data Platform Team at TikTok. As a key member of our team, you will be responsible for designing, building, and operating large-scale, massively distributed services and infrastructures. Key Responsibilities: Design and implement reliable, scalable, and robust big data systems...


  • Seattle, Washington, United States Apple Full time

    Job Summary: Apple is seeking a highly skilled and motivated Security Site Reliability Engineer (SRE) to join our dynamic and growing team. Key Responsibilities: Ensure the security, reliability, and scalability of our systems and infrastructure. Collaborate with cross-functional teams to design, implement, and maintain security measures, incident response...


  • Seattle, Washington, United States Sogeti Full time

    Job Title: Lead Site Reliability Engineer. Job Summary: We are seeking a highly skilled Lead Site Reliability Engineer to join our team at Sogeti. The successful candidate will be responsible for developing and maintaining cloud observability systems, building monitoring and alerting systems, and optimizing system performance. Key Responsibilities: ...


  • Seattle, Washington, United States Apple Full time

    Job Description: We are seeking a highly skilled Security Site Reliability Engineer (SRE) to join our dynamic and growing team at Apple. As a Security SRE, you will play a critical role in ensuring the security, reliability, and scalability of our systems and infrastructure. You will collaborate with cross-functional teams to design, implement, and maintain...


  • Seattle, Washington, United States Saxon Global Full time

    Job Summary: Starbucks is seeking a highly skilled Senior Site Reliability Engineer to join their Data Platform Services team. This team is responsible for maintaining and improving the data platform that many Starbucks services rely on. Key Responsibilities: Ensure the health and stability of production systems. Develop and implement monitoring dashboards and...


  • Seattle, Washington, United States TikTok Full time

    About the Role: We are seeking an experienced Site Reliability Engineer to join our USDS Video Platform team at TikTok. As a key member of our team, you will be responsible for ensuring the reliability and scalability of our video system, which serves billions of users worldwide. Responsibilities: Design and implement scalable and reliable systems to support our...