Site Reliability Engineer
2 weeks ago
TikTok is seeking a highly skilled Site Reliability Engineer to join our Compute Platform SRE team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our major data warehouse products, services, and query engines.
Responsibilities
- Ensure the reliability of all TikTok's major data warehouse products, services, and query engines, such as ClickHouse, Spark, Presto, Doris, etc.
- Uphold Service Level Agreements (SLAs): Ensure that all service level objectives and agreements from ByteDance's Data Platform services are met. Respond promptly to any system outages or issues.
- Continuous Performance Optimization: Analyze service performance and reliability patterns to identify potential performance bottlenecks. Implement proactive measures to prevent service disruptions. Work with development teams to optimize application performance, ensuring that services run efficiently and that resources are utilized effectively.
- Incident Management: Lead efforts to troubleshoot and resolve service incidents and to conduct postmortems. Coordinate with cross-functional teams to manage and mitigate service-impacting events.
- Infrastructure Automation: Automate infrastructure provisioning, scaling, and management processes to reduce manual interventions and improve service quality.
Qualifications
- Bachelor's degree or above in Computer Science, Engineering, or a related field.
- Passionate about computer science and Internet technology.
- In-depth understanding of Linux, computer networking, and databases.
- Proficient in common SRE/DevOps open-source toolsets, system monitoring tools, and container orchestration platforms like Kubernetes.
- Experience or familiarity with open source or commercial technologies such as ClickHouse, Hadoop, Doris, Spark, Presto, and Kubernetes.
- Strong coding skills in at least one scripting or programming language, such as Python, Shell, Java, or Go.
- Excellent problem-solving skills and the ability to think critically under pressure.
- Strong written and verbal communication skills and a customer-first mindset.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. We are passionate about this and hope you are too.
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
- Design and implement automation scripts using...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
- Design and implement automation scripts using shell,...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
- Design, implement, and maintain scalable and highly...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Altius Technologies, Inc. Full time
Job Title: Site Reliability Engineer
Altius Technologies, Inc. is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that support our business applications.
Key Responsibilities:
- Design and implement automation...
-
Site Reliability Engineer
3 days ago
San Jose, California, United States ApTask Full time
About ApTask:
ApTask is a leading global provider of workforce solutions and talent acquisition services, dedicated to shaping the future of work. As an African American-owned and Veteran-certified company, ApTask offers a comprehensive suite of services, including staffing and recruitment solutions, managed services, IT consulting, and project management. With...
-
Site Reliability Engineer
3 days ago
San Jose, California, United States Adobe Full time
About the Role
We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a key member of our Cloud Engineering team, you will play a critical role in designing, deploying, and optimizing our cloud services.
Key Responsibilities
- Develop software and tools to improve the reliability and performance of our cloud services
- Collaborate...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Cisco Full time
About the Role
Cisco is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. You will work closely with our development teams to identify and resolve issues, and collaborate with other teams to...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Splunk Full time
About Splunk
Splunk is a leading provider of cloud-based data analytics and monitoring solutions. Our mission is to make machine data accessible, usable, and valuable to everyone.
Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our Cloud TechOps team. As a Site Reliability Engineer, you will be responsible for ensuring the...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Adobe Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services.
Responsibilities
- Ensure the highest level of uptime and Quality of Service (QoS) to Adobe's customers through...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Adobe Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services.
Key Responsibilities
- Ensure the highest level of uptime and Quality of Service (QoS) to Adobe's customers through...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Trianz Full time
About Trianz
Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security.
Our Vision
We believe that companies around the world face three challenges in their digital transformation journeys...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States TikTok Full time
At TikTok, we're seeking Site Reliability Engineers (SREs) to join our monetization technology team.
Our team works on building and running large-scale, globally distributed, fault-tolerant ads systems.
SREs keep the systems up and running...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Adobe Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud services.
Key Responsibilities
- Develop software and tools to design, deploy, and optimize cloud services
- Provide hands-on technical...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Altius Technologies Inc Full time
Job Description
At Altius Technologies Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for creating and supporting automation scripts for infrastructure deployments, validations, and monitoring to improve operational tasks.
Key Responsibilities:
- Design and implement...
-
Site Reliability Engineer Graduate
3 weeks ago
San Jose, California, United States ByteDance Full time
About the Role
ByteDance is seeking a highly skilled Site Reliability Engineer to join our Applied Machine Learning team. As a Site Reliability Engineer, you will play a critical role in ensuring the availability and performance of our machine learning services, which are used by hundreds of millions of people around the world.
Responsibilities
- Design and...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Adobe Full time
About the Role
We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a key member of our Cloud Engineering team, you will play a critical role in designing, deploying, and optimizing our cloud services.
Key Responsibilities
- Develop software and tools to improve the reliability and performance of our cloud services
- Provide...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States NetApp Full time
Job Summary
As a Site Reliability Engineer at NetApp, you will be responsible for managing, supporting, and maintaining a reliable environment for our site. This involves ensuring the stability and security of multiple open-source systems and platforms that are run or operated in that environment.
Key Responsibilities
- Building and supporting a reliable site for...
-
Site Reliability Engineer
3 hours ago
San Jose, California, United States Trianz Full time
About Trianz
Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security.
Our Vision
We believe that companies around the world face three challenges in their digital transformation journeys...
-
Cloud Site Reliability Engineer
2 weeks ago
San Jose, California, United States TikTok Full time
Job Title: Cloud Site Reliability Engineer
We are seeking a highly skilled Cloud Site Reliability Engineer to join our team at TikTok. As a Cloud Site Reliability Engineer, you will be responsible for building, expanding, and operating ByteDance's global infrastructures, including large-scale systems in public and private clouds, data centers, and content...
-
Site Reliability Engineer Graduate
7 days ago
San Jose, California, United States ByteDance Full time
About the Role:
ByteDance is seeking a highly skilled Site Reliability Engineer to join our Applied Machine Learning team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and performance of our machine learning services.
Responsibilities:
- Design and implement large-scale systems to support machine learning...