Site Reliability Engineer
3 days ago
We're seeking a highly skilled Site Reliability Engineer to join our Command Center Team at X (formerly Twitter). As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our services, working closely with cross-functional teams to drive significant impact across all areas of the business.
Key Responsibilities
- Triage and Troubleshoot Complex Issues: Analyze and resolve critical system issues at massive scale, ensuring high availability and reliability of our services.
- Continuous Improvement: Develop and implement strategies to continuously improve system performance, reliability, and resiliency. Apply techniques to detect and remediate bot and bad actor behavior. Optimize services for peak utilization and performance.
- Software Development: Create and maintain software for load testing, failure detection, traffic management, and data analysis using Python, Go, Scala, JavaScript, and Superset.
- Incident Management: We are the primary incident managers for the entire site. We provide clear and effective communication and lead engineering teams to mitigate impact and ensure timely resolution.
- Enhance SLA/SLO Understanding: Continually refine service-level objectives (SLOs) across the stack, ensuring we stay within our error budgets and meet or exceed user expectations.
- User Experience Measurement: Implement high-fidelity metrics to more accurately measure and improve the user experience across our constantly evolving set of services.
- Service Dependency Analysis: Use distributed tracing to understand and manage service dependencies, to facilitate debugging and to improve latencies.
- Cross-Functional Collaboration: Work closely with various teams including Product, Infrastructure, and Safety. We leverage our lean structure to drive significant impact across all areas of the business.
- Highly self-motivated team player.
- Enjoy approaching complex problems, thinking critically, and prototyping solutions in a dynamic and fast-paced environment without needing constant supervision.
- Strong debugging, documentation and communication skills.
- Availability for occasional travel to San Francisco HQ.
- Bachelor's degree or above in Computer Science, Engineering, or related field.
- 2+ years of experience in large-scale software development with a specific focus on site reliability engineering.
- Profound understanding of computer science fundamentals, including data structures, algorithms, and concurrency principles.
- Expertise with observability and monitoring, incident management, load testing, microservice architecture and design patterns, distributed systems, data visualization tools, and SQL-like query languages.
- Proficiency in one or more object-oriented programming languages (e.g. Scala, Java, C++). Additional knowledge of Python or Golang will be considered a significant asset.
- Strong knowledge of Unix/Linux system administration at scale.
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design and implement automation scripts using...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design and implement automation scripts using shell,...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design, implement, and maintain scalable and highly...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Altius Technologies, Inc. Full time
Job Title: Site Reliability Engineer
Altius Technologies, Inc. is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the infrastructure and systems that support our business applications.
Key Responsibilities: Design and implement automation...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design, implement, and maintain scalable and highly...
-
Site Reliability Engineer
1 day ago
San Jose, California, United States Diverse Lynx Full time
Job Title: Site Reliability Engineer
Job Summary: Diverse Lynx LLC is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design and implement automation scripts using shell,...
-
Site Reliability Engineer
3 days ago
San Jose, California, United States NetApp Full time
Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at NetApp. As a Site Reliability Engineer, you will be responsible for managing, supporting, and maintaining a reliable environment for our site to ensure the stability and security of multiple open-source systems/platforms.
Key Responsibilities: Building and supporting a...
-
Site Reliability Engineer
3 days ago
San Jose, California, United States Syntricate Technologies Full time
Job Title: Site Reliability Engineer
At Syntricate Technologies, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities: Design, implement, and maintain scalable and highly...
-
Site Reliability Engineer
1 week ago
San Jose, California, United States Adobe Full time
About the Role: We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based services.
Responsibilities:
- Design and implement scalable and reliable systems to support our cloud-based services
- Collaborate...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States ApTask Full time
About ApTask: ApTask is a leading global provider of workforce solutions and talent acquisition services, dedicated to shaping the future of work. As an African American-owned and Veteran-certified company, ApTask offers a comprehensive suite of services, including staffing and recruitment solutions, managed services, IT consulting, and project management. With...
-
Site Reliability Engineer
2 weeks ago
San Jose, California, United States Adobe Full time
About the Role: We're seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a key member of our Cloud Engineering team, you will play a critical role in designing, deploying, and optimizing our cloud services.
Key Responsibilities:
- Develop software and tools to improve the reliability and performance of our cloud services
- Collaborate...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Cisco Full time
About the Role: Cisco is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. You will work closely with our development teams to identify and resolve issues, and collaborate with other teams to...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Splunk Full time
About Splunk: Splunk is a leading provider of cloud-based data analytics and monitoring solutions. Our mission is to make machine data accessible, usable, and valuable to everyone.
Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our Cloud TechOps team. As a Site Reliability Engineer, you will be responsible for ensuring the...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Adobe Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services.
Responsibilities: Ensure the highest level of uptime and Quality of Service (QoS) to Adobe's customers through...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Adobe Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our cloud-based services.
Key Responsibilities: Ensure the highest level of uptime and Quality of Service (QoS) to Adobe's customers through...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Trianz Full time
About Trianz: Trianz is a leading-edge technology platforms and services company that accelerates digital transformations at Fortune 100 and emerging companies worldwide in data & analytics, digital experiences, cloud infrastructure, and security.
Our Vision: We believe that companies around the world face three challenges in their digital transformation journeys...
-
Site Reliability Engineer
4 weeks ago
San Jose, California, United States Tik Tok Full time
At TikTok, we're seeking Site Reliability Engineers (SREs) to join our monetization technology team.
Our team works on building and running large-scale, globally distributed, fault-tolerant ads systems.
SREs keep the systems up and running...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Altius Technologies Inc Full time
Job Description: At Altius Technologies Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for creating and supporting automation scripts for infrastructure deployments, validations, and monitoring to improve operational tasks.
Key Responsibilities: Design and implement...
-
Site Reliability Engineer
3 weeks ago
San Jose, California, United States Adobe Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud services.
Key Responsibilities:
- Develop software and tools to design, deploy, and optimize cloud services
- Provide hands-on technical...
-
Lead Site Reliability Engineer
3 days ago
San Jose, California, United States VDart Full time
Job Title: Lead Site Reliability Engineer
Job Summary: VDart is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our software systems.
Key Responsibilities: Design and implement automation scripts to improve operational...