Senior Site Reliability Engineer
3 weeks ago
This position contributes to Saxon Global's Data Platform Services team, maintaining and improving the data platform that many services depend on. The successful candidate will be responsible for the health of production systems, developing monitoring dashboards, and configuring alerts and automation for system recovery.
Key Responsibilities
- Responsible for the health of production systems
- Develop monitoring dashboards
- Configure alerts and automate processes for system recovery
- Monitor alerts and take proactive steps to resolve system issues
- Troubleshoot production issues
- Lead production troubleshooting calls
- Responsible for patches and updates on production systems
- Design and build cutting-edge, multi-microservice solutions to support Saxon Global's growth worldwide
- Work with cross-functional teams on ongoing design efforts and systems support
- Automate password and certificate rotations on application and DB servers
- Help the CI/CD team roll out applications and infrastructure globally
- Collaborate with the development team and developer leads on other Information Technology (IT) teams
- Initiate process improvements for new and existing systems
- Coach and mentor other team members
- Perform cross-training and facilitate information sharing among team members
- Participate in a production support rotation that includes pager responsibilities
- Senior Site Reliability Engineering experience: 5+ years as an SRE and 7+ years total in the IT industry
- Expert in Azure Cloud
- Expert in Kubernetes
- Strong skills with SQL
- Strong skills with Kafka, Event Hub, or another messaging broker
- Must be able to commute to Saxon Global headquarters in Seattle three times a week; must be local to the Seattle area
- Requires 7+ years experience in the IT industry
- Requires 5+ years of software and DevOps development engineering
- Experience working in cloud environments; Azure preferred
- Experience with Kafka, Event Hub, or another messaging broker
- Experience with Cassandra, PostgreSQL, and Cosmos DB
- Experience with Jenkins, Python, Terraform, and Ansible
- Experience with Datadog, Splunk, or other logging and APM tools
- Experience working in Linux environments
- In-depth understanding of Computer Science fundamentals in object-oriented design, data structures, algorithms, and problem solving
- Experience building complex, scalable, high-performance software systems that have been successfully delivered to customers
- Demonstrated knowledge of best practices for the design and implementation of large-scale systems as well as experience in taking such systems from design to production
- Experience building and operating mission critical, highly available (24x7) systems
- Ability to work well with a team in a fast-paced agile development environment
- Bachelor's degree in Computer Science or equivalent work experience
- Excellent communication, analytical and problem-solving skills
- Extensive understanding of SDLC and Scrum methodologies