Chief AWS Site Reliability Engineer
1 week ago
Description
EPAM Systems is looking for a Chief AWS SRE Engineer who fully understands and practices SRE activities and philosophy to join the global engineering team that ensures fleet services reliability and availability under the SRE model. If you're passionate about innovation, we invite you to apply and become part of our team.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Responsibilities
Collaborate with service teams to improve the reliability and efficiency of workloads and services using SRE practices
Develop and improve CI/CD processes to enhance release cadence and success
Build and work through the toil backlog, automating toilsome tasks
Document knowledge and processes
Practice and promote sustainable incident response and blameless postmortems
Write code that improves scalability, performance, maintainability, and security
Implement distributed monitoring practices
Refine monitoring processes, configurations, and thresholds
Contribute towards the identification and implementation of service level indicators and objectives for workloads and services

Requirements
7+ years of cloud engineering experience, with a good track record of highly scalable, distributed systems projects in the past 5 years
Previous experience working as an SRE engaged with active development teams is a must, along with a good understanding of SRE methodologies and philosophies
AWS cloud expertise; ideally, experience running multi-region workloads and in-depth knowledge of most of the commonly used AWS services
Observability experience with distributed services, for example, experience of distributed tracing and similar concepts
Independent and self-directed, able to work alongside client engineering teams under minimal supervision
Strong programming and automation experience: Python, Golang
Understanding of the software development lifecycle
Fluent English communication skills at a B2+ level

We offer
Connectivity Bonus (15, ARS are paid with a salary receipt at the end of each month as a non-wages concept)
Prepaid Medical Plan (Medicina Prepaga; covers the collaborator and direct family group)
Paternity Leave (two additional days are added to what is established by law, for a total of 4 days)
Discounts card
English Training (English lessons, twice per week)
Training Program (access to multiple customized training plans according to the needs of each role within the company)
Marriage Bonus (the company doubles the allowance established by law that ANSES offers)
Referral Program (a referral bonus is paid when a collaborator's referral joins the company)
External Agreements and Discounts
-
AWS - Site Reliability Engineer
2 days ago
remote, us Epam Full time
Join EPAM as an AWS SRE. In this role, you'll collaborate with service teams to improve the reliability and efficiency of workloads and services using SRE practices. If you're a senior engineer with a good track record of highly scalable, distributed systems projects in the past 5 years, we'd love to hear from you. EPAM is a leading...
-
Senior Site Reliability Engineer
7 days ago
remote, us Epam Full time
Join EPAM as a Senior Site Reliability Engineer specializing in AWS! In this role, you'll ensure fleet services reliability and availability under the SRE model. If you have a good track record of highly scalable, distributed systems projects and previous experience working as an SRE, we'd love to hear from you. EPAM is a leading...
-
AWS Cloud Site Reliability Engineer
7 days ago
remote, us Epam Full time
Join EPAM as an AWS Cloud Site Reliability Engineer. In this role, you'll transfer security processes, manage authentication technologies, and support the implementation of a Palo Alto firewall. If you have 3+ years of experience with AWS, proficiency in designing and managing data migration processes, and superior communication...
-
Staff Site Reliability Engineer
1 week ago
remote, us Crisis Text Line Full time
Crisis Text Line provides free, 24/7, high-quality text-based mental health support and crisis intervention by empowering a community of trained volunteers to support people in their moments of need. Our mission is at the intersection of empathy and innovation: we promote mental well-being for people wherever they are. Our vision is an empathetic world...
-
Senior Site Reliability Engineer
5 days ago
remote, us Epam Full time
Are you a seasoned professional with a passion for site reliability engineering and a knack for leading strategic initiatives? Join our dynamic team at EPAM, a leading global provider of digital platform engineering and software development services. We are seeking a Senior Site Reliability Engineer who can make a significant impact...
-
Site Reliability Engineer
20 hours ago
Remote, Oregon, United States ADT Full time $200,000 - $250,000 per year
ADT is transitioning to an in-office model. New team members will work from home but should plan to return to an in-office model at a later date. We will keep you well informed and supported throughout the transition.
Summary: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for...
-
Senior Site Reliability Engineer
4 days ago
Remote, Oregon, United States D-Wave Full time $124,545 per year
D-Wave (NYSE: QBTS) is a leader in the development and delivery of quantum computing systems, software, and services. We are the world's first commercial supplier of quantum computers, and the only company building both annealing and gate-model quantum computers. Our mission is to help customers realize the value of quantum, today. Our quantum...
-
Azure DevOps Site Reliability Engineer
2 weeks ago
remote, us Epam Full time
Are you a skilled Azure DevOps Site Reliability Engineer with a passion for ensuring business continuity and helping businesses always be near their clients? Do you have experience in optimizing and supporting OSDU deployment, performing monitoring including incident resolution, and suggesting improvements? If so, we have an exciting...
-
Senior Site Reliability Engineer
1 week ago
Remote, United States Grafana Labs Full time
Senior Site Reliability Engineer - Databases
This is a remote position and we're considering candidates in the USA & Canada.
About the role: We are looking for a Senior SRE to help us support our highest value Grafana Cloud customers by increasing the reliability of our Cloud databases that are based on Mimir, Loki, Tempo, and Pyroscope. We provide these...
-
Site Reliability Engineer
1 week ago
remote, us Epam Full time
We are seeking a Site Reliability Engineer (Azure) to join our team.
Responsibilities
As a Lead Azure SRE, you will be responsible for driving the reliability, performance, and scalability of cloud-based applications and services. Your expertise in Kubernetes, scripting, troubleshooting, and observability will be instrumental in...