Senior Site Reliability Engineer
1 week ago
Join EPAM as a Senior Site Reliability Engineer specializing in AWS. In this role, you'll ensure the reliability and availability of fleet services under the SRE model. If you have a good track record of highly scalable, distributed systems projects and previous experience working as an SRE, we'd love to hear from you.

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

Responsibilities
Collaborating with service teams to improve the reliability and efficiency of workloads and services using SRE practices
Organizing and participating in wargames/gamedays and enhancing observability practices and tooling
Developing and improving CI/CD processes to enhance release cadence and success
Building and consuming the toil backlog, automating toilsome tasks, and documenting knowledge and processes
Writing code that improves scalability, performance, maintainability, and security

Requirements
Senior Engineers with a good track record of highly scalable, distributed systems projects in the past 5 years
Previous experience working as an SRE and a good understanding of SRE methodologies and philosophies
AWS cloud expertise and experience running multi-region workloads
Observability experience with distributed services, for example, experience with distributed tracing and similar concepts
Strong programming and automation experience: Python, Golang

We Offer
Career plan and real growth opportunities
Unlimited access to LinkedIn Learning solutions
International Mobility Plan within 25 countries
Constant training, mentoring, online corporate courses, eLearning and more
English classes with a certified teacher
Support for employee initiatives (Algorithms club, Toastmasters, agile club and more)
Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
Flexible work schedule and dress code
Collaborate in a multicultural environment and share best practices from around the globe
Hired directly by EPAM & % under payroll
Law benefits (IMSS, INFONAVIT, 25% vacation bonus)
Major medical expenses insurance: Life, Major medical expenses with dental & visual coverage (for the employee and direct family members)
13% employee savings fund, capped at the legal limit
Grocery coupons
30 days December bonus
Employee Stock Purchase Plan
12 vacation days plus 4 floating days
Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Friday, November 2nd, December 24th & 31st)
Relocation bonus: transportation, 2 weeks of accommodation for you and your family and more
Monthly non-taxable amount for the electricity and internet bills
-
Senior Site Reliability Engineer
7 days ago
remote, us EPAM Full time
Are you a seasoned professional with a passion for site reliability engineering and a knack for leading strategic initiatives? Join our dynamic team at EPAM, a leading global provider of digital platform engineering and software development services. We are seeking a Senior Site Reliability Engineer who can make a significant impact...
-
Senior Site Reliability Engineer
2 weeks ago
Remote, United States Webflow Full time
At Webflow, our mission is to bring development superpowers to everyone. Webflow is the leading visual development platform for building powerful websites without writing code. By combining modern web development technologies into one platform, Webflow enables people to build websites visually, saving engineering time, while clean code seamlessly generates...
-
Site Reliability Engineer
1 day ago
remote, us EPAM Full time
Join our dynamic team as a Site Reliability Engineer and lead the way in optimizing and automating our Linux-based infrastructure. With 3 to 5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure, you will play a crucial role in elevating our capabilities and ensuring high-impact, internet-facing production...
-
Senior Site Reliability Engineer II
4 days ago
Remote, Oregon, United States Shutterfly Full time $106,000 - $151,000 per year
At Shutterfly, we make life's experiences unforgettable. We believe there is extraordinary power in self-expression. That's why our family of brands helps customers create products and capture moments that reflect who they uniquely are. Shutterfly is looking for a Senior Site Reliability Engineer to join our team. Shutterfly is undergoing a comprehensive...
-
AWS - Site Reliability Engineer
3 days ago
remote, us EPAM Full time
Join EPAM as an AWS SRE. In this role, you'll collaborate with service teams to improve the reliability and efficiency of workloads and services using SRE practices. If you're a senior engineer with a good track record of highly scalable, distributed systems projects in the past 5 years, we'd love to hear from you. EPAM is a leading...
-
Staff Site Reliability Engineer
1 week ago
remote, us Crisis Text Line Full time
Crisis Text Line provides free, 24/7, high-quality text-based mental health support and crisis intervention by empowering a community of trained volunteers to support people in their moments of need. Our mission is at the intersection of empathy and innovation: we promote mental well-being for people wherever they are. Our vision is an empathetic world...
-
Senior Site Reliability Engineer
2 weeks ago
Remote, United States Grafana Labs Full time
Senior Site Reliability Engineer - Databases
This is a remote position and we're considering candidates in the USA & Canada.
About the role: We are looking for a Senior SRE to help us support our highest value Grafana Cloud customers by increasing the reliability of our Cloud databases that are based on Mimir, Loki, Tempo, and Pyroscope. We provide these...
-
Azure DevOps Site Reliability Engineer
2 weeks ago
remote, us EPAM Full time
Are you a skilled Azure DevOps Site Reliability Engineer with a passion for ensuring business continuity and helping businesses always be near their clients? Do you have experience in optimizing and supporting OSDU deployment, performing monitoring including incident resolution, and suggesting improvements? If so, we have an exciting...
-
Remote, United States Upstart Full time
About Upstart
Upstart is the leading AI lending marketplace partnering with banks and credit unions to expand access to affordable credit. By leveraging Upstart's AI marketplace, Upstart-powered banks and credit unions can have higher approval rates and lower loss rates across races, ages, and genders, while simultaneously delivering the exceptional...
-
Senior Site Reliability Engineer
5 days ago
Remote, Oregon, United States D-Wave Full time $124,545 per year
D-Wave (NYSE: QBTS) is a leader in the development and delivery of quantum computing systems, software, and services. We are the world's first commercial supplier of quantum computers, and the only company building both annealing and gate-model quantum computers. Our mission is to help customers realize the value of quantum, today. Our quantum...