Director - Site Reliability Engineering

2 weeks ago


Phoenix, United States Amex Full time

**You Lead the Way. We’ve Got Your Back.**

With the right backing, people and businesses have the power to progress in incredible ways. When you join Team Amex, you become part of a global and diverse community of colleagues with an unwavering commitment to back our customers, communities and each other. Here, you’ll learn and grow as we help you create a career journey that’s unique and meaningful to you with benefits, programs, and flexibility that support you personally and professionally.

At American Express, you’ll be recognized for your contributions, leadership, and impact—every colleague has the opportunity to share in the company’s success. Together, we’ll win as a team, striving to uphold our company values and powerful backing promise to provide the world’s best customer experience every day. And we’ll do it with the utmost integrity, and in an environment where everyone is seen, heard and feels like they belong.

Join Team Amex and let's lead the way together.

As part of our diverse tech team, you can architect, code and ship software that makes us an essential part of our customers’ digital lives. Here, you can work alongside talented engineers in an open, supportive, inclusive environment where your voice is valued, and you make your own decisions on what tech to use to solve challenging problems. Amex offers a range of opportunities to work with the latest technologies and encourages you to back the broader engineering community through open source. And because we understand the importance of keeping your skills fresh and relevant, we give you dedicated time to invest in your professional development. Find your place in technology on #TeamAmex.

**How will you make an impact in this role?**

Most of our software development focuses on delivering new features while optimizing existing systems, building infrastructure and eliminating work through automation. As a leader of the SRE team, you’ll have the opportunity to manage the complex challenges at scale that are unique to American Express, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and willingness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and try new things in a blameless environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

As an Engineering Director, you will operate and lead a global Site Reliability Engineering (SRE) organization, and partner with the Core Engineering and Platform Teams. You will work with engineering and product partners to ensure alignment between the organizations and contribute to the key strategic efforts. You will model and mentor talent across the pillars to ensure SRE is influential across the substantial product area engineering efforts.

**Responsibilities include**
- Define a comprehensive SRE strategy that aligns with the company’s overall goals and objectives, drive technology roadmaps, and implement proactive controls to mitigate risks and enhance system resilience.
- Analyze and solve complex distributed systems that combine internally and externally developed software and production infrastructure. Develop and promote standard methodologies from patterns observed in designs and incidents.
- Collaborate with product and engineering partners to establish non-functional requirements for new products and services.
- Build and lead a strong team of Site Reliability Engineers, providing guidance, support, and mentorship to ensure the team’s success and development.
- Participate in detailed design reviews and set standards for the organization.
- Develop and maintain Service-Level Objectives and Service-Level Indicators.

**Minimum Qualifications**
- Experience in modeling and architecting complicated business domains and associated methodologies/paradigms, e.g. Domain Driven Design, Event Sourcing, CQRS.
- At least 8 years of proven experience with system design, algorithms, data structures, analysis, and software design.
- Expertise in distributed architectural patterns: event-driven microservices, distributed transactions (sagas), append-only logs, change data capture, idempotent consumer, eventual consistency.
- Proven track record implementing minimalistic event-driven microservices chassis (not just Spring), e.g. Quarkus/Vert.x, Micronaut, Javalin, Ktor, or non-JVM: JavaScript, Go.
- Completely hands-on; able to build and understand code.
- Experience working in a 24/7 environment with on-call responsibilities, supporting production on an as-needed basis.
- Deep understanding of cloud technologies, distributed systems, automation, and monitoring tools.
- Demonstrated leadershi



  • Phoenix, United States Cloud BC Labs Full time

    Job Description. Position: Site Reliability Engineer. Location: Hybrid, Phoenix, AZ (locals only). Duration: 5+ months, possible extension or CTH. Interview type: Video. Visa restrictions: Must convert to permanent without sponsorship. Required skills: Experience leading onshore/offshore teams; hands-on building/troubleshooting experience; transitioned from Prometheus to Mimir; Grafana...


  • Phoenix, United States Motion Recruitment Full time

    We are seeking a skilled Site Reliability Engineer (SRE) to join our team and ensure the reliability, scalability, and performance of our critical systems and services. The SRE will bridge the gap between development and operations, focusing on automation, monitoring, and incident management to maintain high service availability and seamless software...


  • Phoenix, Arizona, United States Cloud BC Labs Full time

    Job Description. Position: Site Reliability Engineer (SRE). Location: Hybrid, Phoenix, AZ. Duration: 6 months. Interview type: Video. Visa restrictions: None. Required skills: Experience leading onshore/offshore teams; hands-on building/troubleshooting experience; transitioned from Prometheus to Mimir; Grafana is still a must-have; Site Reliability/Observability dev...


  • Phoenix, United States Insight Global Full time

    Position: LEAD Site Reliability Engineer (SRE) Location: Phoenix, AZ (Hybrid 3X Per Week) Pay Range: $60-$70 an hour + Benefits Duration: 6 month contract to hire Desired Qualifications: -4-year degree (Computer Science, Information Systems, or related functional field) and/or equivalent combination of education or work experience. **A DEGREE IS ABSOLUTELY...


  • Phoenix, United States Jobs Malaysia - Two95 HR HUB Full time

    Title: Site Reliability Engineer Location: Phoenix, AZ Job Type: Full Time Minimum Qualifications BS or MS degree in computer science, computer engineering, or other technical discipline, or equivalent 3-6 years of work experience in DevOps - Java/J2EE/REACT JS applications 2+ years of hands on experience on configuring Splunk dashboards, Alerts setup Good...


  • Phoenix, United States TWO95 International Full time

    Title: Site Reliability Engineer Location: Phoenix, AZ Job Type: Full Time Minimum Qualifications •BS or MS degree in computer science, computer engineering, or other technical discipline, or equivalent 3-6 years of work experience in DevOps - Java/J2EE/REACT JS applications •2+ years of hands on experience on configuring Splunk dashboards, Alerts setup...


  • Phoenix, United States Mastech Digital Full time

    Mastech Digital Inc. is a certified minority-owned business (certified by NMSDC), a publicly traded firm listed as MHH on the NYSE, established in 1986. Headquartered in Pittsburgh, PA, our operations are spread across 11 global recruiting and sales offices across the US. Role: Site Reliability Engineer. Location: Phoenix, AZ. Duration: Full time. Must have: SRE - Network Engineering &...


  • Phoenix, Arizona, United States Expert In Recruitment Solutions Full time

    Job Title: Site Reliability Engineer (SRE) Location: This is a hybrid onsite position; the worker is required to work onsite 2-3 days per week in Phoenix, AZ. Hybrid Onsite: Worker is required to work onsite 3 days per week in Phoenix, AZ, as they will be working cross-functionally with 3 different teams. Main responsibilities: Experience in leading Observability...


  • Phoenix, United States PNC Financial Services Group Full time

    At PNC, our people are our greatest differentiator and competitive advantage in the markets we serve. We are all united in delivering the best experience for our customers. We work together each day to foster an inclusive workplace culture where all of our employees feel respected, valued and have an...


  • Phoenix, United States Motion Recruitment Full time

    Senior Site Reliability Engineer (SRE). Location: Phoenix, AZ, 85050 (Hybrid, 3 days onsite). Term: 6+ months contract (with a possible extension). Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer (SRE) to lead our release processes, manage infrastructure incidents, and optimize our CI/CD pipelines. The ideal candidate will have a deep...

