Senior Principal Site Reliability Engineer
4 weeks ago
Northern Trust is a globally recognized financial institution with a rich history dating back to 1889. We provide innovative financial services and guidance to the world's most successful individuals, families, and institutions.
We are committed to delivering exceptional service, expertise, and integrity in all our endeavors. Our team of experienced professionals works tirelessly to ensure the reliability and performance of our systems and services.
We are seeking a highly skilled Sr Principal Site Reliability Engineer to join our team. This role will play a pivotal part in ensuring the reliability and performance of our systems and services.
As a Site Reliability DevOps Engineer, you will be responsible for defining and deploying key observability services with a deep focus on architecture, production operations, capacity planning, performance management, deployment, and release engineering.
You will work with cross-functional teams to improve the efficiency of our services. Your expertise in both software engineering and system operations will enable our partners to drive continuous improvements in our platform's reliability.
This role will focus on bringing complete observability across all technologies.
This role will be responsible for a number of key functions that both support and drive improvements to the reliability of Northern Trust's IT Landscape.
Key Responsibilities:
System Design and Architecture:
Lead the design and architecture of critical, complex systems to ensure their reliability, scalability, and performance.
Operational Excellence:
Develop and maintain automation scripts and tools to streamline operations and reduce manual tasks. Provide transparency into system performance.
Incident Response/Root Cause Analysis:
Collaborate on root cause analysis and implement measures to prevent issues from recurring.
Monitoring and Observability:
- Design and implement comprehensive monitoring and observability solutions to proactively detect and address issues before they impact our business.
- Develop and maintain dashboards and alerts to provide real-time insights into system health.
Reliability Improvements:
Identify opportunities for improving system reliability through process enhancements and technical solutions.
Documentation and Communication:
- Create and maintain detailed documentation of systems, processes, and procedures.
- Communicate effectively with stakeholders across different teams and levels within the organization.
Project Management/Collaboration:
- Manage and prioritize multiple projects and initiatives related to reliability and performance improvements.
- Collaborate with product, development, and operations teams to align SRE efforts with overarching business goals.
Requirements:
- Bachelor's degree or equivalent experience
- 10+ years in systems engineering with a focus on reliability, systems operations, and software engineering
- 5+ years as a team lead or hands-on technical manager who can engage on and deliver projects to completion
- Strong proficiency in programming languages such as Python, Go, Ruby, or Java
- Experience with both on-prem and cloud solutions
- Experience with containerization
- Demonstrated ability to design and implement systems that ensure observability with associated dashboards
- Deep understanding of distributed systems, networking, and modern software architectures
- Excellent problem-solving skills and ability to handle complex technical challenges
- Strong dedication to customer needs, excellent communication skills, and the ability to build lasting relationships and to articulate complex reliability strategies clearly and persuasively
- Prior experience delivering Infrastructure as Code via a CI/CD pipeline
- Proven experience in leading and mentoring technical teams
- Skilled in implementing automation for corrective action based on deployed observability solutions
- Practical experience operating in an Agile development environment
Working with Us:
As a Northern Trust partner, you will be part of a flexible and collaborative work culture in an organization where financial strength and stability are assets that embolden us to explore new ideas.
Movement within the organization is encouraged, senior leaders are accessible, and you can take pride in working for a company committed to assisting the communities we serve.
We'd love to learn more about how your interests and experience could be a fit with one of the world's most admired and sustainable companies.
-
Principal Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States iManage Full time
About the Role: We are seeking a highly skilled Principal Site Reliability Engineer to join our team at iManage. As a key member of our global SRE team, you will contribute to the development and maintenance of our cloud-based platform. Your expertise in cloud infrastructure, Kubernetes, and containerization will be instrumental in ensuring the scalability,...
-
Senior Staff Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States WEX Full time
About the Role: The WEX Site Reliability Engineering team is seeking a Senior Staff SRE who is passionate about developing software and solutions focused on reliability, performance, and operational excellence. As a key member of the team, you will be responsible for designing and implementing scalable and reliable systems, as well as collaborating with...
-
Chicago, Illinois, United States Adyen Full time
We are looking for a highly technical Senior Site Reliability Engineer to join our Internal Services team at Adyen. As a Site Reliability Engineer, you will be responsible for the stability and reliability of our internal services. The ideal candidate will have 7+ years of relevant work experience and a solid understanding of the Linux operating system and...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Northern Trust Full time
About Northern Trust: Northern Trust is a globally recognized, award-winning financial institution with a rich history dating back to 1889. As a Fortune 500 company, we provide innovative financial services and guidance to the world's most successful individuals, families, and institutions. Job Summary: We are seeking an experienced Sr Principal Site Reliability...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Brain Bolt Consulting Full time
Job Title: Site Reliability Engineer. At Brain Bolt Consulting, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our systems and applications. Key Responsibilities: Design, develop, and deploy scalable and reliable systems...
-
Senior Lead Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Capital One Full time
Role Overview: As a Senior Lead Site Reliability Engineer at Capital One, you will be responsible for leading a portfolio of diverse technology projects and a team of developers with deep experience in distributed microservices and full-stack systems. Your primary goal will be to create solutions that help meet regulatory needs for the company. Key...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Diverse Lynx Full time
Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our cloud-based applications. Key Responsibilities: Design and implement monitoring, metrics, and logging systems to ensure application...
-
Senior Lead Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Capital One Full time
About the Role: We are seeking a highly skilled Senior Lead Site Reliability Engineer to join our dynamic remote-first engineering team. As a key member of our team, you will be responsible for leading a portfolio of diverse technology projects and a team of developers with deep experience in distributed microservices and full-stack systems. Key...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Oak Street Health Full time
Role Overview: We are seeking a skilled Site Reliability Engineer to join our team at Oak Street Health. As a Site Reliability Engineer, you will play a critical role in ensuring the stability and performance of our platform, which is built specifically for the clinical team. You will partner with our software engineering teams to transform ideas into reality,...
-
Senior Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States TalTeam Full time
Job Summary: TalTeam is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will work closely with technology support teams and application teams to build monitoring and automation solutions to improve application and infrastructure availability. Key Responsibilities: Represent the Enterprise Monitoring team...
-
Site Reliability Engineer
3 weeks ago
Chicago, Illinois, United States Enova Full time
About the Role: As a Site Reliability Engineer at Enova, you will play a crucial part in maintaining the reliability of our consumer business from a technology and operational standpoint. You will drive the rapid improvement and efficiency of our platform by implementing automated tools, evaluating processes, troubleshooting, and resolving complex problems....
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Matlen Silver Full time
Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Matlen Silver. As a key member of our infrastructure and operations team, you will be responsible for ensuring the availability, performance, and reliability of our Fulfillment Technology solutions. Key Responsibilities: Partner with application engineering, observability,...
-
CloudBC Labs Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States CloudBC Labs Full time
Job Title: Site Reliability Engineer. Job Summary: CloudBC Labs is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based infrastructure. You will work closely with our development team to identify and resolve issues, and...
-
Principal Cloud Reliability Engineer
3 weeks ago
Chicago, Illinois, United States iManage Full time
About the Role: iManage is seeking a skilled Principal Cloud Reliability Engineer to join our team. As a key member of our global SRE team, you will contribute to the development and maintenance of our cloud-based SaaS platform. Your expertise in cloud infrastructure, Kubernetes, and containerization will be instrumental in ensuring the scalability,...
-
Senior Cloud Reliability Engineer
4 weeks ago
Chicago, Illinois, United States CCC Intelligent Solutions, Inc. Full time
Job Summary: As a Senior Cloud Reliability Engineer at CCC Intelligent Solutions, Inc., you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and services hosted primarily on Microsoft Azure. Key Responsibilities: Design, implement, and manage the alerting and monitoring strategy for Azure-based...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Northern Trust Full time
About Northern Trust: Northern Trust is a globally recognized, award-winning financial institution with a rich history dating back to 1889. As a Fortune 500 company, we provide innovative financial services and guidance to the world's most successful individuals, families, and institutions. Our commitment to service, expertise, and integrity has earned us a...
-
Senior Cloud Reliability Engineer
3 weeks ago
Chicago, Illinois, United States CCC Intelligent Solutions, Inc. Full time
Job Summary: As a Senior Cloud Reliability Engineer at CCC Intelligent Solutions, Inc., you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and services hosted primarily on Microsoft Azure. Key Responsibilities: Design, implement, and manage the alerting and monitoring strategy for Azure-based...
-
Senior Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Intelsat US LLC Full time
Job Title: Senior Reliability Engineer. Your Impact: As a Senior Reliability Engineer at Intelsat US LLC, you will be recognized as a subject matter expert in Product Reliability. You will help develop and maintain departmental methodologies, systems, and practices that meet organizational and customer requirements. You will rely on critical judgment to plan and...
-
Site Reliability Engineer
3 weeks ago
Chicago, Illinois, United States Enova Full time
About the Role: As a Site Reliability Engineer at Enova International, you will play a critical role in maintaining the reliability of our consumer business from a technology and operational standpoint. Your expertise will drive the rapid improvement and efficiency of our platform by implementing automated tools, evaluating processes, and troubleshooting...
-
Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Bank of America Full time
Job Description: At Bank of America, we are committed to delivering exceptional customer experiences through the power of every connection. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and observability of our services. Key responsibilities include: Partnering with engineering and technology teams to improve...