Senior Site Reliability Engineer

4 days ago


Chicago, United States TAG - The Aspen Group Full time

The Aspen Group (TAG) is one of the largest and most trusted retail healthcare business support organizations in the U.S. and has supported over 20,000 healthcare professionals and team members with close to 1,500 health and wellness offices across 48 states in four distinct categories: dental care, urgent care, medical aesthetics, and animal health. Working in partnership with independent practice owners and clinicians, the team is united by a single purpose: to prove that healthcare can be better and smarter for everyone. TAG provides a comprehensive suite of centralized business support services that power the impact of five consumer-facing businesses: Aspen Dental, ClearChoice Dental Implant Centers, WellNow Urgent Care, Chapter Aesthetic Studio, and AZPetVet. Each brand has access to a deep community of experts, tools and resources to grow their practices, and an unwavering commitment to delivering high-quality consumer healthcare experiences at scale.

As a reflection of our current needs and planned growth we are very pleased to offer a new opportunity to join our dedicated team as a Senior Site Reliability Engineer.

The Senior Site Reliability Engineer (SRE) & Monitoring Specialist will be responsible for ensuring the reliability, performance, and scalability of our systems. This role involves implementing and managing monitoring solutions, responding to incidents, and optimizing system performance to meet business objectives.

Responsibilities:

Site Reliability Engineering:

  • Design, build, and maintain scalable and reliable systems to support our applications and services.
  • Develop and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure systems meet reliability targets.
  • Drive improvements in system reliability, availability, and performance through proactive measures and automation.

Monitoring & Observability:

  • Implement and manage comprehensive monitoring and alerting solutions to ensure full visibility into system health and performance.
  • Develop and maintain dashboards and reporting tools that provide actionable insights for troubleshooting and performance optimization.
  • Evaluate and integrate new monitoring tools and technologies as needed to enhance observability.

Incident Management:

  • Lead and participate in incident response efforts, including troubleshooting, root cause analysis, and resolution.
  • Develop and maintain incident management processes to improve response times and minimize service disruptions.
  • Conduct post-incident reviews to identify areas for improvement and implement preventive measures.

Performance Optimization:

  • Analyze performance metrics and logs to identify and address bottlenecks and inefficiencies in the system.
  • Collaborate with development teams to optimize code and infrastructure for better performance and reliability.
  • Perform capacity planning to ensure systems can handle current and future loads.

Automation & Process Improvement:

  • Develop and implement automation solutions to streamline operations and reduce manual intervention.
  • Identify and drive process improvements to enhance operational efficiency and effectiveness.
  • Maintain documentation related to monitoring, incident management, and SRE best practices.

Collaboration & Communication:

  • Work closely with engineering, operations, and product teams to align on reliability and monitoring goals.
  • Communicate effectively with stakeholders, providing regular updates on system health, incidents, and performance improvements.
  • Foster a culture of collaboration and knowledge sharing within the team and across the organization.

Requirements:

  • Bachelor's degree in Computer Science or a related field.
  • At least 5 years of experience in Site Reliability Engineering or a similar role.
  • Strong proficiency in at least one programming language such as Python, Java, or Go.
  • Experience with containerization technologies such as Docker and Kubernetes.
  • Strong understanding of networking, distributed systems, and cloud infrastructure.
  • Familiarity with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, and Splunk.
  • Excellent problem-solving skills and the ability to work independently and in a team environment.
  • Experience with incident management and root cause analysis.

If you are a Senior SRE Engineer with a passion for ensuring the reliability and performance of production systems, we encourage you to apply for this exciting opportunity.



  • Chicago, IL, United States WEX, Inc. Full time

    The WEX Site Reliability Engineering (SRE) team is seeking an entry-level Site Reliability Engineer Level 1 who is passionate about learning and growing in the field of software development and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Benefits...

  • Senior AI Engineer

    2 weeks ago


    Chicago, Illinois, United States SITE Technologies Full time

    About the Role: We are seeking a highly skilled Senior AI Engineer to lead our computer vision initiatives at SITE Technologies. This individual will be responsible for designing and implementing computer vision solutions using both traditional image processing approaches and modern deep learning models. Key Responsibilities: Develop and implement computer vision...


  • Chicago, IL, United States WEX Inc. Full time

    Senior Staff Site Reliability Engineer Apply to locations: Chicago, IL; Bay Area, CA; San Francisco, CA. About the Role The WEX Site Reliability Engineering (SRE) team is seeking a Senior Staff SRE who is passionate about developing software and solutions focused on observability, incident response, reliability and performance, operational excellence, and...


  • Chicago, United States Iris Software Inc. Full time

    Greetings! One of our direct clients (Logistics) is looking to hire a Sr. SRE Engineer in Naperville, IL (Hybrid – 3-4 days onsite per week). Please find the job description below. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications. You will work closely with cross-functional teams to...


  • Chicago, United States Saxon Global Full time

    Northern Trust Site Reliability Engineer (Azure). Location: Downtown Chicago - Onsite 2 days/week - 181 W Madison St. Duration: 12+ month contract w/extension/conversion. Overview: The Goals Driven Wealth Management platform is a showcase product for Northern Trust's Wealth Management business and we must demonstrate our ability to deliver and support innovative...


  • Chicago, United States Enova Full time

    We are interested in every qualified candidate who is eligible to work in the United States. However, we are not able to sponsor visas or take over sponsorship at this time. Reports to: Technology Manager II - Tech Ops. About the Role: As a Site Reliability Engineer you will help maintain the reliability of our consumer business from a technology and...


  • Chicago, United States Cleo Full time

    Site Reliability Engineer At Cleo, we make doing business easy! Cleo is an established software company with a start-up feel.  We have awesome products, which go hand in hand with our awesome culture! We are devoted to our people and pride ourselves on creating a fun, laid-back, but fast-paced work environment.  Not only do we work hard, we play hard. We...


  • Chicago, United States Enova Full time

    We are interested in every qualified candidate who is eligible to work in the United States. However, we are not able to sponsor visas or take over sponsorship at this time. Reports to: Technology Manager II - Tech Ops. About the Role: As a Site Reliability Engineer you will help maintain the reliability of our consumer business from a...


  • Chicago, United States Info Way Solutions Full time

    Site Reliability Engineer in Wealth Management. Chicago (IL) / Tempe (AZ), Onsite. Job Role: This role will be responsible for application observability, maintenance, and support; for proactively identifying and implementing preventive measures; and for evaluating and recommending techniques, practices, or technologies that would enhance business needs. As an SRE...


  • Chicago, Illinois, United States PROVE Full time

    About Prove: We're at the forefront of digital identity innovation, and our mission is to empower businesses to thrive in a mobile-first economy. With our cutting-edge phone-centric identity tokenization and passive cryptographic authentication solutions, we reduce friction, enhance security and privacy across all digital channels, and accelerate revenues...


  • Chicago, United States Selby Jennings Full time

    A leading Proprietary Trading firm is seeking a Site Reliability Engineer to join their team. You'll design and support the systems used by electronic trading desks leveraging tools like Linux, Kubernetes, and Python. What you'll do: Support software development teams to implement different parts of the application life cycle, i.e. application deployment,...


  • Chicago, United States Diverse Lynx Full time

    Role name: Engineer. Role Description: This role will be responsible for application observability, maintenance, and support; for proactively identifying and implementing preventive measures; and for evaluating and recommending techniques, practices, or technologies that would enhance business needs. As an SRE associate you will collaborate with Application...


  • Chicago, United States CloudBC Labs Full time

    Position: Site Reliability Engineer (GCP). Location: Midwest area, travel once a month. Headquarters is in Chicago, IL, so someone local is preferred. Duration: 3+ months. Interview type: Video. Visa restrictions: None. Required skills: SRE infrastructure; GCP platform; Dataflow, composer, BigQuery, cloud function, pubsub; Talend, Collibra.

  • Lead DevOps

    2 hours ago


    Chicago, United States CapB InfoteK Full time

    CapB is a global leader in IT Solutions and Managed Services. Our R&D is focused on providing cutting-edge products and solutions across Digital Transformations from Cloud, AI/ML, IOT, Blockchain to MDM/PIM, Supply chain, ERP, CRM, HRMS and Integration solutions. For our growing needs we need consultants who can work with us on a salaried or contract basis. We...

  • Senior Engineer

    12 hours ago


    Chicago, United States Orion Engineers, LLC Full time

    Job Description: The Senior Engineer of Transportation Site/Civil is a leader of the Transportation Site/Civil engineering practice and exercises direct or indirect supervision of personnel assigned for projects within the transportation site/civil engineering practice and technical matters. The Senior Engineer is responsible for delivering...


  • Chicago, Illinois, United States eTek IT Services, Inc. Full time

    About eTek IT Services, Inc.: eTek IT Services, Inc. is a renowned company in the financial services industry, providing holistic advice on wealth management to high net worth and ultra high net worth clients through its Goals Driven Wealth Management (GDWM) platform. Compensation: The estimated annual salary for this position is $140,000-$160,000, considering...


  • Chicago, United States DATAMAXIS Full time

    Location: Chicago, IL. Position Type: Full-time (3 days a week (Tue, Wed & Thu) onsite or more if needed). Salary: $125,000 to $140,000 (10% yearly bonus). Responsibilities: Manage and monitor systems and infrastructure hosted on-premises and in the cloud. Good understanding of different layers of an application and system design - networking concepts, cloud...