Senior Site Reliability Engineer

2 days ago


Chicago, United States TAG - The Aspen Group Full time

The Aspen Group (TAG) is one of the largest and most trusted retail healthcare business support organizations in the U.S. and has supported over 20,000 healthcare professionals and team members with close to 1,500 health and wellness offices across 48 states in four distinct categories: dental care, urgent care, medical aesthetics, and animal health. Working in partnership with independent practice owners and clinicians, the team is united by a single purpose: to prove that healthcare can be better and smarter for everyone. TAG provides a comprehensive suite of centralized business support services that power the impact of five consumer-facing businesses: Aspen Dental, ClearChoice Dental Implant Centers, WellNow Urgent Care, Chapter Aesthetic Studio, and AZPetVet. Each brand has access to a deep community of experts, tools and resources to grow their practices, and an unwavering commitment to delivering high-quality consumer healthcare experiences at scale.


To reflect our current needs and planned growth, we are pleased to offer a new opportunity to join our dedicated team as a Senior Site Reliability Engineer.


The Senior Site Reliability Engineer (SRE) & Monitoring Specialist will be responsible for ensuring the reliability, performance, and scalability of our systems. This role involves implementing and managing monitoring solutions, responding to incidents, and optimizing system performance to meet business objectives.


Responsibilities


Site Reliability Engineering:

  • Design, build, and maintain scalable and reliable systems to support our applications and services.
  • Develop and manage Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure systems meet reliability targets.
  • Drive improvements in system reliability, availability, and performance through proactive measures and automation.


Monitoring & Observability:

  • Implement and manage comprehensive monitoring and alerting solutions to ensure full visibility into system health and performance.
  • Develop and maintain dashboards and reporting tools that provide actionable insights for troubleshooting and performance optimization.
  • Evaluate and integrate new monitoring tools and technologies as needed to enhance observability.


Incident Management:

  • Lead and participate in incident response efforts, including troubleshooting, root cause analysis, and resolution.
  • Develop and maintain incident management processes to improve response times and minimize service disruptions.
  • Conduct post-incident reviews to identify areas for improvement and implement preventive measures.


Performance Optimization:

  • Analyze performance metrics and logs to identify and address bottlenecks and inefficiencies in the system.
  • Collaborate with development teams to optimize code and infrastructure for better performance and reliability.
  • Perform capacity planning to ensure systems can handle current and future loads.


Automation & Process Improvement:

  • Develop and implement automation solutions to streamline operations and reduce manual intervention.
  • Identify and drive process improvements to enhance operational efficiency and effectiveness.
  • Maintain documentation related to monitoring, incident management, and SRE best practices.


Collaboration & Communication:

  • Work closely with engineering, operations, and product teams to align on reliability and monitoring goals.
  • Communicate effectively with stakeholders, providing regular updates on system health, incidents, and performance improvements.
  • Foster a culture of collaboration and knowledge sharing within the team and across the organization.


Requirements:

  • Bachelor's degree in Computer Science or a related field.
  • At least 5 years of experience in Site Reliability Engineering or a similar role.
  • Strong proficiency in at least one programming language such as Python, Java, or Go.
  • Experience with containerization technologies such as Docker and Kubernetes.
  • Strong understanding of networking, distributed systems, and cloud infrastructure.
  • Familiarity with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, and Splunk.
  • Excellent problem-solving skills and the ability to work independently and in a team environment.
  • Experience with incident management and root cause analysis.


If you are a Senior SRE with a passion for ensuring the reliability and performance of production systems, we encourage you to apply for this exciting opportunity.



  • Chicago, IL, United States WEX, Inc. Full time

    The WEX Site Reliability Engineering (SRE) team is seeking an entry-level Site Reliability Engineer Level 1 who is passionate about learning and growing in the field of software development and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Benefits...

  • Senior AI Engineer

    2 weeks ago


    Chicago, Illinois, United States SITE Technologies Full time

    About the Role: We are seeking a highly skilled Senior AI Engineer to lead our computer vision initiatives at SITE Technologies. This individual will be responsible for designing and implementing computer vision solutions using both traditional image processing approaches and modern deep learning models. Key Responsibilities: Develop and implement computer vision...


  • Chicago, IL, United States WEX Inc. Full time

    Senior Staff Site Reliability Engineer Apply to locations: Chicago, IL; Bay Area, CA; San Francisco, CA. About the Role The WEX Site Reliability Engineering (SRE) team is seeking a Senior Staff SRE who is passionate about developing software and solutions focused on observability, incident response, reliability and performance, operational excellence, and...


  • Chicago, United States Iris Software Inc. Full time

    Greetings! One of our direct clients (Logistics) is looking to hire a Sr. SRE Engineer in Naperville, IL (Hybrid – 3-4 days onsite per week). Please find the job description below. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications. You will work closely with cross-functional teams to...


  • Chicago, United States Saxon Global Full time

    Northern Trust Site Reliability Engineer (Azure). Location: Downtown Chicago - Onsite 2 days/week - 181 W Madison St. Duration: 12+ month contract w/extension/conversion. Overview: The Goals Driven Wealth Management platform is a showcase product for Northern Trust's Wealth Management business and we must demonstrate our ability to deliver and support innovative...


  • Chicago, United States Enova Full time

    We are interested in every qualified candidate who is eligible to work in the United States. However, we are not able to sponsor visas or take over sponsorship at this time. Reports to: Technology Manager II - Tech Ops. About the Role: As a Site Reliability Engineer you will help maintain the reliability of our consumer business from a technology and...


  • Chicago, United States Cleo Full time

    Site Reliability Engineer. At Cleo, we make doing business easy! Cleo is an established software company with a start-up feel. We have awesome products, which go hand in hand with our awesome culture! We are devoted to our people and pride ourselves on creating a fun, laid-back, but fast-paced work environment. Not only do we work hard, we play hard. We...


  • Chicago, United States Enova Full time

    We are interested in every qualified candidate who is eligible to work in the United States. However, we are not able to sponsor visas or take over sponsorship at this time. Reports to: Technology Manager II - Tech Ops. About the Role: As a Site Reliability Engineer you will help maintain the reliability of our consumer business from a...


  • Chicago, United States Info Way Solutions Full time

    Site Reliability Engineer in Wealth Management. Chicago (IL) / Tempe (AZ), Onsite. Job Role: This role will be responsible for application observability, maintenance, and support; proactively identifying and implementing preventive measures; and evaluating and recommending techniques, practices, or technologies that would enhance business needs. As an SRE...


  • Chicago, Illinois, United States PROVE Full time

    About Prove: We're at the forefront of digital identity innovation, and our mission is to empower businesses to thrive in a mobile-first economy. With our cutting-edge phone-centric identity tokenization and passive cryptographic authentication solutions, we reduce friction, enhance security and privacy across all digital channels, and accelerate revenues...


  • Chicago, United States Selby Jennings Full time

    A leading Proprietary Trading firm is seeking a Site Reliability Engineer to join their team. You'll design and support the systems used by electronic trading desks leveraging tools like Linux, Kubernetes, and Python. What you'll do: Support software development teams to implement different parts of the application life cycle, i.e. application deployment,...


  • Site Reliability Engineer

    22 minutes ago


    Chicago, United States Saxon Global Full time

    Site Reliability Engineer (SRE) - (Azure, Systems background). Client: Lexis Nexis. Location: REMOTE. Rate: $62 C2C. Duration: 1 Year. Notes: Azure, Systems background experience • BSc Engineering/Computer Science or relevant experience. • Proven background working in a technical, IT related position. • Desirable: Azure Certifications • Configuration...


  • Chicago, United States Diverse Lynx Full time

    Role name: Engineer. Role Description: This role will be responsible for application observability, maintenance, and support; proactively identifying and implementing preventive measures; and evaluating and recommending techniques, practices, or technologies that would enhance business needs. As an SRE associate you will collaborate with Application...


  • Chicago, United States CloudBC Labs Full time

    POSITION: Site Reliability Engineer (GCP). LOCATION: Midwest area, travel once a month. Headquarters is in Chicago, IL, so someone local is preferred. DURATION: 3+ months. INTERVIEW TYPE: Video. VISA RESTRICTIONS: None. REQUIRED SKILLS: • SRE infrastructure • GCP platform • Dataflow, Composer, BigQuery, Cloud Functions, Pub/Sub • Talend, Collibra

  • Lead DevOps

    3 hours ago


    Chicago, United States CapB InfoteK Full time

    CapB is a global leader in IT Solutions and Managed Services. Our R&D is focused on providing cutting-edge products and solutions across Digital Transformations, from Cloud, AI/ML, IoT, and Blockchain to MDM/PIM, Supply Chain, ERP, CRM, HRMS, and Integration solutions. To meet our growing needs, we are looking for consultants who can work with us on a salaried or contract basis. We...

  • Senior Engineer

    12 hours ago


    Chicago, United States Orion Engineers, LLC Full time

    Job Description: The Senior Engineer of Transportation Site/Civil is a leader of the Transportation Site/Civil engineering practice and exercises direct or indirect supervision of personnel assigned for projects within the transportation site/civil engineering practice and technical matters. The Senior Engineer is responsible for delivering...


  • Chicago, Illinois, United States eTek IT Services, Inc. Full time

    About eTek IT Services, Inc.: eTek IT Services, Inc. is a renowned company in the financial services industry, providing holistic advice on wealth management to high net worth and ultra high net worth clients through its Goals Driven Wealth Management (GDWM) platform. Compensation: The estimated annual salary for this position is $140,000-$160,000, considering...