Site Reliability Engineer

3 weeks ago


Chicago, United States IdelSoft Full time

About the job Site Reliability Engineer

This is a 1099-based role, open only to US-based candidates with valid work authorization.

We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our dynamic team, focusing on the reliability, scalability, and robustness of our trading enclave products and service lines. The ideal candidate will possess a deep understanding of SRE principles, including incident management, DevOps practices, and software development, with specialized expertise in Dynatrace, Splunk, and Grafana. This role requires a strong background in root cause analysis, troubleshooting, and implementing resilient system designs like circuit breakers, Kubernetes deployments, and various deployment strategies (blue/green, canary, etc.).

Key Responsibilities
- Incident Management: Lead and manage incident response efforts, ensuring rapid recovery and implementing preventive measures.
- Monitoring and Observability: Utilize Dynatrace, Splunk, and Grafana to set up comprehensive frameworks for proactive issue detection.
- Performance Optimization: Analyze system performance, identify bottlenecks, and implement optimizations for reliability and efficiency.
- Deployment Strategies: Design and implement resilient deployment strategies, including blue/green deployments, canary releases, and Kubernetes rollouts.
- Root Cause Analysis: Conduct thorough analysis for incidents and issues, leading the implementation of corrective actions.
- DevOps Practices: Champion DevOps practices, streamline CI/CD pipelines, and automate workflows.
- Circuit Breaker Implementation: Design and implement circuit breaker patterns to prevent system failures and ensure high availability.

Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in an SRE or similar role, with a focus on trading systems or financial services.
- Expertise in monitoring tools (Dynatrace, Splunk, Grafana) and Kubernetes.
- Strong understanding of DevOps methodologies and tools.
- Proven track record in incident management, root cause analysis, and implementing resilient system designs.
- Experience with deployment strategies (blue/green, canary, etc.) and managing complex, distributed systems in a cloud environment.
- Excellent problem-solving, communication, and teamwork skills.
- Solid understanding of on-prem and hybrid cloud infrastructure (VMware, Linux, Windows, Azure) and container orchestration (Kubernetes, Docker).
- Working knowledge of MongoDB, Kafka, and IBM mainframe DB2 (preferred), plus familiarity with WebLogic and Java technology stacks including Spring Boot (expert-level skill not required).
- Excellent communication and leadership skills, capable of leading incident response initiatives and collaborating effectively across teams.
- Certifications in relevant technologies (Dynatrace, Splunk) are a plus.



  • Chicago, United States Allied Reliability Full time

    Overview: The Maintenance Reliability Engineer is responsible for implementing machinery and process improvements using management of change best practices while promoting values of a safe, environmentally compliant workplace, and philosophy of continuous improvement with the workforce. Responsibilities: Process Improvements and Operational Upgrading Works...


  • Chicago, Illinois, United States Motion Recruitment Full time

    A financial company is looking for senior level Site Reliability Engineers to join their team in troubleshooting applications and managing their Azure environment. This will be a contract-to-hire position that is hybrid 3 days a week in the Chicago area. Expertise in Terraform, YAML, and Azure infrastructure is mandatory. This company is a global leader in...


  • Chicago, United States JobRialto Full time

    Top 3 requirements: Ecommerce experience (think Nordstrom, Target, where you purchase a product); Java Spring Boot; Kubernetes. Pluses: Azure Kubernetes preferred. Description: Client is looking for a forward-thinking, energetic Site Reliability Engineering Manager to join our team. Client serves the ecommerce needs of leading and growing grocery retailers...


  • Chicago, Illinois, United States Balyasny Asset Management L. P Full time

    We are looking for a Senior Site Reliability Engineer who can cultivate our SRE philosophy, processes, and technologies from the ground up. As a Senior Site Reliability Engineer within the Platform group, you will lay the groundwork for our SRE infrastructure. Develop and promote our SRE philosophy, establishing best practices and processes that will be...


  • Chicago, United States R2 Global Full time

    Our client, a financial services giant, is looking for a Principal SRE professional to join the team and lead observability efforts throughout a major cloud project and beyond. This role will work onsite three times a week in the Downtown Chicago area. Key Responsibilities: Lead and mentor a team of site reliability engineers, fostering a culture of collaboration,...


  • Chicago, United States AmericanEagle.com Full time

    Americaneagle.com is a family-owned web design, development, and digital marketing agency with a passionate belief in the power of technology to positively transform business practices. Our focus is on helping customers grow and achieve success in the digital space. We cover a variety of different industries, including eCommerce, associations & nonprofits,...


  • Chicago, United States Saxon Global Full time

    Site Reliability Engineer (SRE) - (Azure, Systems background). Client: Lexis Nexis. Location: REMOTE. Rate: $62 C2C. Duration: 1 Year. Notes: Azure, Systems background experience. • BSc Engineering/Computer Science or relevant experience. • Proven background working in a technical, IT related position. • Desirable: Azure Certifications ...


  • Chicago, United States Balyasny Asset Management Full time

    We are looking for a Senior Site Reliability Engineer who can cultivate our SRE philosophy, processes, and technologies from the ground up. As a Senior Site Reliability Engineer within the Platform group, you will lay the groundwork for our SRE infrastructure. Your role will entail driving standards and fostering adoption across our technology teams, whilst...


  • Chicago, United States Oak Street Health Full time

    Company: Oak Street Health. Title: Engineer II, Site Reliability Engineer. Location: Chicago. Role Description: As a Site Reliability Engineer, you will be instrumental to the stability and performance of a new kind of platform for healthcare, one built specifically for the clinical team. From design to implementation, you will partner with our...


  • Chicago, Illinois, United States Selby Jennings Full time

    This elite trading firm is known for its passion for technology and uses some of the most cutting-edge systems in the world. The organization is dedicated to pioneering research in Mathematics, Physics, and Computer Science, leveraging these disciplines to innovate in global financial markets. Their culture emphasizes fearlessness, creativity, and...


  • Chicago, United States Deere & Company Full time

    Advanced Options 28 open jobs. Use your resume to get matched with the right job. Senior Platform Engineer (Chicago, Visa Sponsorship available) Reliability Engineer Dubuque, Iowa, United States Reliability Engineer Dubuque, Iowa, United States Senior Software Engineer - DevOps eCommerce (Chicago) SOFTWARE ENGINEER (Chicago, IL or Moline, IL - Hybrid) SAP...


  • Chicago, Illinois, United States Georgia IT Inc Full time

    Role: Machine Learning Engineer/SRE. Location: Chicago, IL or Remote. Duration: 12 Months. Rate: DOE. US Citizens, Green Cards, and GC-EAD only. No third-party C2C available for this job. We are seeking a highly skilled and motivated Machine Learning Engineer who possesses expertise in developing, deploying, and managing machine learning models. In this role, you will...

  • Reliability Engineer

    2 weeks ago


    Chicago, United States Daubert Chemical Co. Inc. Full time

    Job Description: Daubert Chemical Company is seeking an experienced Reliability Engineer at its specialty chemical production plant at 4700 S. Central in Forest View, IL. The successful candidate will be a degreed engineer with at least 6 years in process and/or manufacturing engineering operations plus extensive project management experience...