Site Reliability Engineer

4 weeks ago


Chicago, United States Matlen Silver Full time

Compensation: $70 - $75/Hour

Hybrid: 2 days onsite in Chicago, Illinois

Domain: Retail/Supply Chain

Job Title: Site Reliability Engineer


Position Summary

As a Site Reliability Engineer/DevOps Engineer, you will be responsible for ensuring the availability, performance, and reliability of Fulfillment Technology solutions that support our client's omni-channel strategy. You will work closely with the development, testing, and operations teams to design, implement, and maintain scalable, reliable, and efficient solutions for the production environment. You will troubleshoot and resolve issues in production systems using monitoring, logging, alerting, automation, and incident management. You will also contribute to the continuous improvement of DevOps practices and processes, such as CI/CD, configuration management, infrastructure as code, and cloud computing. The role calls for a strong background in software engineering, system administration, networking, and cloud technologies, along with excellent communication and collaboration skills and a passion for learning new technologies and solving complex problems.

Minimum Position Qualifications

  • Bachelor’s Degree in Computer Science/Engineering or related field
  • 4+ years of experience in cloud SRE, DevOps, infrastructure, or a related field
  • 4+ years of experience working with databases, web applications and microservices, event-driven applications, messaging systems, REST APIs and integrations, cloud, support tools, observability, and containerization technologies
  • Knowledge of Java, Spring Boot, microservices, Kafka, Cassandra, and SQL Server
  • Proficiency in scripting languages such as Python or shell
  • 1 year of experience managing system observability tools (Dynatrace, ELK, PagerDuty, Datadog, Azure Monitor, Grafana, etc.)
  • Hands-on experience with GitHub Actions for CI/CD automation
  • Knowledge of Linux architecture, security, administration, performance monitoring/tuning, troubleshooting, and production operations
  • Demonstrated skill in working in an Agile environment
  • Demonstrated skill in working with multi-location global teams
  • Proven ability to think and contribute at the strategic level
  • Demonstrated knowledge of eCommerce, Fulfillment, or Retail Technology solutions
  • Demonstrated written, oral and presentation/public speaking communication skills

Desired Previous Experience/Education

  • Master’s Degree or PhD in computer science, information systems, or related field
  • 4+ years of experience in designing/working in high volume eCommerce applications
  • 2+ years of experience configuring and managing cloud infrastructure (Azure, AWS, GCP)
  • 1 year of experience with technologies such as Apache Kafka, Azure Cosmos DB, Apache Cassandra, Ansible, Terraform, Docker and Kubernetes
  • Experience with Nginx, HAProxy, Squid
  • Experience with CI/CD pipelines using tools such as Jenkins, Spinnaker, Azure DevOps, TeamCity, etc.
  • Proficient in implementing and managing RoyalTS or similar cross-platform remote management solutions, ensuring secure and efficient remote access and system administration across diverse environments

Key Responsibilities

Essential Job Functions

  • Partner and collaborate with application engineering, observability, and other support teams within our client's organization, as well as our business operation partners and third parties (as appropriate), to prioritize, address, and drive the resolution of issues and incidents that impact customer pickup or delivery domains
  • Drive root-cause analysis of critical business and production issues to prevent recurrence, and review and approve potential solutions
  • Lead Major Incident calls impacting the Pickup Fulfillment domain and provide clear, timely updates on the status of service restoration to key stakeholders
  • Work with the engineering teams to continuously implement and improve fast, reliable build environments
  • Increase automation to improve efficiency and quality
  • Ensure traceability, observability, and retrievability of system behavior
  • Build logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization in cloud, on-prem and store environments
  • Craft solid and clearly explained designs, playbooks, and documentation
  • Participate in an off-hours on-call rotation, and perform periodic off-hours work during maintenance windows


