Site Reliability Engineer

4 weeks ago


Lake Forest, United States Motion Recruitment Partners, LLC Full time
Job Description
This Fortune 500 company in the Chicago area, a top 10 North American e-commerce player focused on industrial supplies, underwent a digital transformation around 2018 under a new CTO, enabling growth during the pandemic and retaining tech talent due to its competitive and challenging environment.

This SRE/Cloud Infrastructure Engineer role involves managing AWS-hosted Kubernetes platforms engineered for machine learning workloads like training, experimentation, and serving. Responsibilities include ensuring a robust and scalable infrastructure for advanced ML workloads, implementing and managing monitoring tools (Grafana, Loki, Prometheus, Thanos), and maintaining continuous deployment using GitOps practices with ArgoCD and Flux.

The engineer will build, test, configure, tune and support the Kubernetes infrastructure in the cloud, encompassing servers, storage, middleware, network, and client technologies. They will design and implement automation solutions across multiple platforms, recommending improvements for automated tools and identifying opportunities for increased orchestration adoption. The individual will work in a large, complex 24/7 e-commerce environment, gaining experience with various on-premises and cloud-based applications, as part of the Machine Learning Operations team supporting the ML platform.
Required Skills & Experience
  • 5+ years of professional experience
  • In-depth Kubernetes experience
  • ArgoCD
  • Monitoring tools like Grafana or Prometheus
Desired Skills & Experience
  • Experience supporting ML platforms
  • At least 2 years supporting GitOps
  • Flux
What You Will Be Doing
Daily Responsibilities
  • 70% Hands On
  • 30% Team Collaboration
The Offer
  • Bonus eligible
You will receive the following benefits:
  • Medical Insurance
  • Dental Benefits
  • Vision Benefits
  • Paid Time Off (PTO)
  • 401(k)

Applicants must be currently authorized to work in the US on a full-time basis now and in the future.

Site Reliability Engineer / Coding Required / Argo or Monitoring

  • Salt Lake City, United States Sorenson Communications Full time

    Come be a part of our mission and make a meaningful and positive impact with the industry-leading provider of language services for the Deaf and hard-of-hearing! Benefits Paid Vacation Time and Paid Sick Time and Paid Holidays k % match with immediate vesting Nationwide Medical Insurance plans and coverage (Medical, Dental/Orthodontia, Vision) ...


  • Salt Lake City, United States Goldman Sachs Full time

    MORE ABOUT THIS JOB: Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the...


  • Salt Lake City, United States Diverse Lynx Full time

    Role: Site Reliability Engineer Type: Full time perm Location: Salt Lake City, Utah Annual Salary: Market Standard Responsibilities: • Opportunity to drive modern Observability platform that covers Cloud-native and hybrid applications • Able to persuade stakeholders and champion effective techniques through product development • Solid understanding of...


  • Salt Lake City, United States Diverse Lynx Full time

    Role: Site Reliability Engineer Type: Full time perm Location: Salt Lake City, Utah Annual Salary: Market Standard Responsibilities: • Opportunity to drive modern Observability platform that covers Cloud-native and hybrid applications • Able to persuade stakeholders and champion effective techniques through product development • Solid understanding of...


  • Salt Lake City, United States JPMorgan Chase & Co. Full time

    There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Digital Private Markets /Aumni (A JP Morgan Chase Company), you will solve...


  • Avon Lake, United States The Lubrizol Corporation Full time

    About Lubrizol The Lubrizol Corporation, a Berkshire Hathaway company, is a specialty chemical company whose science delivers sustainable solutions to advance mobility, improve wellbeing and enhance modern life. Founded in 1928, Lubrizol owns and operates more than 100 manufacturing facilities, sales, and technical offices around the world and has about...


  • Salt Lake City, United States Technology Search Group, Inc. Full time

    About the job Site Reliability Engineer (SRE) Responsibilities Responsible for collaborating with businesspeople to have a real time understanding of business problems and expected to focus on agile methodology of development. Deliver high quality change within the deadlines. In this role, you will be responsible for coding, testing and delivering high...

  • Reliability Engineer

    1 month ago


    Forest View, United States Daubert Chemical Full time

    Daubert Chemical Company is seeking an experienced Reliability Engineer at its specialty chemical production plant at 4700 S. Central in Forest View, IL.  The successful candidate will be a degreed engineer with at least 6 years in process and/or manufacturing engineering operations plus extensive project management experience and handle the following:...


  • Lake Forest, United States Motion Recruitment Full time

    Job Description This Fortune 500 company in the Chicago area, a top 10 North American e-commerce player focused on industrial supplies, underwent a digital transformation around 2018 under a new CTO, enabling growth during the pandemic and retaining tech talent due to its competitive and challenging environment. This SRE/Cloud Infrastructure Engineer role...


  • Salt Lake City, United States Global Channel Management Full time

    Requirements for a Junior Support Engineer SRE Position on LinkedIn Skills And Qualifications A minimum of 4 years of relevant experience Proficiency in standard RPE and strong written and verbal communication skills Demonstrated expertise in Linux systems Familiarity with Python for automation tasks Experience in Incident management protocols Willingness to...

  • Reliability Specialist

    2 months ago


    Salt Lake, Utah, United States bioMerieux SA Career Site - MULTI-LINGUAL Full time

    Description Position Summary The reliability specialist is responsible for overseeing data monitoring, data reporting, investigations, and action plans for product and process performance within the instrument department. This role will also lead cross-functional technical teams through business-critical failure investigations and resolutions. Primary...


  • Salt Lake City, United States ARCS Full time

    Join our client's vibrant team in Cape Town as an Intermediate Site Reliability Engineer (SRE II). Operating mostly remotely, their team occasionally collaborates in the office for direct engagement. Your role involves achieving operational excellence through automation tooling (e.g., Terraform). You'll contribute to architectural discussions, keeping your...


  • Salt Lake City, United States Allied Reliability Full time

    Overview Engineer - Maintenance Work for a company that places the health and safety of all employees above all else Be a part of one of the largest copper mining operations in the world. Continue to build your career with opportunities for future advancement - Salt Lake City About the role We are looking for an  Engineer -...

  • Maintenance Engineer

    1 month ago


    Salt Lake City, United States Allied Reliability Full time

    Overview: Engineer - Maintenance Work for a company that places the health and safety of all employees above all else Be a part of one of the largest copper mining operations in the world. Continue to build your career with opportunities for future advancement - Salt Lake City About the role We are looking for an Engineer - Maintenance to support the...

  • Maintenance Engineer

    2 months ago


    Salt Lake City, United States Allied Reliability Full time

    Overview Engineer - Maintenance Work for a company that places the health and safety of all employees above all else Be a part of one of the largest copper mining operations in the world. Continue to build your career with opportunities for future advancement - Salt Lake City About the role We are looking for an  Engineer -...