Intermediate Site Reliability Engineer

2 weeks ago


Salt Lake City, United States ARCS Full time

Join our client's vibrant team in Cape Town as an Intermediate Site Reliability Engineer (SRE II). Operating mostly remotely, their team occasionally collaborates in the office for direct engagement. Your role involves achieving operational excellence through automation tooling (e.g., Terraform). You'll contribute to architectural discussions, keeping your skills current for impactful contributions.

Their Infrastructure & Software Stack:

Kubernetes running on Google Kubernetes Engine (GKE)

Prometheus, Grafana, Elastic, Kibana

CI/CD with Jenkins

Kong API Gateway

LogDNA

Falco

MongoDB Atlas

Microservice Architecture with Event Sourcing and CQRS

Containers running Kotlin, Python, JavaScript (and a bit of Golang)

Your responsibilities will include:

Being part of their security incident response team

Managing their identity platform and enabling enterprise user and system authentication and authorization using OAuth2

Working effectively with the development team to plan and deploy required infrastructure changes or new capabilities ahead of time and unblocking the development team when unforeseen infrastructure blockers arise

Performing high-quality, ego-free code reviews and driving visibility, testing, and improvement initiatives

Writing operational tooling to automate otherwise manual processes (e.g., Golang, Bash)

Writing, testing, and executing change control plans for production changes with an eye for detail to spot potential issues

Debugging production issues

Being part of their on-call rotation. When on-call, you will work on repaying technical debt and deal with operational incidents as and when they occur. This will require you to have or acquire a good general knowledge of production operations for technical support.




  • Salt Lake City, United States Sorenson Communications Full time

    Come be a part of our mission and make a meaningful and positive impact with the industry-leading provider of language services for the Deaf and hard-of-hearing! Benefits Paid Vacation Time and Paid Sick Time and Paid Holidays k % match with immediate vesting Nationwide Medical Insurance plans and coverage (Medical, Dental/Orthodontia, Vision) ...


  • Jersey City, United States Veterans Sourcing Group LLC Full time

    Site Reliability Engineer (AWS) (SRE) Jersey City, NJ- onsite 3 days/ week 12 month minimum contract w/ possible full time conversion Roles And Responsibilities Design, code, test, and deliver software to automate manual operational work Troubleshoot priority incidents, facilitate blameless post-mortems, and ensure permanent closure of incidents Engage with...


  • Salt Lake City, United States Goldman Sachs Full time

    MORE ABOUT THIS JOB: Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the...


  • Salt Lake City, United States Diverse Lynx Full time

    Role: Site Reliability Engineer Type: Full time perm Location: Salt Lake City, Utah Annual Salary: Market Standard Responsibilities: Opportunity to drive modern Observability platform that covers Cloud-native and hybrid applications; Able to persuade stakeholders and champion effective techniques through product development; Solid understanding of...


  • Salt Lake City, United States JPMorgan Chase & Co. Full time

    There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Digital Private Markets /Aumni (A JP Morgan Chase Company), you will solve...


  • Salt Lake City, United States Technology Search Group, Inc. Full time

    About the job Site Reliability Engineer (SRE) Responsibilities Responsible for collaborating with businesspeople to have a real time understanding of business problems and expected to focus on agile methodology of development. Deliver high quality change within the deadlines. In this role, you will be responsible for coding, testing and delivering high...


  • Salt Lake City, United States Global Channel Management Full time

    Requirements for a Junior Support Engineer SRE Position on LinkedIn Skills And Qualifications A minimum of 4 years of relevant experience Proficiency in standard RPE and strong written and verbal communication skills Demonstrated expertise in Linux systems Familiarity with Python for automation tasks Experience in Incident management protocols Willingness to...


  • Jersey City, United States SelektIT Full time

    Job Description Position: Site Reliability Engineer Company Overview: Purelogics is a fast-growing technology company that provides innovative solutions to businesses of all sizes. Our team consists of highly skilled and dedicated professionals who are passionate about delivering top-notch services to our clients. We are currently looking for a Site...

  • Maintenance Engineer

    1 month ago


    Salt Lake City, United States Allied Reliability Full time

    Overview Engineer - Maintenance Work for a company that places the health and safety of all employees above all else Be a part of one of the largest copper mining operations in the world. Continue to build your career with opportunities for future advancement - Salt Lake City About the role We are looking for an  Engineer -...

  • Maintenance Engineer

    4 weeks ago


    Salt Lake City, United States Allied Reliability Full time

    Overview: Engineer - Maintenance Work for a company that places the health and safety of all employees above all else Be a part of one of the largest copper mining operations in the world. Continue to build your career with opportunities for future advancement - Salt Lake City About the role We are looking for an Engineer - Maintenance to support the...

  • Reliability Specialist

    2 months ago


    Salt Lake City, United States bioMerieux SA Career Site - MULTI-LINGUAL Full time

    Description Position Summary: The reliability specialist is responsible for overseeing data monitoring, data reporting, investigations, and action plans for product and process performance within the instrument department. This role will also lead cross-functional technical teams through business-critical failure investigations and resolutions. Primary...


  • Salt Lake City, United States Battelle Applied Solutions, LLC Full time

    Requisition Id 11976 Overview: Are you looking for a way to use your hard-earned SRE skills in a more ambitious environment where you can also help protect national security? The National Center for Computational Sciences (NCCS) at Oak Ridge National Lab (ORNL), which hosts several of the world's most powerful computer systems, is seeking highly qualified...


  • Salt Lake City, United States Battelle Applied Solutions, LLC Full time

    Requisition Id 11979 Overview: The National Center for Computational Sciences (NCCS) at Oak Ridge National Lab (ORNL), which hosts several of the world's most powerful computer systems, is seeking highly qualified individuals to play a key role in improving the security, performance, and reliability of the NCCS computing infrastructure which supports...


  • Salt Lake, Utah, United States bioMerieux SA Career Site - MULTI-LINGUAL Full time

    Description Position Summary: The reliability specialist is responsible for overseeing data monitoring, data reporting, investigations, and action plans for product and process performance within the instrument department. This role will also lead cross-functional technical teams through business-critical failure investigations and resolutions. Primary...


  • Salt Lake City, United States Oracle Full time

    Oracle Senior Site Reliability Developer, Salt Lake City, Utah. Customers rely on Oracle Cloud Infrastructure (OCI) to power their business as they tackle some of the world’s biggest challenges. We’re looking for Senior Site Reliability Developers/Engineers who would be responsible for Advanced Operations (AO) and critical issues of production...


  • Salt Lake City, United States Big West Oil Full time

    Experienced Fixed Equipment Reliability engineer to develop and support a developing reliability system. Position is responsible for daily support activities for the refinery asset, as well as developing philosophies, work processes, and special emphasis programs. Responsible for monitoring and measuring reliability KPI's and developing action plans to...


  • Arizona City, United States Openlane Full time

    Job Description: Site Reliability Engineer (f.k.a. Platform Engineer) for CarsArrive Network, Inc. located in Mesa, AZ. Provide daily, hands-on assistance to maintain and advance the build process to ensure reliability and optimum integration with Continuous Integration/Continuous Delivery (CI/CD) and Release Management. Work with the development,...


  • Oklahoma City, United States BJ's Wholesale Club Full time

    Lead Site Reliability Engineer. Location: BJ's Club Support Center, Marlborough, MA #5997. Time type: Full time. Posted 2 days ago. Job requisition ID: R147855. Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight...


  • Foster City, United States Bayone Full time

    As a Site Reliability Engineer, you will: Keep a large production service up and running including: Host OS upgrades Docker image upgrades SSL certificate upgrades Define and refine metrics to track service health and performance. Automate software releases and service failovers. Requirements Bachelor's degree in Engineering, Mathematics or...


  • Jersey City, United States Trigyn Technologies Full time

    Immediate long-term contract to hire opportunity for Sr. Site Reliability Support Engineer with direct client in Jersey City. Trigyn’s financial services client has an immediate need for a Site Reliability Engineer in Jersey City. This is a long-term contract assignment, that could....