Aumni - Site Reliability Engineer III - MLOPS

4 weeks ago


Salt Lake City, United States JPMorgan Chase & Co. Full time

There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.

As a Site Reliability Engineer III at JPMorgan Chase within Digital Private Markets / Aumni (a JPMorgan Chase company), you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform.

As an MLOps engineer, you will apply the same approach to the models produced by our data science teams and their associated infrastructure: configuring, maintaining, monitoring, and optimizing them through code and cloud infrastructure, and sharing your knowledge of end-to-end operations, availability, reliability, and scalability in the AI/ML space.

Job responsibilities

  • Guides and assists others in designing and deploying new AI/ML models in the cloud, gaining consensus from peers where appropriate
  • Designs and implements automated continuous integration and continuous delivery pipelines for the Data Science teams to develop and train AI/ML models
  • Writes and deploys infrastructure as code for the models and pipelines you support
  • Collaborates with technical experts, key stakeholders, and team members to resolve complex technical problems
  • Understands the importance of monitoring and observability in the AI/ML space, including service level indicators, and utilizes service level objectives
  • Proactively resolves issues before they impact internal and external stakeholders of deployed models
  • Supports the adoption of MLOps best practices within your team

Required qualifications, capabilities, and skills

  • Formal training or certification on site reliability engineering concepts and 3+ years applied experience
  • Understanding of MLOps culture and principles and familiarity with how to implement associated concepts at scale
  • Domain knowledge of machine learning applications and technical processes within the AWS ecosystem
  • Experience with infrastructure as code tooling such as Terraform or CloudFormation
  • Experience with containers and container orchestration such as ECS, Kubernetes, and Docker
  • Knowledge of continuous integration and continuous delivery tools like Jenkins, GitLab, or GitHub Actions
  • Proficiency in the following programming languages: Python, Bash
  • Hands-on knowledge of Linux and networking internals
  • Understanding of the different roles served by data engineers, data scientists, machine learning engineers, and system architects, and how MLOps contributes to each of these workstreams
  • Ability to identify new technologies and relevant solutions to ensure design constraints are met by the Data Science and Machine Learning teams

Preferred qualifications, capabilities, and skills

  • Experience with model training and deployment pipelines, including managing scoring endpoints
  • Familiarity with observability concepts and telemetry collection using tools such as Datadog, Grafana, Prometheus, Splunk, and others
  • Understanding of data engineering platforms such as Databricks or Snowflake, and machine learning platforms such as AWS SageMaker
  • Comfortable troubleshooting common containerization technologies and issues
  • Ability to proactively recognize roadblocks and demonstrated interest in learning technology that facilitates innovation


