Site Reliability Engineer

3 weeks ago


Irving, Texas, United States Publicis Groupe Full time
Job Description

Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally enabled state. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with agile engineering and problem-solving creativity.

Responsibilities:
  • Automation & Scripting: Use tools like Ansible and Python to automate provisioning, monitoring, and scaling tasks.
  • Observability & Monitoring: Set up Grafana dashboards and Prometheus alerts to track service health, uptime, and performance metrics across platforms.
  • Infrastructure Management: Deploy and manage applications on OpenShift or other Kubernetes-based platforms, ensuring efficient application lifecycle management.
  • Platform & Service Monitoring: Implement and automate monitoring for both cloud and on-prem environments, ensuring compliance with SLA requirements.
  • Capacity Planning & Resource Management: Monitor and optimize GPU and CPU utilization, ensuring resources are allocated efficiently across workloads.
  • Collaboration & Sprint Planning: Participate in Agile/Scrum sprint planning, collaborating with other teams to ensure tasks are delivered on time and aligned with service-level objectives.
  • Process Automation: Automate manual processes such as resource requests, tenant onboarding, and lifecycle management for AI/ML platforms and other workloads.
Qualifications:
  • Strong experience with automation tools like Ansible and Python scripting for infrastructure management.
  • Proficiency in Grafana and Prometheus for monitoring and setting up alerting mechanisms.
  • Hands-on experience managing applications in OpenShift or other Kubernetes-based platforms.
  • Ability to automate service monitoring and infrastructure scaling in both cloud and on-prem environments, ensuring SLA compliance.
  • Experience with infrastructure management for cloud (GCP) and hybrid environments.
  • Experience with infrastructure as code (IaC) tools (Terraform).
Additional Information:
  • Flexible vacation policy; time is not limited, allocated, or accrued
  • 16 paid holidays throughout the year
  • Generous parental leave and new parent transition program
  • Tuition reimbursement
  • Corporate gift matching program


  • Irving, Texas, United States Resource Informatics Group Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based systems. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using...


  • Irving, Texas, United States Resource Informatics Group Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services and applications. Key Responsibilities: Develop and maintain comprehensive...


  • Irving, Texas, United States PTR Global Full time

    Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at PTR Global. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure solutions. Collaborate with cross-functional teams to identify and resolve performance issues. Develop and maintain monitoring and observability tools using...


  • Irving, Texas, United States Tata Consultancy Services Full time

    Job Description. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Tata Consultancy Services. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our ecommerce platform. Key Responsibilities: Design and implement scalable and reliable systems to manage...


  • Irving, Texas, United States Creospan Full time

    Job Title: Site Reliability Engineer. We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems. Key Responsibilities: Develop and maintain scripts to automate tasks and processes...


  • Irving, Texas, United States Creospan Full time

    Job Title: Site Reliability Engineer. Job Summary: We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems. This role requires advanced technical skills, a proactive approach to...


  • Irving, Texas, United States Citigroup Inc Full time

    About Citi: Citi, a leading global bank, serves over 200 million customers worldwide, operating in more than 160 countries and jurisdictions. As a bank with a strong presence in the global market, Citi provides a wide range of financial products and services to consumers, corporations, governments, and institutions. Job Overview: The Site Reliability Engineer...


  • Irving, Texas, United States Diverse Lynx Full time

    Job Description. Job Title: Site Reliability Engineer. Company: Diverse Lynx LLC. Job Type: Full-time. Location: Remote. About Us: Diverse Lynx LLC is an Equal Employment Opportunity employer. We promote and support a diverse workforce across all levels in the company. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team. The successful...


  • Irving, Texas, United States DEFENDERS Full time

    Job Summary: DEFENDERS is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our software applications. You will work closely with our development and infrastructure teams to identify and resolve issues, and implement...


  • Irving, Texas, United States Citigroup Inc Full time

    Job Description: As a Site Reliability Engineer at Citigroup Inc., you will play a critical role in ensuring the stability, efficiency, and observability of our Global Wholesale Lending Technology (WLT) environment. You will work closely with technology leads, architects, engineers, and other stakeholders to identify and resolve production incidents, develop...


  • Irving, Texas, United States Creospan Full time

    Job Title: Site Reliability Engineer. We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team at Creospan. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems. Key Responsibilities: Automation and Scripting: Develop and maintain scripts...


  • Irving, Texas, United States Creospan Inc. Full time

    Job Title: Site Reliability Engineer. At Creospan Inc., we are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems. Key Responsibilities: Automation and Scripting: Develop and maintain...


  • Irving, Texas, United States Wells Fargo Full time

    About this Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Wells Fargo. As a Senior Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the reliability and scalability of our cloud infrastructure. You will work closely with cross-functional teams to identify and resolve...


  • Irving, Texas, United States Citigroup Inc Full time

    Job Description: Citigroup Inc. is seeking a highly skilled Site Reliability Engineer to join our Global Wholesale Lending Technology team. As a key member of our technology organization, you will play a critical role in ensuring the stability, efficiency, and observability of our technology environment. Responsibilities: Partner with technology leads and...


  • Irving, Texas, United States Publicis Groupe Full time

    Job Title: Site Reliability Engineer. Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally enabled state, both in the way they work and the way they serve their customers. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with...


  • Irving, Texas, United States Foot Locker Full time

    Job Title: Sr. Site Reliability Engineering Manager. We are seeking a highly skilled and experienced Sr. Site Reliability Engineering Manager to join our team at Foot Locker, Inc. This is a unique opportunity to lead a talented team of automation engineers and drive innovative solutions at the intersection of technology and sneaker culture. Job Summary: The Sr....


  • Irving, Texas, United States Foot Locker Full time

    Job Title: Sr. Site Reliability Engineering Manager. Join Foot Locker, Inc. as a Sr. Site Reliability Engineering Manager and be part of a dynamic team that drives innovation and excellence in the sneaker industry. About the Role: We are seeking a highly skilled and experienced Site Reliability Engineering Manager to lead our IT Tools and Observability Services...