Site Reliability Engineer
3 weeks ago
Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally enabled state. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with agile engineering and problem-solving creativity.
Responsibilities:
- Automation & Scripting: Use tools like Ansible and Python to automate provisioning, monitoring, and scaling tasks.
- Observability & Monitoring: Set up Grafana dashboards and Prometheus alerts to track service health, uptime, and performance metrics across platforms.
- Infrastructure Management: Deploy and manage applications on OpenShift or other Kubernetes-based platforms, ensuring efficient application lifecycle management.
- Platform & Service Monitoring: Implement and automate monitoring for both cloud and on-prem environments, ensuring compliance with SLA requirements.
- Capacity Planning & Resource Management: Monitor and optimize GPU and CPU utilization, ensuring resources are allocated efficiently across workloads.
- Collaboration & Sprint Planning: Participate in Agile/Scrum sprint planning, collaborating with other teams to ensure tasks are delivered on time and aligned with service-level objectives.
- Process Automation: Automate manual processes such as resource requests, tenant onboarding, and lifecycle management for AI/ML platforms and other workloads.
Qualifications:
- Strong experience with automation tools like Ansible and Python scripting for infrastructure management.
- Proficiency in Grafana and Prometheus for monitoring and setting up alerting mechanisms.
- Hands-on experience managing applications in OpenShift or other Kubernetes-based platforms.
- Ability to automate service monitoring and infrastructure scaling in both cloud and on-prem environments, ensuring SLA compliance.
- Experience with infrastructure management for cloud (GCP) and hybrid environments.
- Experience with infrastructure as code (IaC) tools (Terraform).
Benefits:
- Flexible vacation policy; time is not limited, allocated, or accrued
- 16 paid holidays throughout the year
- Generous parental leave and new parent transition program
- Tuition reimbursement
- Corporate gift matching program
-
Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Resource Informatics Group Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based systems.
Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using...
-
Site Reliability Engineer
5 days ago
Irving, Texas, United States Resource Informatics Group Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based systems.
Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using...
-
Site Reliability Engineer
1 month ago
Irving, Texas, United States Resource Informatics Group Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services and applications.
Key Responsibilities: Develop and maintain comprehensive...
-
Site Reliability Engineer
2 weeks ago
Irving, Texas, United States Resource Informatics Group Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Resource Informatics Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services and applications.
Key Responsibilities: Develop and maintain comprehensive...
-
Site Reliability Engineer
2 weeks ago
Irving, Texas, United States PTR Global Full time
Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at PTR Global.
Key Responsibilities:
- Design and implement scalable and reliable cloud infrastructure solutions.
- Collaborate with cross-functional teams to identify and resolve performance issues.
- Develop and maintain monitoring and observability tools using...
-
Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Tata Consultancy Services Full time
Job Description
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Tata Consultancy Services. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our ecommerce platform.
Key Responsibilities: Design and implement scalable and reliable systems to manage...
-
Site Reliability Engineer
4 weeks ago
Irving, Texas, United States Creospan Full time
Job Title: Site Reliability Engineer
We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems.
Key Responsibilities: Develop and maintain scripts to automate tasks and processes...
-
Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Creospan Full time
Job Title: Site Reliability Engineer
Job Summary: We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems. This role requires advanced technical skills, a proactive approach to...
-
Site Reliability Engineer
5 days ago
Irving, Texas, United States Citigroup Inc Full time
About Citi
Citi, a leading global bank, serves over 200 million customers worldwide, operating in more than 160 countries and jurisdictions. As a bank with a strong presence in the global market, Citi provides a wide range of financial products and services to consumers, corporations, governments, and institutions.
Job Overview
The Site Reliability Engineer...
-
Site Reliability Engineer
5 days ago
Irving, Texas, United States Diverse Lynx Full time
Job Description
Job Title: Site Reliability Engineer
Company: Diverse Lynx LLC
Job Type: Full-time
Location: Remote
About Us: Diverse Lynx LLC is an Equal Employment Opportunity employer. We promote and support a diverse workforce across all levels in the company.
Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our team. The successful...
-
Senior Site Reliability Engineer
3 weeks ago
Irving, Texas, United States DEFENDERS Full time
Job Summary: DEFENDERS is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our software applications. You will work closely with our development and infrastructure teams to identify and resolve issues, and implement...
-
Site Reliability Engineer
2 weeks ago
Irving, Texas, United States Citigroup Inc Full time
Job Description
As a Site Reliability Engineer at Citigroup Inc., you will play a critical role in ensuring the stability, efficiency, and observability of our Global Wholesale Lending Technology (WLT) environment. You will work closely with technology leads, architects, engineers, and other stakeholders to identify and resolve production incidents, develop...
-
Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Creospan Full time
Job Title: Site Reliability Engineer
We are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team at Creospan. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems.
Key Responsibilities:
Automation and Scripting: Develop and maintain scripts...
-
Site Reliability Engineer
1 month ago
Irving, Texas, United States Creospan Inc. Full time
Job Title: Site Reliability Engineer
At Creospan Inc., we are seeking a highly experienced Site Reliability Engineer to join our Application Production Support team. The ideal candidate will have a strong background in ensuring the reliability, performance, and scalability of complex systems.
Key Responsibilities:
Automation and Scripting: Develop and maintain...
-
Senior Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Wells Fargo Full time
About this Role
We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Wells Fargo. As a Senior Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the reliability and scalability of our cloud infrastructure. You will work closely with cross-functional teams to identify and resolve...
-
Site Reliability Engineer
5 days ago
Irving, Texas, United States Citigroup Inc Full time
Job Description: Citigroup Inc. is seeking a highly skilled Site Reliability Engineer to join our Global Wholesale Lending Technology team. As a key member of our technology organization, you will play a critical role in ensuring the stability, efficiency, and observability of our technology environment.
Responsibilities: Partner with technology leads and...
-
Site Reliability Engineer
3 weeks ago
Irving, Texas, United States Publicis Groupe Full time
Job Title: Site Reliability Engineer
Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally enabled state, both in the way they work and the way they serve their customers.
We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with...
-
Site Reliability Engineering Manager
5 days ago
Irving, Texas, United States Foot Locker Full time
Job Title: Sr. Site Reliability Engineering Manager
We are seeking a highly skilled and experienced Sr. Site Reliability Engineering Manager to join our team at Foot Locker, Inc. This is a unique opportunity to lead a talented team of automation engineers and drive innovative solutions at the intersection of technology and sneaker culture.
Job Summary: The Sr....
-
Site Reliability Engineering Manager
1 week ago
Irving, Texas, United States Foot Locker Full time
Job Title: Sr. Site Reliability Engineering Manager
Join Foot Locker, Inc. as a Sr. Site Reliability Engineering Manager and be part of a dynamic team that drives innovation and excellence in the sneaker industry.
About the Role: We are seeking a highly skilled and experienced Site Reliability Engineering Manager to lead our IT Tools and Observability Services...