Cloud Site Reliability Engineer
1 week ago
Work Authorization: USC, GC, GC EAD ONLY
Roles & Responsibilities
Role: Cloud Site Reliability Engineer (SRE)
- 5+ years of hands-on experience supporting Kubernetes / OpenShift / RKE / EKS container platforms.
- Experience with Python, Ansible, Golang, and shell scripting.
- Kubernetes / OpenShift / Terraform certifications are a plus.
- Strong experience in major services related to Compute, Storage, Network and Security.
- Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud-native tools like Azure Monitor and Log Analytics.
- Strong understanding of, and experience working with, complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions.
- Advanced knowledge of Linux OS, DNS, DHCP, Kerberos and Windows Authentication.
- Experience with CI/CD tools (Git, Jenkins) and the GitOps model.
- Excellent understanding of Linux / Windows operating system administration.
- Experience in container security and vulnerability remediation.
- Systematic problem-solving approach, sense of ownership and drive.
- Ability to juggle competing priorities and adapt to changes in project scope.
- Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
- Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
- Responsible for reliability and support of the container platform on-prem and in external clouds (Azure / AWS / Google).
- Monitor and troubleshoot performance, connectivity, and security issues in the container platform (OpenShift), Rancher (RKE), and Azure (AKS) environments.
- Perform deep dives into systemic and latent reliability issues; handle incident management and problem management.
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Perform blameless RCA and partner with engineering and operations teams across the organization to roll out fixes.
- Own application onboarding and provide troubleshooting support throughout the lifecycle of applications on the container platform.
- Identify and drive opportunities to improve automation, reduce toil, and improve operational excellence.
- Partner with risk and compliance teams to bring visibility, implement the right controls, and remediate vulnerabilities.
- Ensure resiliency during implementation and identify/fix resiliency problems by collaborating with engineering teams.
- Be a key stakeholder in the design of cloud services and work with architecture, engineering, and product teams.
- Participate in 24x7 on-call coverage under a follow-the-sun model.
-
Cloud Site Reliability Engineer
2 weeks ago
Atlanta, United States Tata Consultancy Services Full time
Cloud Site Reliability Engineer. Work Authorization: USC, GC, GC EAD ONLY. Roles & Responsibilities. Role: Cloud Site Reliability Engineer (SRE). Minimum 5+ years of hands-on experience supporting Kubernetes / OpenShift / RKE / EKS container platform. Experience with Python, Ansible, Golang, and shell scripting. Kubernetes / OpenShift / Terraform certifications are a...
-
Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Now100 Full time
Job Title: Site Reliability Engineer - Cloud Infrastructure Specialist. Company Overview: Now100 is a leading provider of technology solutions, committed to delivering exceptional results for our clients. We match thoroughly vetted resources to contract, contract-to-hire, and permanent positions in all industries. Job Description: We are seeking a highly...
-
Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Navtech Full time
Job Title: Site Reliability Engineer. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Navtech. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and performance of our production systems. Key Responsibilities: Provide L4 technical support for production 24x7. Design and...
-
Site Reliability Engineer
1 month ago
Atlanta, United States Softworld, a Kelly Company Full time
The Cloud Site Reliability Engineer (SRE) works closely with the cloud development team, IT operations team, and business partners to streamline and implement enhanced monitoring and alerting capability across infrastructure and application layers. By leveraging automation tools, SREs address and resolve issues, minimizing manual workload and enhancing system...
-
Senior Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Jonas Software UK Full time
About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Jonas Software UK. As a key member of our technical operations team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly...
-
Senior Cloud Reliability Engineer
1 month ago
Atlanta, Georgia, United States IRIS Consulting Corporation Full time
Job Description: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at IRIS Consulting Corporation. As a key member of our Retail, Site Reliability Engineering team, you will be responsible for establishing and maintaining the reliability of our cloud-based infrastructure and applications. Key Responsibilities: Design and implement...
-
Site Reliability Engineer
2 weeks ago
Atlanta, United States Datum Technologies Group Full time
Opening for SRE – Atlanta, GA – Hybrid. Site Reliability Engineer. Long-term contract. Atlanta, GA. Qualifications: Deep understanding of AWS services (Lambda, S3, SQS, IAM, Route 53, etc.) and proficiency in infrastructure as code (e.g., Terraform, CloudFormation). Hands-on experience with monitoring tools such as CloudWatch, Sumo Logic, Dynatrace, Grafana,...
-
Site Reliability Engineer
1 month ago
Atlanta, Georgia, United States Kobiton Full time
About the Role: Kobiton is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and scalability of our systems and services. You will work closely with development and operations teams to build and maintain robust infrastructure, automate...
-
Senior Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Microsoft Corporation Full time
We are seeking a highly skilled Senior Site Reliability Engineer to join our Windows Servicing and Delivery team at Microsoft Corporation. The ideal candidate will have a strong background in software engineering, network engineering, or systems administration, with a proven track record of delivering high-quality solutions that meet customer needs. As a...
-
Senior Cloud Reliability Engineer
1 month ago
Atlanta, Georgia, United States Duck Creek Technologies Full time
Job Title: Senior Cloud Reliability Engineer. About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our team at Duck Creek Technologies. As a key member of our engineering organization, you will be responsible for designing and implementing scalable, secure, and highly available cloud solutions. Your expertise in cloud...
-
Senior Site Reliability Engineering Manager
4 weeks ago
Atlanta, Georgia, United States Microsoft Corporation Full time
About the Role: Microsoft Corporation is seeking a highly skilled Senior Site Reliability Engineering Manager to lead the delivery of critical features in Office 365 government cloud offerings. As a key member of the Office 365 team, you will be responsible for combining your passion for quality, reliability, and creativity to drive evolution in the continuous...
-
Senior Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Pyramid Consulting Full time
Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Pyramid Consulting, Inc. This is a contract opportunity with long-term potential and is located in Atlanta, GA. Key Responsibilities: Design and implement SLOs / SLIs / error budgets and manage reliability for infrastructure and applications. Proven experience with...
-
Senior Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Pyramid Consulting Full time
Pyramid Consulting is seeking a talented Senior Site Reliability Engineer to join our team. This is a contract opportunity with long-term potential and is located in a major US city. The successful candidate will have a strong background in setting SLOs / SLIs / error budgets and managing reliability for infrastructure and applications. Key...
-
Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States Jobs for Humanity Full time
About the Role: FIS is seeking a Site Reliability Engineer to join our innovative Platform Service Delivery team. As a key member of our team, you will be responsible for ensuring high stability, reduced service downtime, and improved quality of service for FIS clients. Key Responsibilities: Participate in day-to-day activities of operating the payment...
-
Site Reliability Engineer
4 weeks ago
Atlanta, Georgia, United States JobRialto Full time
Job Summary: The Site Reliability Engineer is responsible for ensuring the availability, scalability, and performance of critical services and systems. This role requires expertise in OpenShift and CloudFormation, along with a deep understanding of site reliability principles, container technologies, monitoring tools, and automation. Key Responsibilities: Ensure...