Cloud Site Reliability Engineer
1 week ago
Work Authorization: USC, GC, GC EAD only
Roles & Responsibilities
Role: Cloud Site Reliability Engineer (SRE)
- 5+ years of hands-on experience supporting Kubernetes/OpenShift/RKE/EKS container platforms.
- Experience with Python, Ansible, Golang, and shell scripting.
- Kubernetes/OpenShift/Terraform certifications are a plus.
- Strong experience in major services related to Compute, Storage, Network and Security.
- Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud-native tools like Azure Monitor and Log Analytics.
- Strong understanding of, and experience working with, complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions.
- Advanced knowledge of Linux OS, DNS, DHCP, Kerberos and Windows Authentication.
- Experience with CI/CD tools (Git, Jenkins) and the GitOps model.
- Excellent understanding of Linux/Windows operating systems administration.
- Experience in container security and vulnerability remediation.
- Systematic problem-solving approach, sense of ownership and drive.
- Ability to juggle competing priorities and adapt to changes in project scope.
- Excellent interpersonal, organizational and communication (written, verbal, and presentation) skills are a must.
- Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
- Responsible for the reliability and support of the container platform on-prem and in external clouds (Azure/AWS/Google).
- Monitor and troubleshoot performance, connectivity, and security issues in the container platform (OpenShift), Rancher (RKE), and Azure (AKS) environments.
- Perform deep dives into systemic and latent reliability issues, and handle incident management and problem management.
- Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues.
- Perform blameless RCA and partner with engineering and operations teams across the organization to roll out fixes.
- Own application onboarding and provide troubleshooting support throughout the lifecycle of the applications on the container platform.
- Identify and drive opportunities to improve automation to reduce toil and improve operational excellence.
- Partner with risk and compliance teams to bring visibility, implement the right controls, and remediate vulnerabilities.
- Ensure resiliency during implementation and identify/fix resiliency problems by collaborating with engineering teams.
- Be a key stakeholder in the design of cloud services and work with architecture, engineering, and product teams.
- Participate in 24x7 on-call coverage under a follow-the-sun model.