Service Reliability Engineer

3 months ago


Fort Mill, United States Coforge Full time

Job Title: Service Reliability Engineer

Experience: 8-10 Years

Skills: SRE, CI/CD, IAC

Location: Fort Mill, SC


We at Coforge are hiring a Service Reliability Engineer with the following skill set:


  • Proficiency in Core SRE Principles: Expertise in essential SRE concepts such as CUJs, SLOs, SLIs, and error budgeting based on NFRs, and the ability to apply these principles to ensure service reliability, meet business objectives, and drive continuous improvement.
  • Experience Reducing Toil: Identifying manual, repetitive tasks within the Software Development Life Cycle (SDLC) or IT operations and implementing automation to reduce toil. Ability to streamline processes, enhance productivity, and free up resources for more strategic initiatives through automation and process improvement.
  • Comprehensive CI/CD Proficiency: Strong understanding of CI/CD practices, with robust knowledge of Git, GitHub Actions, and GitHub Workflows. Familiarity with other tools such as Jenkins would be advantageous.
  • Engage in and improve the whole life cycle of application and cloud services, from inception and design through deployment, operation, and refinement.
  • Design, develop, ship, and motivate the creation of software and systems to increase product reliability and organizational efficiency.
  • Lead development and tracking of SRE Error Budgets.
  • Lead development of an SRE dashboard.
  • Lead root cause investigations.
  • Proactively identify system anomalies.
  • Recognize automation opportunities.
  • Plug into the software release cycle. Work closely with developers to ensure software releases are well designed, planned, implemented, released, and monitored.
  • Automate time-consuming and manual processes.
  • Assess the current SRE solution and define the SRE approach for products.
  • Work with applications development teams on designing, implementing, and improving SRE practices.
  • Cloud Platform Expertise: Experience with AWS, including hands-on experience with key cloud services such as logging and monitoring.
  • Strong Knowledge of IAC: Expertise in Infrastructure as Code (IAC) and a strong command of Terraform for provisioning and managing cloud infrastructure.
  • Proficiency in Container Orchestration: Hands-on experience creating and managing Docker images, ensuring optimal performance and security. Proficiency in the Kubernetes platform, including the ability to manage containerized applications, scale resources as needed, and troubleshoot issues in production environments.
  • Monitoring and Observability: Experience with monitoring tools such as Prometheus, Grafana, and the ELK Stack, including the ability to set up and configure monitoring solutions, use metrics for performance optimization, and troubleshoot issues effectively.
  • Strong understanding of cloud platforms like AWS and infrastructure automation tools.
  • Proven ability to design and implement monitoring solutions that ensure system uptime and performance.
  • Experience with AIOps principles and automation best practices.
  • Excellent communication, collaboration, and problem-solving skills.


Responsibilities:


  • Design and implement a comprehensive monitoring strategy for cloud infrastructure and applications.
  • Leverage industry-leading tools like Dynatrace, Splunk, and Elastic Stack for real-time monitoring and troubleshooting.
  • Develop and configure health probes and insightful alerts to proactively identify and address potential issues.
  • Champion the adoption and implementation of an AIOps platform for automated incident resolution and self-healing infrastructure.
  • Collaborate with development teams to translate operational insights into actionable requirements for high-quality software releases.
  • Design and execute reliability tests to ensure system stability and production readiness.
  • Maintain a deep understanding of cloud platforms like AWS and utilize infrastructure automation tools like Terraform and Ansible Tower.


Please send your updated resume to Venkata.Harsha@coforge.com