SRE Lead

1 month ago


Hartford, Connecticut, United States | Tata Consultancy Services | Full time

Job Title

SRE Lead - Site Reliability Engineering Expert

Relevant Experience (in yrs)

5+

Work Location (State, City and Zip)

Hartford, Connecticut

Technical/Functional Skills

Hands-on experience in application production support, including debugging and triaging issues through to resolution.

Extensive automation experience in Ansible.

Experience in incident, change, and AWS infrastructure management.

Experience in using key metrics such as SLOs, SLIs, and error budgets to measure service delivery quality and reliability.

Experience in using system reliability metrics like latency, traffic, errors, and saturation.

Proactive monitoring of system and application performance, identifying bottlenecks and areas for improvement.

Experience in self-healing automation using tools like Ansible, Splunk, and Dynatrace.

Automate operations tasks to increase team efficiency.

Experience in using Dynatrace, Splunk, and AWS CloudWatch to monitor the health and performance of migrated services.

Set up logging and tracing to capture and analyze system behavior and application performance.

Configure alerting thresholds and notifications to quickly identify and respond to issues.

Experience in building IaC scripts in AWS CloudFormation or Terraform to automate the provisioning and management of AWS resources.

Experience in optimizing AWS resource usage (EC2 instances, ECS, RDS databases) for cost-effectiveness and performance.

Experience in using Cost Explorer and Cloudability for cost monitoring, and in implementing AWS security best practices, including network configurations, IAM policies, and encryption.

Experience in using CI/CD pipelines and deployment strategies such as blue-green and canary.

Roles & Responsibilities

Build the SRE team and its site reliability engineering capability.

Resolve complex production issues to improve reliability.

Identify opportunities for fully automating operations with various tools and skillsets.

Application migration and release support.

Application support in production environment.


