Senior Site Reliability Engineer

1 week ago


Dallas, Texas, United States | Cognizant | Full time

Senior Site Reliability Engineer (Hybrid)

Cognizant is a leading global provider of IT solutions, including digital transformation, technology services, consulting, and operational support. At Cognizant, we embrace innovative thinking and explore new ideas every day. Our mission is to help leading enterprises reimagine their business models and build products that create new value by connecting people, insights, and experiences.

Our digital engineering division designs, develops, and delivers digital products and experiences that drive digital-first business strategies. We work with clients to improve their cloud-native applications, offering deep digital engineering expertise and a client-focused approach to sustainable innovation.

Job Overview

The Digital Engineering team at Cognizant is looking for a highly skilled Site Reliability Engineer with extensive experience designing and building robust, scalable enterprise applications. You will join a digital software team dedicated to high-demand applications.

Our engineers are driven by a commitment to producing high-quality, dependable, and maintainable code. You will collaborate closely with product managers, designers, and clients, making collective decisions to swiftly deliver valuable software solutions to our clients and their users. Our team operates with agility and a reflective mindset, unafraid to pinpoint areas for improvement and celebrate our successes. Ultimately, we measure our achievements by the success of our team and the satisfaction of our customers.

Key Responsibilities
  • Ensure monitoring and alerting systems notify on symptoms rather than outages.
  • Transform findings into repeatable actions and subsequently into automation.
  • Enhance deployment, change management, and release management processes for efficiency.
  • Troubleshoot production issues across various services and stack levels.
  • Propose innovative ideas and solutions within the product team to bolster resiliency, availability, and security.
  • Plan and execute configuration change operations at both application and infrastructure levels.
  • Continuously seek opportunities to enhance system availability and performance based on monitoring insights.
  • Conduct Root Cause Analysis (RCA) investigations.
  • Advance DevSecOps practices and expedite delivery while taking a lead role in resolving technical challenges.
  • Contribute to the development of strategic technology roadmaps.
  • Support customer incidents and respond to them effectively.
Professional Experience / Qualifications
  • 12 to 15 years of overall experience.
  • 7 to 8 years of substantial SRE experience.
  • Experience with GitHub, GitHub Actions CI/CD, and ADO cloud for automation.
  • Expertise in monitoring and observability within AKS and Azure cloud, as well as Kubernetes.
  • Proficient in monitoring and metrics using Dynatrace, Prometheus, Grafana, and integrations with Moogsoft/xMatters.
  • Familiarity with open-source logging infrastructure.
  • At least 2 years of experience with Node.js and GraphQL (GQL).
  • Hands-on experience with IaaS, PaaS tools, and container orchestration platforms (Docker & Kubernetes).
  • Expertise in cloud-native relational databases such as MySQL and PostgreSQL, and in NoSQL databases such as Cassandra and MongoDB, is highly desired.
  • Strong technical knowledge across various hardware, software, and technology platforms.
  • Ability to develop, implement, and maintain applications and systems integrating MongoDB.
  • Experience with Dynatrace and Mezmo.
  • Knowledge of security vulnerabilities and remediation/compliance.
Preferred Qualifications
  • Experience with Terraform in Azure and on-premises infrastructure resources.
  • Skills in load balancing applications, including proxies and CDN automation.
  • Ability to script automated performance testing scenarios for APIs and web front ends, integrating them into CI/CD pipelines.
  • Experience in the airline industry.
  • Proficiency in TypeScript and JavaScript.
  • Familiarity with database and persistence frameworks: Mongo, Oracle, Object/Relational Mapping, and query performance tuning.
  • Experience with Mongo schema design and Mongo aggregation framework.
  • Knowledge of web services: GraphQL, REST/SOAP (JSON/WSDL/XML).
  • DB Admin/SQL Server experience.
  • Familiarity with Terraform.
  • SysAdmin experience.
  • Ability to resolve network issues.
  • Experience in VM management.

Why Choose Cognizant?

Cognizant is recognized as a Military Friendly Employer and is a coalition member of the Veteran Jobs Mission. Our Cognizant Veterans Network supports Veterans in building and advancing their careers at Cognizant, allowing them to showcase the leadership, loyalty, integrity, and commitment to excellence instilled through military service.

Cognizant is an equal opportunity employer committed to fostering a diverse environment. We consider all applicants without regard to race, creed, color, national origin, ancestry, age, marital and family status, disabilities, sexual orientation or preference, veteran status, or any other classification protected by state, federal, or local law.

If you require a reasonable accommodation to search for a job opening or submit an application, please reach out with your request and contact information.

NASDAQ: CTSH

