SRE Lead

1 week ago


Dallas, United States TEKsystems Full time

Description: SRE Lead

• 3rd shift: Mon-Fri, 11pm-7:30am EST: REMOTE IN HUB
• Hubs: Pittsburgh, PA; Phoenix, AZ; Dallas, TX; Birmingham, AL

The client is looking to add an SRE Lead for 3rd shift sitting hybrid in one of the technology hubs to oversee a team of global contractors (L1.5 Engineers).

Role Overview: As an SRC Lead, you’ll be at the forefront of ensuring the reliability, availability, and performance of critical enterprise technology and security applications. Your leadership will drive operational excellence, foster collaboration, and elevate the overall reliability of our systems within the Site Reliability Center (SRC). You’ll work closely with cross-functional teams, mentor engineers, and contribute to the success of the organization.

Note for the skills/technologies: Be knowledgeable enough to jump in, drive the conversations to resolution, and escalate if needed to the client Application System Managers/SMEs (e.g., here is the problem, here is what we think it is, here are the solutions we think we should apply, what do you want to do?).

Top Technologies:
• Monitoring and Debugging Tools (LogScale, Splunk, Dynatrace)
• DevOps pipeline (Git, Jenkins, Artifactory)
• Infrastructure (Red Hat Linux, Openshift, Windows)
• Networking (DNS, Load-balancing, Network tracing, Firewall)
• Database (Oracle, SQL)
• API understanding & Web services technologies: (SOAP, JSON, REST)
• Directories (LDAP, Active Directory)
• Java

Secondary:
• Python/Java Scripting, Ansible, Powershell for Automation purposes
• Modern development technologies and tools: (Agile, CI/CD, Git, Jenkins)
• Kafka Event Streaming
• ETL/Informatica

Responsibilities Summary: Production Support. NOT new development. Troubleshoot highly technical problems, which may require assessing source code to analyze and resolve them. This requires advanced troubleshooting skills and the ability to adapt and create non-standard approaches to problem-solving.
*There are 185 applications and platforms combined in this space. Expertise is not expected in all of them, but emphasis is needed on developing SMEs for the Criticality Level 0/1 mnemonics, which are reflected in the top skills. We are looking for someone astute enough to see a problem and either fix it or escalate it to the client SME teams and learn from how they fix it. Runbooks should then be updated accordingly.

Key responsibilities:
• Create and maintain documentation to ensure knowledge accessibility.
• Liaise with other application support teams and internal/external business and technical partners.
• Provide ad hoc and on-demand reports.
• Perform timely escalation of critical issues and proactively identify patterns of recurring issues to improve production.
• Lead problem resolution, conduct root cause analysis, and establish processes that help prevent incidents.
• Participate in the Incident and Problem Management processes as a resolver accountable for root cause analysis, resolution, and reporting.
• Provide guidance to all staff and vendors involved to drive a coordinated approach for results.
• Reduce escalations to Level 3 based on incremental learning about applications.

Technical Acumen and System Familiarity: While the majority of the role involves management, the SRC Lead should possess a solid understanding of the systems and technical stacks they are supporting. They should be able to pull up dashboards, troubleshoot issues, and guide conversations related to system health. Additionally, they must effectively manage impact and risk.

System Monitoring and Health: Lead the production environment by monitoring availability and taking a holistic view of system health.

Quality and Time-to-Market: Drive improvements in reliability, quality, and time-to-market for software solutions.

Performance Optimization: Continuously optimize system performance, anticipating customer needs and innovating for excellence.
Operational Leadership: Provide primary operational support for large-scale distributed software applications.

Mentorship: Mentor and guide engineers within your shift team, fostering growth and technical expertise.

Stakeholder Communication: Manage team operations while effectively communicating with directors and other executives/CIOs who have a stake.

Qualifications:
• Proactive Approach: Take a proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
• Leadership Experience: Demonstrated leadership in technical roles, preferably within Site Reliability Engineering (SRE) or DevOps.
• Continuous Improvement: Foster a culture of continuous improvement and technical excellence, proactively identifying patterns of recurring issues to enhance stability and improve processes (automation opportunities, etc.).

Skills: monitoring tools, Splunk, Dynatrace, DevOps, Jenkins, Openshift, Redhat, Linux, Windows, networking, DNS, Firewall, SQL, Oracle, API, Java, Python, Kafka, ETL

Top Skills Details:
• Monitoring and Debugging Tools (LogScale, Splunk, Dynatrace)
• DevOps pipeline (Git, Jenkins, Artifactory)
• Infrastructure (Red Hat Linux, Openshift, Windows)
• Networking (DNS, Load-balancing, Network tracing, Firewall)
• Database (Oracle, SQL)
• API understanding & Web services technologies: (SOAP, JSON, REST)
• Directories (LDAP, Active Directory)
• Java

Additional Skills & Qualifications:
Nice to Have:
• Database (Mongo, Cassandra, other databases)
• Evolven

Experience Level: Expert Level

About TEKsystems: We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.


  • SRE Lead

    3 days ago


    Dallas, United States TEKsystems Full time

    The client is looking to add an SRE Lead for 3rd shift sitting hybrid in one of the technology hubs to oversee a team of global contractors (L1.5 Engineers). Role Overview: As an SRC Lead, you’ll be at the forefront of ensuring the reliability, availability, and performance of critical enterprise technology and security applications. Your leadership will...

  • SRE Lead

    6 days ago


    Dallas, United States VORTO Full time

    Description: SRE Lead - • 3rd: Mon-Fri, 11pm-7:30am EST: REMOTE IN HUB Hubs: Pittsburgh, PA, Phoenix, AZ, Dallas, TX, Birmingham, AL The client is looking to add an SRE Lead for 3rd shift sitting hybrid in one of the technology hubs to oversee a team of global contractors (L1.5 Engineers). Role Overview: As an SRC Lead, you’ll be at the forefront of ensuring...

  • SRE Lead

    2 days ago


    Dallas, United States TEKsystems Full time

    Description: SRE Lead - 3rd: Mon-Fri, 11pm-7:30am EST: REMOTE IN HUB Hubs: Pittsburgh, PA, Phoenix, AZ, Dallas, TX, Birmingham, AL The client is looking to add an SRE Lead for 3rd shift sitting hybrid in one of the technology hubs to oversee a team of global contractors (L1.5 Engineers). Role Overview: As an SRC Lead, you’ll be at the forefront of ensuring...

  • SRE Program Principal

    4 weeks ago


    Dallas, United States TEKsystems Full time

    Description: 100% Remote SRE Leadership Position We are seeking a dynamic technology leader to join our team as the Program Principal for Site Reliability Engineering. This strategic role will be pivotal in leading and executing large-scale Site reliability engineering customer engagements, ensuring the stability and performance of Modern cloud and legacy...

  • SRE Program Principal

    2 weeks ago


    Dallas, United States TEKsystems Full time

    Description: We are seeking a dynamic technology leader to join our team as the Program Principal for Site Reliability Engineering. This strategic role will be pivotal in leading and executing large-scale Site reliability engineering customer engagements, ensuring the stability and performance of Modern cloud and legacy systems and applications by...


  • Dallas, United States TekWissen LLC Full time

    Job Description Overview: TekWissen Group is a workforce management provider throughout the USA and many other countries in the world. Our client is an American multinational information technology services and consulting company and is a leading provider of information technology, consulting, and business process outsourcing services,...


  • Dallas, United States Cognizant North America Full time

    Sr. Site Reliability Engineer (SRE) Cognizant's Digital Engineering practice is seeking a highly qualified Sr. Site Reliability Engineer with 10+ years of experience developing and building high-performing, scalable, enterprise applications. You will be part of a digital software team that works on high-demand applications. Our engineers have a passion for...

  • Site Reliability Engineer

    36 minutes ago


    Dallas, United States VIZIO Full time

    About the Team: VIZIO releases firmware & software for millions of customers in a time efficient manner. Our goal is to maintain 99.9% uptime for our customers. We are seeking a Site Reliability Engineer to join our expanding organization. The Site Reliability Engineer will report to the Manager, DevOps Security and will play a crucial role in enhancing the...

  • Senior SRE Engineer

    1 month ago


    Dallas, United States RingCentral Full time

    Say hello to possibilities. It's not every day that you consider starting a new career. We're RingCentral, and we're happy that someone as talented as you is considering this role. First, a little about us: we're the $2 billion global leader in cloud-based communications and collaboration software. We are fundamentally changing the nature of human...

  • Senior SRE Engineer

    2 weeks ago


    Dallas, Texas, United States RingCentral Full time

    Say hello to opportunities. It's not every day that you consider starting a new career. We're RingCentral, and we're happy that someone as talented as you is considering this role. First, a little about us: we're a $2 Billion annual revenue company with double digit Annual Recurring Revenue (ARR) and a $93 Billion market opportunity in UCaaS, Contact Center...

  • Senior SRE Engineer

    2 weeks ago


    Dallas, United States RingCentral Full time

    Say hello to opportunities. It's not every day that you consider starting a new career. We're RingCentral, and we're happy that someone as talented as you is considering this role. First, a little about us: we're a $2 Billion annual revenue company with double digit Annual Recurring Revenue (ARR) and a $93 Billion market opportunity in...


  • Dallas, United States Diverse Lynx Full time

    Description: Role: Director - Site Reliability Engineering Location: Dallas, TX (Day 1 Onsite) Long term role Job Description: Domain : Telecommunications Lead strategy and development of SRE functions for consumer applications including Digital, Assisted and Marketing Technology systems to implement best in class SRE practices to improve the reliability of...


  • Dallas, United States Tekwissen Full time

    Overview: TekWissen Group is a workforce management provider throughout the USA and many other countries in the world. Our client is an American multinational information technology services and consulting company and is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world's leading...


  • Dallas, United States Signify Health Full time

    How will this role have an impact? Signify Health is looking for a passionate Site Reliability Engineer (SRE) to enhance our dynamic SRE team. Reporting to the Sr Director of Cloud Operations and SRE, we welcome individuals from different technical backgrounds, especially software engineers aspiring to transition into SRE/DevOps roles. At Signify Health, we...

  • Director of DevOps

    3 weeks ago


    Dallas, Texas, United States ThisWay Global Full time

    Our partner seeks a Director of DevOps & Site Reliability in Dallas, TX, offering a hybrid work model. The role involves managing the infrastructure for travel supply distribution, leading initiatives to enhance reliability, scaling, management, and developer empowerment. Responsibilities: Provide strategic leadership for DevOps & Site Reliability teams for...


  • Dallas, Texas, United States Cognizant Full time

    Cognizant is seeking an Infrastructure Architect (Azure/ SRE/ Dynatrace) (Hybrid). Application Deadline – 28th May 10, 2024. Skill: Azure, SRE, Dynatrace. Role: Work on tasks such as preventing incidents by setting up alerts for symptoms; coordinating with multiple teams such as azure cloud platform, enterprise monitoring tools, enterprise DevOps and IT...


  • Dallas, United States Saxon Global Full time

    Job Summary: We are looking for a Site Reliability Engineer (SRE) who will be responsible for ensuring the reliability, availability, and performance of our production systems. As an SRE, you will work closely with cross development and engineering teams to design and implement tools and processes to automate deployment, observability, and troubleshooting...


  • Dallas, Texas, United States JPMorganChase Full time

    Job Description: We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible. As a Lead Software Engineer at JPMorgan Chase within the Consumer & Community Banking - Technology team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology...


  • Dallas, United States Sharp Decisions Full time

    NO 3RD PARTIES, NO C2C, NO H1B, NO RELOCATION. Job Title: Manager, Site Reliability. Job Summary: As the Manager, Site Reliability Engineer (SRE), you will lead a team of SREs responsible for the availability, performance, and scalability of our services. You will work closely with development, operations, and product teams to build and maintain reliable...


  • Dallas, United States Sharp Decisions Full time

    NO 3RD PARTIES, NO C2C, NO H1B, NO RELOCATION. CONTRACT TO HIRE. Job Title: Manager, Site Reliability. Job Summary: As the Manager, Site Reliability Engineer (SRE), you will lead a team of SREs responsible for the availability, performance, and scalability of our services. You will work closely with development, operations, and product teams to build and...