Site Reliability Engineer

2 weeks ago


Dallas, United States Tecktiva Full time

Job Title: Site Reliability Engineer

Location: Phoenix, AZ / Dallas, TX

Duration: 6 Months+

Responsibilities:

  • Expert in Observability & SRE principles, SLI, SLO and SLA definition and management
  • Experienced in the Grafana stack and other application performance management tools and frameworks like the Elastic Stack, AppDynamics (AppD), etc.
  • Expert in implementing SRE observability through instrumentation of metrics, logs, and traces.
  • Expertise in Docker & Kubernetes is required.
  • Excellent understanding of microservices architecture, design patterns, and standard methodologies with an eye towards scale, automation, resiliency, and high availability
  • Prior experience with high-volume distributed technical architectures that have a high cost of failure, i.e., a focus on reliability and availability
  • Experienced with telemetry tooling and observability systems such as Jaeger, Prometheus, OpenTracing, OpenTelemetry, AppDynamics, Splunk, Datadog, New Relic, Lightstep, and Grafana.
  • Some experience with Big Data technologies such as Elasticsearch, NoSQL stores, Kafka, columnar databases, dataflow or pipeline systems, and graph data stores.
  • Expertise in performance analysis, resilience, chaos engineering, FMEA, scalability, and high availability, using tools such as JProfiler, thread dump analyzers, etc.
  • Experience with leveraging common infrastructure services like Enterprise Message Bus, Configuration Services, Toggles, Logging Systems, Telemetry for Observability (e.g. OpenTelemetry).
  • Strong experience with ServiceNow ITOM and ITSM modules that focus on Discovery, Event Management, and Incident, Problem, and Change Management
  • Strong sense of architecture and design for fault tolerance, scale-out approaches, and stability, having practiced the Azure Well-Architected Framework or the equivalent on another cloud such as Google Cloud Platform
  • Experience in emerging technologies like Machine Learning and AIOps is a plus.

Warm Regards,

Dev Raj | Delivery Head - Talent Acquisition

Tecktiva, LLC



  • Dallas, United States Saicon Consultants Full time

    Site Reliability Engineer (Buffer) Location: Dallas, TX Posted On: 11/08/2023 Requirement Code: 66074 Requirement Detail Job Description: Site Reliability Engineer (Buffer) • Bachelor's Degree in Computer Science or related; or equivalent combination of education and experience • 5+ yrs overall experience in Software Application Development &...


  • Dallas, United States Saxon Global Full time

    As a member of the Production Support/SRE team you will work cross-functionally with a variety of teams and be a core contributor to every significant engineering service or solution that we deliver to our stakeholders. You'll excel if you have enthusiasm for digging deep and a flair for technical communication and prioritization. You will work directly...


  • Dallas, Texas, United States Cognizant Technology Solutions Full time

    Sr. Site Reliability Engineer (SRE) Cognizant's Digital Engineering practice is seeking a highly qualified Sr. Site Reliability Engineer with 10+ years of experience developing and building high-performing, scalable, enterprise applications. You will be part of a digital software team that works on high-demand applications. Our engineers have a passion for...


  • Dallas, United States Saxon Global Full time

    Job Summary: We are looking for a Site Reliability Engineer (SRE) who will be responsible for ensuring the reliability, availability, and performance of our production systems. As an SRE, you will work closely with cross-functional development and engineering teams to design and implement tools and processes to automate deployment, observability, and troubleshooting...


  • Dallas, United States Diverse Lynx Full time

    Job Title: Site Reliability Engineer Location: Dallas, TX (Onsite) Duration: Full Time Only Job Description: Responsible for ensuring the reliability of systems, minimizing downtime, and maintaining service-level objectives (SLOs). Developing and implementing automation tools to streamline processes, deploy applications, and manage...


  • Dallas, United States PMG, Inc. Full time

    PMG is a digital company that helps marketers connect people with their brand. Focused on people and grounded in data, our award-winning culture fosters meaningful careers. Partnering with the most iconic brands in the world, we put people at the center of everything we do to deliver value, innovation, and business transformation. WHO WE ARE Agile....


  • Dallas, United States ConsultUSA Full time

    Description: Our client has an immediate need for a Site Reliability Engineer, who will be responsible for enabling engineering teams with guidance and tools to deliver frequent, high quality and reliable components as part of our digital platform. Requirements: Bachelor’s degree in Engineering, Computer Science, or a related field; 5-7 years of experience with...


  • Dallas, United States Veradigm (formerly Allscripts) Full time

    Welcome to Veradigm! Our Mission is to be the most trusted provider of innovative solutions that empower all stakeholders across the healthcare continuum to deliver world-class outcomes. Our Vision is a Connected Community of Health that spans continents and borders. With the largest community of clients in healthcare, Veradigm is able to deliver an...


  • Dallas, United States Cognizant Full time

    About Us: Cognizant (Nasdaq: CTSH) engineers modern businesses. We help our clients modernize technology, reimagine processes and transform experiences so they can stay ahead in our fast-changing world. Together, we're improving everyday life. See how at www.cognizant.com. The role is encouraged to drive technology discussions and analyze the current...


  • Dallas, Texas, United States Cognizant Technology Solutions Full time

    About Us: Cognizant (Nasdaq: CTSH) engineers modern businesses. We help our clients modernize technology, reimagine processes and transform experiences so they can stay ahead in our fast-changing world. Together, we're improving everyday life. See how at www.cognizant.com. The role is encouraged to drive technology discussions and analyze the current landscape for gaps in...


  • Dallas, United States JPMorgan Chase Full time

    Job Description: Duties: Develop creative engineering solutions to operations problems by combining software and systems engineering approaches. Lead problem resolutions for aligned lines of business (LOB), providing end-to-end resolution to ensure completion of all jobs within defined objectives. Troubleshoot priority incidents, identify Root...


  • Dallas, United States PMG Full time

    Job Description: PMG is a digital company that helps marketers connect people with their brand. Focused on people and grounded in data, our award-winning culture fosters meaningful careers. Partnering with the most iconic brands in the world, we put people at the center of everything we do to deliver value, innovation, and business...


  • Dallas, United States Cognizant North America Full time

    Sr. Site Reliability Engineer (SRE) Cognizant's Digital Engineering practice is seeking a highly qualified Sr. Site Reliability Engineer with 10+ years of experience developing and building high-performing, scalable, enterprise applications. You will be part of a digital software team that works on high-demand applications. Our engineers have a passion for...


  • Dallas, United States Priceline Long Distance LLC Full time

    This role is eligible for our hybrid work model: Two days in-office. Site Reliability Engineer (SRE) Our Technology team is the backbone of our company: constantly creating, testing, learning and iterating to better meet the needs of our customers. If you thrive in a fast-paced, ideas-led environment, you’re in the right place. Why This Job’s a Big Deal...


  • Dallas, United States Diverse Lynx Full time

    Description: Role: Director - Site Reliability Engineering Location: Dallas, TX (Day 1 Onsite) Long term role Job Description: Domain: Telecommunications. Lead strategy and development of SRE functions for consumer applications including Digital, Assisted and Marketing Technology systems to implement best-in-class SRE practices to improve the reliability of...


  • Dallas, United States Veradigm Full time

    Welcome to Veradigm, where our Mission is transforming health, insightfully. Join the Veradigm team and help solve many of today's healthcare challenges being addressed by biopharma, health plans, healthcare providers, health technology partners, and the patients they serve. At Veradigm, our primary focus is on harnessing the power of research, analytics,...


  • Dallas, United States K-Tek Resourcing LLC Full time

    Job Overview: We are looking for a motivated Junior Operations Engineer to ensure the smooth operation of our software and systems. This role combines technical expertise with problem-solving skills to automate operational processes, enhance system functionality, and maintain the reliability and efficiency of software applications. Key...