Current jobs related to Kafka Site Reliability Engineer Onsite - Austin, Texas - Cognizant North America


  • Austin, Texas, United States Diverse Lynx Full time

    Job Description for Kafka SRE: As a Site Reliability Engineer for the Kafka Platform, you will be responsible for carrying out SRE duties to ensure the smooth operation of the Kafka Streaming Platform. Your key responsibilities will include having a thorough understanding of the Kafka architecture, including producers, consumers, topics, and partitions. You will...


  • Austin, Texas, United States Diverse Lynx Full time

    Job Title: Kafka Admin. At Diverse Lynx LLC, we are seeking a highly skilled Kafka Admin to join our team. As a key member of our Site Reliability Engineering (SRE) team, you will be responsible for ensuring the smooth operation of our Kafka Streaming Platform. Key Responsibilities: Carry out SRE duties for the Kafka Streaming Platform, ensuring its reliability...


  • Austin, Texas, United States Futran Tech Solutions Pvt. Ltd. Full time

    Job Title: Site Reliability Engineer/Infrastructure Specialist. Location: Remote. Job Type: Full-time. About the Role: We are seeking a highly skilled Site Reliability Engineer/Infrastructure Specialist to join our team at Futran Tech Solutions Pvt. Ltd. The ideal candidate will have experience supporting internet-facing production services and distributed systems,...


  • Austin, Texas, United States Apple Full time

    Job Title: Site Reliability Engineer. Job Summary: At Apple, we are seeking a highly skilled Site Reliability Engineer to join our Ad Platforms team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our ad-tech systems. Key Responsibilities: Implement and improve our infrastructure and...


  • Austin, Texas, United States Unreal Gigs Full time

    Job Summary: At Unreal Gigs, we're seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you'll play a critical role in ensuring the high availability, scalability, and performance of our complex distributed systems. You'll be responsible for building and maintaining highly reliable systems, automating infrastructure...


  • Austin, Texas, United States ORACLE AMERICA Full time

    Job Summary: Oracle America is seeking a skilled Site Reliability Developer 3 to join our team in Austin, TX. As a Site Reliability Developer, you will be responsible for solving complex problems related to infrastructure and cloud services, and building automation to prevent problem recurrence. Key Responsibilities: Solve complex problems related to...


  • Austin, Texas, United States Apple Full time

    Job Title: Site Reliability Engineering Manager. About the Role: Apple is seeking a highly skilled Site Reliability Engineering Manager to lead our cloud services team. As a Site Reliability Engineering Manager, you will be responsible for establishing SRE practices for our private cloud service to accelerate our ability to reliably and consistently deliver...


  • Austin, Texas, United States Oxford Knight Full time

    Database Site Reliability Engineer. Oxford Knight is seeking an experienced Database Site Reliability Engineer to join our Trading Systems Infrastructure team. As a key member of our team, you will be responsible for designing, building, and maintaining our diverse production database infrastructure, focusing on bare metal performance, scalability, and...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Apple. As a Site Reliability Engineer, you will play a vital role in designing, building, and maintaining our core infrastructure. This infrastructure enables thousands of Apple Developers to submit their Apps to the App Store that delight millions of Apple...


  • Austin, Texas, United States Apple Full time

    Job Summary: Apple is seeking a Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, performance, and maintenance of high-volume, highly available, mission-critical enterprise platforms and applications related to Apple Manufacturing & Product lifecycle. Key Responsibilities: Develop...


  • Austin, Texas, United States Terminal Industries Full time

    About Us: Terminal Industries is a leading provider of software solutions for the logistics industry. Our platform digitizes, indexes, and automates the yard, leveraging best-in-class machine learning to optimize truck, trailer, chassis, container, and personnel usage. Our Platform: Our platform provides warehouse operators with the intelligence needed to...


  • Austin, Texas, United States Apple Full time

    Job Summary: At Apple, we are seeking a highly skilled Site Reliability Engineer to join our Ad Platforms team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our ad-tech systems. Key Responsibilities: Design and implement infrastructure and application monitoring and observability capabilities to improve...


  • Austin, Texas, United States AutoRABIT Holding Inc. Full time

    About the Role: AutoRABIT Holding Inc. is seeking a highly skilled Senior Site Reliability/DevOps Engineer to join our team. As a key member of our cloud services team, you will be responsible for developing, scaling, and operating our cloud infrastructure. Key Responsibilities: Design, implement, and maintain scalable, resilient, and secure infrastructure using...


  • Austin, Texas, United States Terminal Industries Full time

    About Us: Terminal Industries builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers, and personnel. These are the fundamental operating assets of commerce - and represent...


  • Austin, Texas, United States Tesla Full time

    This position can be based in a dynamic work environment. Tesla is seeking Engineers to build, improve, and scale the infrastructure that powers our Energy IoT applications. These applications provide real-time monitoring, optimization, and control for our flagship Tesla Energy products: Powerwall, Megapack, Solar Roof, Supercharger, Autobidder, and Virtual Power...


  • Austin, Texas, United States Apple Full time

    At Apple, we're looking for a talented Site Reliability Engineer to join our Apple Services Engineering team. As an SRE, you'll play a vital role in designing, building, and maintaining our core infrastructure, which enables thousands of Apple Developers to submit their Apps to the App Store that delight millions of Apple customers. We're seeking someone with...


  • Austin, Texas, United States Procore Technologies Full time

    Job Description: We're seeking a highly skilled Staff Site Reliability Engineer to join Procore's Project Execution Group. In this role, you'll lead, collaborate, and develop solutions to maintain the health of the core platform. The goal is to ensure the chosen design and architecture is highly available, performant, and reliable as this team is directly...


  • Austin, Texas, United States Electric Reliability Council of Texas Full time

    Job Description: At the Electric Reliability Council of Texas, we strive to create a dynamic work environment that fosters innovation and collaboration. Our team is dedicated to building a reliable and efficient power grid, and we're seeking a skilled Reliability and Compliance Engineer to join our efforts. As a key member of our team, you will work closely...


  • Austin, Texas, United States Teacher Retirement System of Texas Full time

    Job Title: Azure Cloud Engineer/Platform Reliability Engineer. About the Role: We are seeking a highly skilled Azure Cloud Engineer/Platform Reliability Engineer to join our team at the Teacher Retirement System of Texas. As a key member of our Core Platforms Department, you will be responsible for ensuring the reliability, scalability, and performance of our...

  • Planning Engineer

    1 month ago


    Austin, Texas, United States Electric Reliability Council of Texas Full time

    Job Summary: At the Electric Reliability Council of Texas (ERCOT), we are seeking a highly skilled Planning Engineer to join our Regional Planning team. As a key member of our team, you will be responsible for ensuring the reliable operation of the electric power grid in compliance with NERC Standards, ERCOT Protocols, and Market Guides. Key...

Kafka Site Reliability Engineer Onsite

4 weeks ago


Austin, Texas, United States Cognizant North America Full time
About the Role:

Cognizant's Cloud, Infrastructure, and Security Services Practice (CIS) is focused on driving digital transformation through holistic modernization across layers.

We help customers transform infrastructure and workplaces to meet the evolving needs of the digital era.

Our approach delivers key results for customers by achieving cloud-driven modernization and workplace and operational transformation in a secure environment.

Key Responsibilities:

  • Carry out SRE duties for the Kafka Streaming Platform.
  • Have a detailed understanding of Kafka architecture, including producers, consumers, topics, and partitions.
  • Monitor the platform and enforce runbooks/SOPs to resolve platform and application issues.
  • Familiarize yourself with cluster maintenance processes and implement changes according to detailed installation and validation plans.
  • Conduct detailed root cause analysis of major production incidents, document for future reference, and implement proactive measures to improve system reliability.
  • Automate routine tasks using scripts or automation tools to reduce manual work, lower the risk of human error, and improve system reliability.

Requirements:

  • At least 2-3 years of experience for a junior-level role, or 5+ years for a mid-level or senior role, working as a Site Reliability Engineer for the Kafka Platform.
  • Deep knowledge of core Kafka components, including producers, consumers, topics, and partitions.
  • Experience solving both Kafka platform service and application problems, and identifying the root cause.
  • Experience writing Ansible playbooks and automating manual tasks using Ansible, shell scripting, and Python.
  • Familiarity with Unix/Linux system internals, networking, and distributed systems.

Benefits:

Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:
  • Medical/Dental/Vision/Life Insurance.
  • Paid holidays plus Paid Time Off.
  • 401(k) plan and contributions.
  • Long-term/Short-term Disability.
  • Paid Parental Leave.
  • Employee Stock Purchase Plan.
  • Eligible for Cognizant's discretionary annual incentive program, based on performance and subject to the terms of Cognizant's applicable plans.