Staff Site Reliability Engineer

2 months ago


Austin, Texas, United States Visa Full time
Job Description

About the Role

We are seeking a highly skilled Hadoop System Engineer to join our team at Visa. As a key member of our reliability engineering team, you will be responsible for ensuring the stability and performance of our big data platforms.

Key Responsibilities

  • Single Window Support: Provide expert-level support for Hadoop-related issues, leveraging deep knowledge of Hadoop tools such as Hive, Spark, HDFS, and YARN.
  • System Configuration: Collaborate with platform engineering teams to recommend necessary changes to the system, ensuring optimal performance and reliability.
  • Performance Tuning: Guide team members in writing efficient queries, drawing on expertise in performance tuning and optimization strategies for big data technologies.
  • Issue Resolution: Troubleshoot complex technical issues, identify root causes, and coordinate with cross-functional teams to resolve them.
  • Reliability Engineering: Develop and maintain reports that define performance and resolution metrics, generate alerts, and implement automation and self-healing as needed.
  • Office Hours and Liaising: Collaborate with global teams to ensure timely client delivery, sharing knowledge and best practices through wikis and communications.
  • Knowledge Cataloging and Sharing: Develop and maintain a knowledge catalog, sharing expertise with peers across geographic regions.
  • Develop Standards: Establish and maintain standard configurations for various VCA workloads, ensuring optimal cluster health and efficient job execution.
  • Continuous Learning: Stay up to date with changing data science job requirements, collaborating with teams to improve cluster utilization and delivery.

Requirements

  • 5+ years of relevant work experience with a Bachelor's degree, or at least 2 years of work experience with an advanced degree (e.g., Master's, MBA, JD, MD), or 0 years of work experience with a PhD, or 8+ years of relevant work experience.
  • Strong development skills on data pipelines using PySpark, Hive, Airflow.
  • Strong troubleshooting and debugging skills.
  • Hands-on experience in managing Hadoop platforms, tuning application performance, and debugging Hadoop issues.
  • Experience working with scheduling tools (Airflow, Oozie) or building data processing orchestration workflows.
  • In-depth knowledge of the Hadoop ecosystem and architecture, including ZooKeeper, HDFS, YARN, Hive, and Spark.
  • Understanding of security tools like Kerberos and Ranger.
  • Excellent written and verbal communication skills.

Additional Information

  • Work Hours: Vary depending on the needs of the department.
  • Travel Requirements: This position requires travel 5-10% of the time.
  • Mental/Physical Requirements: This position will be performed in an office setting, requiring the incumbent to sit and stand at a desk, communicate in person and by telephone, and frequently operate standard office equipment.