Site Reliability Engineer, VP

2 days ago


Dallas, Texas, United States The Goldman Sachs Group Full time
About the Role

We are seeking a highly skilled Site Reliability Engineer, VP to join our team at The Goldman Sachs Group. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining high-availability systems that support our business-critical applications.

Responsibilities
  • Own technical operations for systems that manage hundreds of thousands of compute cores
  • Build observability for new deployments to ensure robustness from day one, and for mature deployments to identify and implement improvements
  • Troubleshoot and resolve issues with block devices, file descriptors, and packet loss
  • Lead real-time outage investigations and present postmortems to senior management
  • Define SLIs and SLOs and partner with development teams to ensure systems are sufficiently well designed and instrumented
  • Partner with our development team throughout development and operations
  • Plan and manage deployments and migrations (including end-of-life programs)
  • Plan and implement robust business continuity and security programs
  • Provide regional coverage for the Procmon platform and participate in on-call support

Requirements
  • 5+ years of relevant professional experience
  • 3+ years of experience with Linux fundamentals and system administration
  • 3+ years of networking experience (familiarity with TCP/IP, IP routing, firewalls, secure tunneling protocols)
  • 3+ years of experience working with distributed computing systems and cloud computing environments
  • Excellent problem-solving and automation skills
  • Proficiency in at least one programming language; the team uses a mix of Go, Python and Erlang
  • Able to operate effectively in a mission-critical, highly regulated financial services environment

About Goldman Sachs

At Goldman Sachs, we commit our people, capital and ideas to help our clients, shareholders and the communities we serve to grow.

We believe who you are makes you better at what you do. We're committed to fostering and advancing diversity and inclusion in our own workplace and beyond by ensuring every individual within our firm has a number of opportunities to grow professionally and personally, from our training and development opportunities and firmwide networks to benefits, wellness and personal finance offerings and mindfulness programs.

Learn more about our culture, benefits, and careers.



  • Dallas, Texas, United States The Goldman Sachs Group Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer, VP to join our team at The Goldman Sachs Group. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our systems, as well as identifying and implementing improvements to our technical operations. Responsibilities: Own technical operations...


  • Dallas, Texas, United States The Goldman Sachs Group Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer, VP to join our team at The Goldman Sachs Group. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining scalable and reliable systems that support our business operations. Responsibilities: Own technical operations for systems that...


  • Dallas, Texas, United States The Goldman Sachs Group Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer, VP to join our team at The Goldman Sachs Group. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining large-scale distributed systems that support our business operations. Key Responsibilities: Own technical operations for systems...


  • Dallas, Texas, United States The Goldman Sachs Group Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Procmon Platform team at Goldman Sachs. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our systems, which manage hundreds of thousands of compute cores. Responsibilities: Own technical operations for systems that...


  • Dallas, Texas, United States CV Library Full time

    Job Title: Site Reliability Engineer. Job Summary: We are seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the availability, reliability, and performance of our applications and services. Key Responsibilities: Monitor and analyze system performance to identify areas...


  • Dallas, Texas, United States Glocomms Full time

    Job Title: Site Reliability Engineer. Glocomms is seeking a highly skilled Site Reliability Engineer to join their team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the company's cloud infrastructure. Responsibilities: Design and implement scalable and highly available cloud infrastructure. Develop and...


  • Dallas, Texas, United States Bayone Full time

    Job Title: Site Reliability Engineer. Bayone is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining highly available and scalable applications deployed in Azure. Key Responsibilities: Design and implement automation tools and scripts to streamline...


  • Dallas, Texas, United States STIAOS Technologies Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at STIAOS Technologies in Dallas, TX. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our ecommerce platform. Key Responsibilities: Collaborate with cross-functional teams to identify...


  • Dallas, Texas, United States Diverse Lynx Full time

    Job Title: Site Reliability Engineer. At Diverse Lynx LLC, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the availability, reliability, and performance of our applications and infrastructure. Key Responsibilities: Design, implement, and maintain scalable and...


  • Dallas, Texas, United States Motion Recruitment Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Motion Recruitment. As a Site Reliability Engineer, you will be responsible for ensuring the stability, scalability, and performance of our applications. About the Role: This is a direct hire, hybrid role (3-4 days onsite) in Dallas, Texas. The...


  • Dallas, Texas, United States Themesoft Inc. Full time

    Job Title: Site Reliability Engineer. At Themesoft Inc., we're seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Foster a culture of reliability and efficiency by sharing best...


  • Dallas, Texas, United States Tata Consultancy Services Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Tata Consultancy Services. As an SRE Support Analyst, you will play a critical role in ensuring the stability and sustainability of our software systems. Key Responsibilities: Drive the stability and sustainability of our next-generation systems and discover innovative...


  • Dallas, Texas, United States STIAOS Technologies Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at STIAOS Technologies in Dallas, TX. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our ecommerce systems. Key Responsibilities: Collaborate with cross-functional teams to identify and...


  • Dallas, Texas, United States STIAOS Technologies Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at STIAOS Technologies in Dallas, TX. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our software systems. Key Responsibilities: Collaborate with cross-functional teams to identify and...


  • Dallas, Texas, United States Diverse Lynx Full time

    Role: Site Reliability Engineer/DevOps Engineer. Location: Dallas, TX (Onsite). Duration: Full-time. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the availability, reliability, and performance of our applications...


  • Dallas, Texas, United States Avetta (formerly PICS) Full time

    Be Part of Avetta's Technical Excellence Team. As a Site Reliability Engineer at Avetta, you will play a crucial role in optimizing and scaling our global cloud-based SaaS platform. Our focus is on maintaining highly resilient and distributed systems, integrating uptime monitors, and developing scaling algorithms to enhance end-user experience. Key...


  • Dallas, Texas, United States Net2Source Inc. Full time

    Job Title: Site Reliability Engineering Manager. Net2Source Inc. is a leading provider of total workforce solutions, recognized for its accelerated growth and commitment to delivering high-quality staffing services. As a Site Reliability Engineering Manager, you will play a critical role in ensuring the reliability and scalability of our systems, collaborating...


  • Dallas, Texas, United States Net2source Full time

    Job Title: Site Reliability Engineering Manager. At Net2Source, we are seeking a highly skilled Site Reliability Engineering Manager to join our team. As a Site Reliability Engineering Manager, you will be responsible for leading a team of Site Reliability Engineers to ensure the reliability, scalability, and performance of our cloud-based infrastructure. Key...


  • Dallas, Texas, United States Net2Source Inc. Full time

    Job Title: Site Reliability Engineer Manager. Net2Source Inc. is a leading provider of total workforce solutions, recognized for our accelerated growth and global presence. We are seeking an experienced Site Reliability Engineer Manager to lead our SRE team and drive operational excellence. Key Responsibilities: Lead and mentor a team of Site Reliability...


  • Dallas, Texas, United States Forhyre Full time

    Job Opportunity: We are seeking a highly skilled Site Reliability Engineer to join our team at Forhyre. As a key member of our engineering team, you will play a critical role in ensuring the reliability, scalability, and efficiency of our cloud-based services. Key Responsibilities: Design and implement major infrastructure components, systems, and...