Senior Network SRE Lead

6 days ago


Santa Clara, California, United States | Diverse Lynx | Full time
Job Summary

We are seeking a seasoned Network SRE technical lead to help drive the SRE vision for our network infrastructure. This role demands a unique blend of hands-on expertise in network operations, engineering, and observability.

Key Responsibilities
  • Owning the operational aspect of the network infrastructure, ensuring high availability and reliability.
  • Partnering with architecture, tooling, and deployment teams to guarantee that new implementations are supportable and align with production standards.
  • Advocating for and implementing automation to reduce toil and enhance operational efficiency.
  • Monitoring network performance, identifying areas for improvement, and coordinating with relevant teams to execute enhancements.
  • Leading the network support team towards a culture of reliability and operational excellence.
  • Collaborating with SMEs to resolve production issues swiftly and effectively, maintaining customer satisfaction.
  • Identifying opportunities for operational improvements and partnering with teams to develop solutions that drive excellence and sustainability in network operations.
Requirements
  • BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent practical experience.
  • Minimum of 8 years of industry experience in network site reliability engineering, network operations, or related areas, including experience with both campus and data center networks.
  • In-depth expertise in network technologies such as TCP/UDP, IPv4/IPv6, wireless, BGP, VPN, L2 switching, firewalls, load balancers, EVPN, VXLAN, and Segment Routing.
  • Proven track record in network operations.
  • Experience in one or more of the following: Python, Perl, Bash.
  • Familiarity with network management and observability tools such as Prometheus, Grafana, Alertmanager, Nautobot/NetBox, and Netdisco.
  • Knowledge of Linux system fundamentals is a plus.
  • Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive.
What We Offer

We promote and support a diverse workforce across all levels in the company. All applicants will be evaluated solely on the basis of their ability, competence, and their proven capability to perform the functions outlined in the corresponding role.

Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination.


