Senior Network SRE
1 week ago
Job Title: Senior Network SRE
Work Location: Santa Clara, CA
Job Type: Contract
Work Type: Hybrid - 3 days in office
Duration: 12 Months
Job Description:
- The Network Support and SRE team is in search of a seasoned Network SRE technical lead to help actualize the SRE vision for our network infrastructure.
- This role demands a unique blend of hands-on expertise in network operations, engineering, and observability.
- A proficient Network SRE is dedicated to enhancing network operations, diligently working to minimize manual labor and achieve Service Level Objectives (SLOs).
- Our approach includes minimizing reactive operational tasks, conducting blameless postmortems, and proactively identifying and mitigating network risks to foster continuous improvement.
- This is crucial for ensuring client satisfaction and maintaining an engaging, varied work environment.
- We encourage autonomy in tackling significant projects and are committed to fostering a supportive atmosphere that offers the mentorship necessary for professional development and growth.
- Owning the operational aspect of the network infrastructure, ensuring its high availability and reliability.
- Partnering with architecture, tooling, and deployment teams to guarantee that new implementations are supportable and align with production standards.
- Advocating for and implementing automation to reduce toil and enhance operational efficiency.
- Monitoring network performance, identifying areas for improvement, and coordinating with relevant teams to execute enhancements.
- Leading the network support team towards a culture of reliability and operational excellence.
- Collaborating with SMEs to resolve production issues swiftly and effectively, maintaining customer satisfaction.
- Identifying opportunities for operational improvements and partnering with teams to develop solutions that drive excellence and sustainability in network operations.
- Educational Background: BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent practical experience.
- Professional Experience: Minimum of 8 years of industry experience in network site reliability engineering, network operations, or related areas. Experience on both campus and data center networks.
- Network Skills: In-depth expertise in network technologies such as TCP/UDP, IPv4/IPv6, Wireless, BGP, VPN, L2 switching, Firewalls, Load Balancers, EVPN, VXLAN, and Segment Routing. Proven track record in network operations.
- Programming and Scripting Skills: Experience in one or more of the following: Python, Perl, Bash.
- Monitoring Tools: Familiarity with network management tools such as Prometheus, Grafana, Alertmanager, Nautobot/NetBox, and Netdisco.
- System Administration: Knowledge of Linux system fundamentals is a plus.
- Problem-Solving and Communication: Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive.
- Support Automation: Track record of using operational signals from sources such as SNMP, Syslog, and Streaming Telemetry to solve operational challenges.
- Code Optimization: Ability to debug and optimize code; automate routine tasks.
- Platform Exposure: Experience with Mellanox/Cumulus Linux, Palo Alto firewalls, and NetScaler and F5 load balancers.
- Previous SRE experience
-
Senior SRE Engineer, NIM Factory
5 days ago
Santa Clara, California, United States Sage Lake Senior Living Full time
About the Role: We are seeking a seasoned Senior SRE Engineer to join our team at Sage Lake Senior Living, where you will play a critical role in ensuring the high availability and performance of our AI-powered applications.
Key Responsibilities: Operate and improve the observability and maintainability of our distributed microservice cloud applications and...
-
Senior SRE Engineer
16 hours ago
Santa Clara, United States Trillium Staffing Full time
Trillium Professional is now seeking Senior SRE Engineers in Santa Clara, CA! Pay rate is $75 - $90/hour, depending on experience. Our client is looking for a seasoned SRE to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced...
-
Senior SRE Engineer, NIM Factory
1 week ago
Santa Clara, United States Sage Lake Senior Living Full time
NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a senior SRE to monitor and operate both the factory automation for NVIDIA Inference Microservices (NIMs) and its deployed services. The right person for this role brings technical drive and creativity to change the way NVIDIA provides high-performance inferencing for...
-
Senior SRE Engineer
1 week ago
Santa Clara, United States NVIDIA Full time
NVIDIA is looking for a seasoned SRE to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and maintains NVIDIA’s internal cloud provisioning product for GPUs and Tegra systems. The team works with various...
-
Senior SRE Engineer
3 weeks ago
Santa Clara, United States NVIDIA Full time
NVIDIA is looking for a seasoned SRE to join its complex and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and maintains NVIDIA's sophisticated internal Jenkins-based CI/CD product for GPUs and Tegra systems. The team works with...
-
Senior SRE Engineer
2 months ago
Santa Clara, United States NVIDIA Full time
NVIDIA is looking for a seasoned SRE to join its multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and maintains NVIDIA's sophisticated internal cloud provisioning product for GPUs and Tegra systems. The team works...
-
Santa Clara, California, United States Sage Lake Senior Living Full time
About the Role: We are seeking a seasoned Senior SRE Engineer to join our team at Sage Lake Senior Living, where you will play a critical role in monitoring and operating our NVIDIA Inference Microservices (NIMs) factory automation and deployed services.
Key Responsibilities: Operate a software factory that takes an AI model as input and produces a deployable...
-
Senior Network SRE
1 week ago
Santa Clarita, United States Cloudious LLC Full time
Job Description
Educational Background: BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent practical experience.
Professional Experience: Minimum of 8 years of industry experience in network site reliability engineering, network operations, or related areas. Experience on both campus and data...
-
Sr. SRE Engineer
3 months ago
Santa Clara, United States TCWGlobal Full time
Sr. SRE Engineer
W2 Contract to Possible Hire
Hybrid, Santa Clara, CA
$75-90/hr + PTO, Paid Holidays, Benefits
We are looking for a seasoned SRE to join our multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and...
-
Senior Manager
4 weeks ago
Santa Clara, United States NVIDIA Full time
As a Sr Manager in Site Reliability Engineering (SRE), you will lead a team dedicated to the design, construction, and maintenance of expansive production systems, emphasizing high efficiency and availability. This role spans various domains, including software and systems engineering, cloud-scale storage, data management, and services. SRE Senior Managers...
-
Sr. SRE Engineer
2 months ago
Santa Clara, United States TCWGlobal Full time
Job Description
Sr. SRE Engineer
W2 Contract to Possible Hire
Hybrid, Santa Clara, CA
$75-90/hr + PTO, Paid Holidays, Benefits
We are looking for a seasoned SRE to join our multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced...
-
Sr. SRE Engineer
3 months ago
Santa Clara, United States TCWGlobal Full time
Job Description
Sr. SRE Engineer
W2 Contract to Possible Hire
Hybrid, Santa Clara, CA
$75-90/hr + PTO, Paid Holidays, Benefits
We are looking for a seasoned SRE to join our multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced...
-
Senior Manager
3 months ago
Santa Clara, United States NVIDIA Full time
As a Sr Manager in Site Reliability Engineering (SRE), you will lead a team dedicated to the design, construction, and maintenance of expansive production systems, emphasizing high efficiency and availability. This role spans various domains, including software and systems engineering, cloud-scale storage, data management, and services. SRE Senior Managers...
-
Santa Cruz, United States Geico - Government Employees Insurance Company Full time
- Develop and drive the overall reliability strategy for the Network and DC-Ops SRE organization, aligning it with the organization's business goals and objectives
- Provide thought leadership in datacenter reliability for networks and servers, staying...
Network, Hardware, Server, Staff, Engineer, Software Engineer, Technology, Insurance
-
Senior SRE Software Engineer
3 months ago
Santa Clara, United States NVIDIA Full time
Site Reliability Engineering (SRE) is an engineering discipline that involves designing, building, and maintaining large-scale production systems with high efficiency and availability. It encompasses various areas, including software and systems engineering practices, storage, data management, and services. SRE professionals are highly specialized and...
-
Site Reliability Engineering
5 days ago
Santa Clara, United States Diverse Lynx Full time
Skills: Site Reliability Engineering (SRE), Git (Bitbucket), Jenkins, AWS CodeBuild, AWS CodeDeploy
Job Description: AWS application and CI/CD pipelines, Microsoft Server admin and workload support (Data center and AWS)
- Initial responsibility is application platform promotion to controlled environments for test, staging, and production AWS accounts. ...
-
SRE Manager, NIM Factory
3 weeks ago
Santa Clara, United States NVIDIA Full time
NVIDIA is the platform upon which every new AI-powered application is built. We are seeking an SRE Manager to build and manage SREs who monitor and operate both the factory automation for NVIDIA Inference Microservices (NIMs) and its deployed services. The right person for this role brings leadership that encourages the team's technical drive and creativity...
-
ServiceNow SRE: Scalable Solutions Engineer
5 days ago
Santa Clara, California, United States ServiceNow Full time
Job Description
Overview: The ServiceNow SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the ServiceNow cloud infrastructure.
Key Responsibilities: Provide relief and sustainable resolution to issues within our infrastructure. Use expertise in software development,...
-
Senior Site Reliability Engineer
2 weeks ago
Santa Clara, California, United States ServiceNow Full time
Company Overview: At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities. Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity. We...
-
Senior Site Reliability Engineer
2 weeks ago
Santa Clara, California, United States ServiceNow Full time
Company Overview: At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities. By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...