Site Reliability Engineer
3 days ago
NVIDIA is seeking a seasoned Site Reliability Engineer to join its Infrastructure, Planning and Processes organization. This is an on-premises data center role that requires experience with Linux operating systems and an understanding of Kubernetes.
Key Responsibilities:
- Design and implement reliable systems and processes to ensure high availability and performance of NVIDIA's internal cloud infrastructure.
- Monitor system performance and troubleshoot issues related to CPU, memory, disk, and network utilization.
- Collaborate with cross-functional teams to resolve complex technical issues and implement automation solutions.
- Develop and maintain scripts and tools to automate tasks and improve efficiency.
- Provide expert-level technical guidance and support to internal stakeholders.
Requirements:
- 4+ years of experience in SRE or systems administration.
- Familiarity with Kubernetes.
- Proficiency in Linux systems (and preferably Windows as well).
- Strong understanding of Ansible for configuration management and running playbooks.
- On-premises data center experience; this is not a cloud environment.
- Experience with visualization and monitoring tools (Kibana, Grafana, Splunk, etc.).
- Experience with BMC (Redfish), KVM, and IPMI tools.
- Ability to write and run automated tests in code (Python, Bash, etc.).
Preferred Skills:
- Windows server infrastructure.
- OpenStack experience (MySQL, Prometheus, Jenkins, etc.).
Benefits:
- Medical, dental, and vision insurance.
- HSA, FSA, and DCFSA account options.
- 401k retirement account access with employer matching.
- Paid sick leave and/or other paid time off as provided by applicable law.
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Capgemini
Full time
Job Opportunity
Capgemini is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and availability of our services.
Key Responsibilities:
- Design and implement strategies to improve system reliability and availability.
- Develop and maintain service level objectives...
-
Site Reliability Engineer
2 weeks ago
Reston, Virginia, United States
Diverse Lynx
Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the availability, performance, and security of our systems and applications.
Key Responsibilities:
- Expertise in TCP/IP related technologies, including networking...
-
Site Reliability Engineer
3 days ago
Reston, Virginia, United States
WideNet Consulting Group
Full time
Job Title: Site Reliability Engineer
About the Role
We are seeking an experienced Site Reliability Engineer to join our team at WideNet Consulting Group. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based systems.
Key Responsibilities:
- Monitor and analyze system performance...
-
Site Reliability Engineer
3 days ago
Reston, Virginia, United States
Microsoft Corporation
Full time
Job Title: Site Reliability Engineer
At Microsoft, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud services.
Key Responsibilities:
- Design, develop, and deliver software engineering solutions to serve and...
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Microsoft
Full time
Transforming the Future of Cloud Services
At Microsoft, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help us shape the future of cloud services. As a Site Reliability Engineer, you'll play a critical role in designing and implementing scenarios for our customers, working closely with our team to prioritize...
-
Reston, Virginia, United States
Microsoft
Full time
Job Summary
Microsoft is seeking a highly skilled Site Reliability Engineer II/Senior Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in designing and implementing scenarios for our customers, ensuring the reliability and scalability of our cloud services.
Key Responsibilities:
- Design and implement...
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Microsoft
Full time
Transforming the Future of Cloud Services
At Microsoft, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help us shape the future of cloud services. As a Site Reliability Engineer, you'll play a critical role in designing and implementing scenarios for our customers, working closely with our big government...
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Microsoft
Full time
Transforming the Future of Cloud Services
At Microsoft, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help us shape the future of cloud services. As a Site Reliability Engineer, you'll play a critical role in designing and implementing scenarios for our customers, working closely with our team to prioritize...
-
Site Reliability Engineer
2 days ago
Reston, Virginia, United States
Microsoft Corporation
Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Microsoft Corporation. As a key member of our Office 365 government cloud team, you will be responsible for designing, developing, and delivering software engineering solutions to serve and protect our O365 government clouds.
Key Responsibilities:
- Own deployment,...
-
Reston, Virginia, United States
Microsoft
Full time
Transforming the Future of Cloud Services
At Microsoft, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help us shape the future of cloud services. As a Site Reliability Engineer, you'll play a critical role in designing and implementing scenarios for our customers, ensuring the reliability and scalability of...
-
Site Reliability Engineer
2 weeks ago
Reston, Virginia, United States
ITPeopleNetwork
Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at ITPeopleNetwork. As a key member of our engineering team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
- Design and develop APIs using the Java/Spring Boot...
-
Reston, Virginia, United States
Microsoft Corporation
Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Microsoft Corporation. As a Site Reliability Engineer, you will play a critical role in designing and implementing scenarios for our customers, ensuring the reliability and scalability of our cloud services.
Responsibilities:
- Collaborate with cross-functional teams to...
-
Site Reliability Engineer
2 days ago
Reston, Virginia, United States
Blue Sky Innovative Solutions
Full time
Job Title: Site Reliability Engineer
Blue Sky Innovative Solutions is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud infrastructure. You will be responsible for designing, implementing, and maintaining our cloud...
-
Reston, Virginia, United States
Microsoft Corporation
Full time
Transforming the Future of Cloud Services
At Microsoft, we're committed to being cloud-first, and we're looking for talented Site Reliability Engineers to help design and implement scenarios for our customers. As a Site Reliability Engineer, you'll play a critical role in shaping the future of cloud services and ensuring the reliability and scalability of our...
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Intelligent Waves
Full time
Job Title: Site Reliability Engineer
Overview:
We are seeking a highly skilled Site Reliability Engineer to join our team at Intelligent Waves. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based systems.
Responsibilities:
- Design and implement resilient infrastructure...
-
Site Reliability Engineer
4 weeks ago
Reston, Virginia, United States
Streaming Talent
Full time
Streaming Talent is seeking a highly skilled Site Reliability Engineer to join their US team at a leading Content Delivery Network.
This role reports directly to the Site Reliability Team Lead and involves:
- Identifying and resolving deployment issues with the platform
- Overseeing the production environment to ensure optimal performance
- Implementing improvements...
-
Site Reliability Engineer
18 hours ago
Reston, Virginia, United States
Infosys
Full time
About the Role
Infosys is seeking a highly skilled Site Reliability Engineer to join our team. As an SRE Consultant, you will play a critical role in defining business process consulting solutions that enable our clients to meet the changing needs of the global landscape.
Key Responsibilities:
- Define problems, propose, and create solutions to drive business...
-
Site Reliability Engineer II
2 weeks ago
Reston, Virginia, United States
Microsoft Corporation
Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer II to join our team at Microsoft Corporation. As a Site Reliability Engineer II, you will be responsible for designing, developing, and delivering software engineering solutions to serve and protect O365 government clouds.
Key Responsibilities:
- Own deployment, availability, reliability,...
-
Site Reliability Engineer Manager
4 weeks ago
Reston, Virginia, United States
Microsoft Corporation
Full time
Job Description
We are seeking a highly skilled Site Reliability Engineer Manager to join our team at Microsoft Corporation. As a key member of our organization, you will be responsible for providing deep technical leadership to a team of highly passionate and skilled engineers.
Your primary focus will be on recruiting, on-boarding, and growing a team of...
-
Site Reliability Engineer
3 weeks ago
Reston, Virginia, United States
Hexaware Technologies
Full time
About Hexaware Technologies
We are a leading global IT Services company, dedicated to driving digital transformation and innovation for businesses around the world. Our team of over 30,000 professionals across 30 countries is driven by a shared passion for innovation and excellence.
Job Description
We are seeking a skilled Site Reliability Engineer with a...