Senior Staff Site Reliability Engineer
1 month ago
NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Join the team and see how you can make a lasting impact on the world.
NVIDIA is looking to hire a deeply technical and creative Staff Site Reliability Engineer to build, support, and maintain the next generation of AI-powered enterprise products that improve engineering efficiency and data security and power our product development. This role offers an opportunity to collaborate with the Cloud and AI/ML workforce in a dynamic and agile working environment. It will be pivotal in shaping the technological future of our organization, ensuring our systems are scalable, reliable, and efficient.
What you will be doing:
• Develop and implement automation frameworks to enhance efficiency for existing and future applications.
• Collaborate with stakeholders, vendors, architects, and business teams to ensure optimal operation and reliability of applications.
• Define and lead the technical roadmap and cross-organizational projects to align with industry trends.
• Analyze and troubleshoot complex distributed systems, and promote best practices drawn from observed design patterns and incidents.
• Innovate to tackle operational challenges and lead AI technology implementation.
• Mentor junior engineers, promoting professional growth within the team.
• Evaluate and implement new technologies and tools to enhance the overall reliability of the infrastructure.
What we need to see:
• 8 years in cloud, platform, or SRE roles, with a relevant degree or equivalent experience.
• A solid background in and understanding of networking.
• Strong expertise in programming languages like Python, Go, Perl, or Ruby for creating automation or general-use software.
• Experience with version control and with CI/CD processes and tools.
• Expertise in architecting, designing, and developing distributed systems.
• Excellent problem-solving, negotiation, and interpersonal skills.
• Outstanding written and verbal communication skills that drive executional impact at scale.
Ways to stand out from the crowd:
• Proven expertise in leading cloud platforms such as AWS or Azure.
• Experience with the full software life cycle, from architecture to development to deployment and operations.
• Passion for and experience with AI methodologies.
• A systematic problem-solving approach with a proactive attitude and ownership drive.
• Experience writing infrastructure as code.
With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growth, our exclusive engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.
The base salary range is 164,000 USD - 310,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.
Summary
Location: US, CA, Santa Clara
Type: Full time
-
Site Reliability Engineer
3 weeks ago
Santa Clara, United States NVIDIA Full time
NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and outstanding people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers,...
-
Cloud Site Reliability Engineer
3 weeks ago
Santa Clara, United States Centrify Corporation Full time
Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. Our cloud operations team is where the rubber meets the road and needs innovative Site Reliability Engineers. Join a professional team of smart and hard-working professionals building enterprise-class cloud-based services in the rapidly growing market of...
-
Systems Administrator/Site Reliability Engineer
3 weeks ago
Santa Clara, United States TEKsystems Full time
Description: As a Senior Site Reliability Engineer, you will be responsible for provisioning and operating our high-availability systems that provide automated control, monitoring, and alerting at our production data centers. Your duties will include:
• Ensuring high levels of systems reliability and availability in a global enterprise data...
-
Senior Site Reliability Engineer, FlashArray
3 weeks ago
Santa Clara, United States Pure Storage Full time
Company Overview: BE PART OF BUILDING THE FUTURE. What do NASA and emerging space companies have in common with COVID vaccine R&D teams or with Roblox and the Metaverse? The answer is data: all fast-moving, fast-growing industries rely on data for a competitive edge in their industries. And the most advanced companies are realizing the full data advantage by...
-
Senior Site Reliability Engineer, FlashArray
2 weeks ago
Santa Clara, United States Pure Storage Full time
Company Overview: BE PART OF BUILDING THE FUTURE. What do NASA and emerging space companies have in common with COVID vaccine R&D teams or with Roblox and the Metaverse? The answer is data: all fast-moving, fast-growing industries rely on data for a competitive edge in their industries. And the most advanced companies are realizing the full data advantage by...
-
Senior Site Reliability Engineer, FlashArray
1 week ago
Santa Clara, United States Pure Storage Full time
Company Overview: BE PART OF BUILDING THE FUTURE. What do NASA and emerging space companies have in common with COVID vaccine R&D teams or with Roblox and the Metaverse? The answer is data: all fast-moving, fast-growing industries rely on data for a competitive edge in their industries. And the most advanced companies are realizing the full data advantage by...
-
Senior Silicon Reliability Engineer
3 weeks ago
Santa Clara, United States NVIDIA Full time
Job requisition ID: JR1981353
NVIDIA has continuously reinvented itself over three decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern...
-
Principal Site Reliability Engineer
3 weeks ago
Santa Clara, United States Kofi Group Full time
San Francisco Bay Area, CA
We are partnering with a late-stage Cloud Security company that is looking for a Principal Level SRE. The ideal candidate will have:
• Strong sense of architecture and design for fault tolerance, scale-out approaches, and stability
• Deep experience in building tools...
-
Senior Reliability Test Engineer
1 week ago
Santa Clara, United States Johnson & Johnson Full time
Johnson & Johnson's Robotic and Digital Solutions (RAD) group is recruiting for a Senior Reliability Test Engineer, located in Santa Clara, CA. Robotics & Digital Solutions is part of Ethicon, Inc., a global leader in surgery with products and solutions found in almost every operating room around the world. Ethicon has made significant contributions to...
-
Senior Reliability Test Engineer
1 month ago
Santa Clara, United States Johnson & Johnson Full time
Johnson & Johnson's Robotic and Digital Solutions (RAD) group is recruiting for a Senior Reliability Test Engineer, located in Santa Clara, CA. Robotics & Digital Solutions is part of Ethicon, Inc., a global leader in surgery with products and solutions found in almost every operating room around the world. Ethicon has made significant contributions to...
-
Senior Reliability Test Engineer
2 weeks ago
Santa Clara, United States Johnson & Johnson Full time
Job Description
Johnson & Johnson’s Robotic and Digital Solutions (RAD) group is recruiting for a Senior Reliability Test Engineer, located in Santa Clara, CA. Robotics & Digital Solutions is part of Ethicon, Inc., a global leader in surgery with products and solutions found in almost every operating room around the world. Ethicon has made...
-
Power Electronics, Staff Reliability Engineer
1 month ago
Santa Clara, United States QCells Full time
Hanwha Q CELLS Co., Ltd., is one of the world's largest and most recognized photovoltaic manufacturers for its high-performance, high-quality solar cells and modules. It is headquartered in Seoul, South Korea (Global Executive HQ) and Talheim, Germany (Technology & Innovation HQ). Through its growing global business network spanning Europe, North...
-
Santa Clara, United States QCells Full time
Hanwha Q CELLS Co., Ltd., is one of the world's largest and most recognized photovoltaic manufacturers for its high-performance, high-quality solar cells and modules. It is headquartered in Seoul, South Korea (Global Executive HQ) and Talheim, Germany (Technology & Innovation HQ). Through its growing global business network spanning Europe, North...
-
Santa Clara, United States Palo Alto Networks Full time
Our Mission
At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking for...
-
Santa Clara, United States Palo Alto Networks Inc. Full time
Company Description
Our Mission
At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done,...
-
Santa Clara, United States Palo Alto Networks Inc. Full time
Company Description
Our Mission
At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done,...
-
Santa Clara, United States Qcells Full time
Description
ABOUT US
Hanwha Q CELLS Technologies, Inc. is a subsidiary of Hanwha Q CELLS, one of the world's largest and most recognized photovoltaic manufacturers for its high-performance, high-quality solar cells and modules. It is headquartered in Seoul, South Korea (Global Executive HQ) and Talheim, Germany (Technology & Innovation HQ). Through its...
-
Site Reliability Engineer
3 weeks ago
Santa Clara, United States Cryptoware Technologies Inc Full time
Job Description
Responsibility
• Lead the global expansion of Huobi's globe-spanning infrastructure.
• Work with engineering teams to make sure new features and changes are deployed quickly and safely.
• Constantly improve our system performance and reliability through better tools, processes, and monitoring systems.
• Staffing an...
-
Site Reliability Engineer
1 week ago
Santa Clara, United States Cryptoware Technologies Inc Full time
Job Description
Responsibility
• Lead the global expansion of Huobi's globe-spanning infrastructure.
• Work with engineering teams to make sure new features and changes are deployed quickly and safely.
• Constantly improve our system performance and reliability through better tools, processes, and monitoring systems.
• Staffing an...
-
Santa Clara, United States Palo Alto Networks Full time
Job Description
Company Description
Our Mission
At Palo Alto Networks® everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting...