Senior Cloud Reliability Engineer
3 days ago
We are seeking a highly skilled Senior Cloud Reliability Engineer to join our team at Geospatial And Cloud Analytics Inc. As a key member of our engineering team, you will be responsible for designing, implementing, and supporting operational and reliability aspects of large-scale cloud infrastructure.
Key Responsibilities
- Design and implement operational and reliability aspects of large-scale Kubernetes clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
- Engage in and improve the entire lifecycle of services, from inception and design through deployment, operation, and refinement.
- Support services before they go live through activities such as system design consulting, developing software tools, platforms, and frameworks, capacity management, and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless postmortems.
- Participate in an on-call rotation to support production systems.
Qualifications
- BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent experience.
- 5+ years of experience with infrastructure automation and distributed systems design, including designing and developing tools for running large-scale private or public cloud systems in production.
- Experience in one or more of the following: Python, Go, Perl, or Ruby.
- In-depth knowledge of Linux, Networking, and Containers.
Compensation and Benefits
- A competitive base salary range of $132,000 - $310,500 USD, determined by location, experience, and pay of employees in similar positions.
- Eligibility for equity and benefits.
- A diverse and inclusive work environment.
-
Site Reliability Engineer
4 days ago
Santa Clara, California, United States Veear Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Veear. As a key member of our infrastructure team, you will play a critical role in ensuring the reliability, scalability, and security of our cloud-based systems.
Key Responsibilities
Collaboration and Partnership
Partner with cross-functional teams to ensure security...
-
Senior Cloud Engineer
5 days ago
Santa Clara, California, United States NVIDIA Full time
About the Role
NVIDIA is seeking a seasoned Cloud Engineer to join its fast-paced Infrastructure, Planning and Processes organization. As a Senior Cloud Engineer, you will be part of a dynamic team that develops and maintains NVIDIA's internal cloud provisioning product for GPUs and Tegra systems.
Key Responsibilities
Design and implement scalable, resilient...
-
Senior Site Reliability Engineer
2 weeks ago
Santa Clara, California, United States ServiceNow Full time
Company Overview
At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities.
Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity.
We...
-
Cloud Site Reliability Engineer
4 days ago
Santa Clara, California, United States Centrify Corporation Full time
**About Centrify Corporation**
Centrify Corporation is a leading provider of cloud-based identity and access management solutions. Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers.
**Job Summary**
We are seeking a highly skilled Cloud Site Reliability Engineer to join our Cloud DevOps team. As a Cloud Site...
-
Reliable Cloud Infrastructure Specialist
7 days ago
Santa Clara, California, United States ServiceNow Full time
Job Description
Overview
The ServiceNow SRE team is a group of highly technical engineers who are tasked with maintaining and developing the reliability, scalability, and performance of the ServiceNow cloud infrastructure.
Our SREs are empowered to drive technical resolutions across the technology stack from hardware through to application and all stops in...
-
Senior Site Reliability Engineer
2 weeks ago
Santa Clara, California, United States ServiceNow Full time
Company Overview
At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities.
By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...
-
Senior Cloud Security Engineer
2 weeks ago
Santa Clara, California, United States Palo Alto Networks Full time
Company Overview
Our Vision
At Palo Alto Networks, our journey begins and ends with our core mission: To be the premier cybersecurity partner, safeguarding our digital existence.
We envision a future where each day is more secure than the last. Our foundation is built on challenging the status quo and redefining norms, and we seek innovators dedicated to...
-
Site Reliability Engineer
2 days ago
Santa Clara, California, United States Diverse Lynx Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and infrastructure.
Key Responsibilities
Design, implement, and maintain cloud infrastructure on...
-
Senior Site Reliability Engineer
2 months ago
Santa Clara, California, United States Nvidia Full time
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables unique creativity and discovery, and powers what were...
-
Senior Cloud Infrastructure Architect
2 weeks ago
Santa Clara, California, United States Astera Labs Full time
Astera Labs stands at the forefront of innovative connectivity solutions, enabling the full potential of AI and cloud infrastructure. Our Intelligent Connectivity Platform seamlessly integrates PCIe, CXL, and Ethernet semiconductor-based solutions alongside the COSMOS software suite, delivering a software-defined architecture that is both scalable and...
-
Senior Cloud Engineer
7 days ago
Santa Clara, California, United States NVIDIA Full time
Job Summary
NVIDIA is seeking a highly skilled Senior SRE Engineer to join its fast-paced Infrastructure, Planning and Processes organization. As a key member of the team, you will be responsible for designing and implementing scalable, resilient cloud infrastructure platforms for NVIDIA's internal cloud provisioning product.
Key Responsibilities
Design and...
-
Senior IT Engineer
2 weeks ago
Santa Clara, California, United States OmniVision Technologies Full time
About OmniVision Technologies
We are a leading developer of advanced digital imaging solutions, providing a diverse culture that works together on the development of cutting-edge imaging technology, products, and solutions.
Job Summary
We are seeking a highly skilled Senior IT Engineer to lead our cloud infrastructure team. The successful candidate will be...
-
Cloud Software Engineer, GenAI
2 weeks ago
Santa Clara, California, United States Amazon Full time
About the Role
We are seeking a Cloud Software Engineer to join our innovative team focused on enhancing the Developer Experience. Our mission is to leverage GenAI to empower developers in creating applications that are faster, more cost-effective, secure, and reliable.
GenAI will enable a diverse range of builders to harness the capabilities of AWS,...
-
Senior Cloud Operations Engineer
3 days ago
Santa Clara, California, United States Trillium Staffing Full time
Job Description
Trillium Staffing is seeking a seasoned Senior Cloud Operations Engineer to join its fast-paced Infrastructure, Planning and Processes organization. The ideal candidate will have a strong background in cloud infrastructure and highly available production environments.
Key Responsibilities
Design, implement, and maintain sophisticated cloud...
-
Santa Clara, California, United States Palo Alto Networks Full time
Company Overview
Palo Alto Networks is driven by a mission to be the cybersecurity partner of choice, safeguarding our digital lifestyle. Our vision encompasses a world where each day is more secure than the last.
We are built on the principle of challenging the status quo and are in search of innovators dedicated to shaping the future of cybersecurity.
Work...
-
Santa Clara, California, United States Palo Alto Networks Full time
Company Overview
Palo Alto Networks is driven by a singular mission: to be the cybersecurity partner of choice, safeguarding our digital existence.
Our vision encompasses a world where each day is more secure than the last. We are built on the principles of challenging norms and innovating in the cybersecurity landscape, seeking individuals who are equally...
-
Senior SRE Engineer, NIM Factory
7 days ago
Santa Clara, California, United States Sage Lake Senior Living Full time
About the Role
We are seeking a seasoned Senior SRE Engineer to join our team at Sage Lake Senior Living, where you will play a critical role in ensuring the high availability and performance of our AI-powered applications.
Key Responsibilities
Operate and improve the observability and maintainability of our distributed microservice cloud applications and...
-
Santa Clara, California, United States XPENG Motors Full time
About XPeng Motors
Xpeng Motors is a leading innovator in the electric vehicle industry, dedicated to designing, developing, and manufacturing cutting-edge smart electric vehicles that seamlessly integrate advanced Internet, AI, and autonomous driving technologies.
Job Summary
We are seeking a highly skilled Senior Staff AI Infrastructure Site Reliability...
-
Senior Reliability Engineer
2 weeks ago
Santa Clara, California, United States Omnivision Technologies Full time
Qualifications:
Bachelor's degree in Physics, Electrical Engineering, Materials Science, or a related engineering field, with coursework focused on semiconductor physics and electronics. Familiarity with electronic component reliability standards such as JEDEC and AEC-Q100 is advantageous. Experience in wafer-level reliability testing is also beneficial.
Key...
-
Senior Product Reliability Engineer
1 month ago
Santa Clara, California, United States Anello Full time
About Anello Photonics:
ANELLO Photonics is a leading-edge technology company based in Santa Clara, CA. The company has developed integrated photonic system-on-chip technology for next generation navigation. ANELLO's SIPHOG™ gyroscope is based on its patented photonic integrated circuit technology. The result is a product that is higher performance, much...