Kafka Site Reliability Engineer
2 months ago
We are revolutionizing the cybersecurity landscape with our cloud-delivered security services, and our cloud infrastructure is rapidly expanding with a global presence.
We're seeking exceptional SREs and software engineers interested in production engineering to help us scale the largest enterprise security cloud infrastructure worldwide.
About Us
Palo Alto Networks has disrupted the enterprise firewall market, growing from a start-up to a multi-billion-dollar company.
Our Application Framework, the latest offering in our cloud-delivered security services, ingests security events from hundreds of thousands of firewalls deployed globally to provide a massive data analytics platform for deep inspection, anomaly detection, and actionable security automation.
Our cloud infrastructure is home to complex distributed systems and virtualization software platforms that enable big data processing for security services, sandboxing, malware detection, URL categorization, malicious site/domain identification, and security research and response.
Responsibilities
- Maintain and scale production Kafka clusters with high ingestion rates, ZooKeeper clusters, and other big data pipeline systems such as HDFS.
- Improve scalability, service reliability, capacity, and performance.
- Write automation code for managing, monitoring, measuring, expanding, and healing clusters.
- Participate in the occasional on-call rotation supporting the infrastructure.
- Roll up your sleeves to troubleshoot incidents, formulate theories, and test hypotheses to find the root cause.
Requirements
- Hands-on experience managing production Kafka clusters.
- Strong development/automation skills, with a focus on Python.
- In-depth understanding of Kafka cluster management, ZooKeeper, partitioning, topic replication, and mirroring.
- Excellent grasp of monitoring and metrics collection, performance tuning, and troubleshooting complicated distributed systems.
- Tools-first mindset, with a focus on building tools for efficiency and ease of use.
We're looking for organized, focused individuals who can build, improve, resolve, and deliver. Good communication skills, teamwork, and a strong sense of ownership are essential.