SRE Product Reliability Engineer

2 months ago


San Jose, United States Visionary Innovative Technology Solutions LLC Full time

Position: SRE Product Reliability Engineer

Location: San Jose, CA – Hybrid (3 days in office, 2 days remote)

Duration: 6+ Months


Job Description:


Must have experience with AWS and Amazon EKS (Elastic Kubernetes Service).



Key Responsibilities:

Design and Implementation: Develop and implement observability solutions for Kubernetes-based applications using Fluent Bit, CloudWatch, Stackdriver, Grafana Loki, Grafana Tempo, Prometheus, Envoy health probes, OpenTelemetry, and Argo CD.

Monitoring and Logging: Configure and maintain logging pipelines using Fluent Bit to collect, process, and route logs for storage and analysis.

Metrics and Tracing: Set up Prometheus for metrics collection and Grafana Tempo for distributed tracing. Integrate these with Grafana for real-time monitoring and alerting via OpenTelemetry.

Telemetry: Use OpenTelemetry to instrument applications for better traceability and observability.

CI/CD: Use Argo CD for continuous deployment, and integrate the observability tools into the CI/CD pipeline so that the observability suite is deployed through it.

Observability Optimization: Analyze and optimize the performance of the observability stack to ensure minimal overhead and maximum efficiency.

Troubleshooting: Proactively identify and resolve issues related to the observability infrastructure. Collaborate with development and operations teams to troubleshoot and resolve incidents.

Documentation and Training: Document observability processes and best practices. Provide training and support to other team members on observability tools and techniques.


Required Skills and Qualifications:


Experience: Proven experience as an SRE or in a similar role, with a strong focus on observability in Kubernetes environments, supporting applications running on Amazon EKS in AWS.

Technologies: Hands-on experience with Fluent Bit, CloudWatch, Stackdriver, Grafana Loki, Grafana Tempo, Prometheus, Envoy health probes, OpenTelemetry, and Argo CD.

Kubernetes: In-depth knowledge of Kubernetes and container orchestration.

Scripting and Automation: Proficiency in scripting languages such as Python, Bash, or similar for automation tasks.

Monitoring and Logging: Strong understanding of monitoring, logging, and tracing concepts and best practices.

Problem Solving: Excellent analytical and problem-solving skills.

Collaboration: Strong communication skills and the ability to work effectively in a team environment.

Continuous Improvement: A proactive attitude towards identifying opportunities for improvement and implementing solutions.


Preferred Qualifications:


Certifications: Relevant certifications such as Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).

Cloud Platforms: Experience with cloud platforms such as AWS, including Amazon EKS.

DevOps Practices: Familiarity with DevOps practices and tools.