Site Reliability Engineer

3 weeks ago


Town of Poland, United States DevOps projects Full time

Job Overview

As a Site Reliability Engineer (SRE) at Ververica, you will design, provision, and maintain the infrastructure for Ververica’s Unified Streaming Data Platform across multiple cloud providers, including AWS, GCP, and Azure. Your role will involve architectural improvements, implementation ownership, and driving reliability best practices.

Key Responsibilities

  • Build and maintain the infrastructure for Ververica’s Unified Streaming Data Platform across AWS, GCP, and Azure.
  • Design and manage Infrastructure as Code (IaC) using Terraform.
  • Implement and enhance observability tooling, including Grafana, Prometheus, logging systems, traces, metrics, dashboards, and alerts.
  • Ensure system reliability through SRE best practices.
  • Improve infrastructure architecture and engineering efficiency through continuous evaluation and optimization.
  • Enhance CI/CD pipelines to automate development workflows.
  • Monitor, identify, and resolve security vulnerabilities.
  • Contribute to the successful development and launch of new products, features, and services.
  • Participate in on‑call rotations to manage incidents in a 24/7 live infrastructure.
  • Maintain and update documentation.

Required Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • Minimum 2 years of hands‑on experience with Kubernetes clusters, Helm charts, controllers, and operators.
  • Proficiency in designing and maintaining Terraform code.
  • Strong knowledge of observability tools and practices.
  • Experience implementing SRE principles.
  • Solid understanding of Linux systems and networking in cloud environments.
  • Familiarity with distributed systems or streaming data platforms.
  • Knowledge of cloud-native security best practices.



  • Town of Poland, United States Mirantis Full time

    About Mirantis Mirantis is a Kubernetes-native AI infrastructure company that enables organizations to build and operate scalable, secure, and sovereign infrastructure for modern AI, machine learning, and data‑intensive applications. By combining open‑source innovation with deep expertise in Kubernetes orchestration, Mirantis empowers platform...


  • Town of Poland, United States E-Solutions Full time

    Site Reliability Engineer Build and maintain SRE dashboards using SLIs to measure and monitor SLO adherence. Define and implement auto-healing, resilient, and fault-tolerant systems from design through production. Serve as the primary contact for production application issues, coordinating with engineering teams to resolve incidents efficiently. Diagnose and...


  • Town of Poland, United States XM Full time

    Site Reliability Engineers (SRE) - Multiple Openings The Role: You will join a team working with Observability, Escalations, Post-mortems, Correction of Errors, and other practices that will contribute to the company's goal of cloud resiliency. You will be responsible for driving processes around reliability, best practices, cultural change, and enforcement...


  • Town of Poland, United States EPAM Systems Full time

    Lead Site Reliability Engineer with Dynatrace 2 days ago We are seeking a Lead Site Reliability Engineer to enhance and migrate observability solutions using Dynatrace. You will play a key role in establishing advanced monitoring frameworks and deploying AI‑driven anomaly detection to improve system reliability. This...


  • Town of Poland, United States VGW Malta Limited Full time

    VGW is an interactive entertainment company, harnessing technology and creativity to deliver world-class, free-to-play online social games. We have an exciting opportunity to join our Engineering team in Poland and are currently looking for a Senior Site Reliability Engineer to join the team. You'll focus on ensuring the reliability of our systems as we bring...


  • Town of Poland, United States Coupa Software, Inc. Full time

    Coupa makes margins multiply through its community-generated AI and industry-leading total spend management platform for businesses large and small. Coupa AI is informed by trillions of dollars of direct and indirect spend data across a global network of 10M+ buyers and suppliers. We empower you with the ability to predict, prescribe, and automate smarter,...


  • Town of Poland, United States PRIMUS Global Technologies Pvt Ltd Full time

    Sr. Site Reliability Engineer, 100% Remote Work (Poland) 4 days ago 6 months contract to hire Bill Rate: $49.00/hr. USD (From Apex to PRIMUS US) – Cannot go above this bill rate Client is ABBYY Interview Process: 2 Technical Video Interview IMP NOTE: Candidates must be in...


  • Town of Charlotte, United States National Black MBA Association Full time

    Job Description At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day. Being a Great Place to Work is core to how we drive Responsible Growth. This includes our...


  • Town of Belgium, United States Intigriti Full time

    Your mission As a Site Reliability Engineer, you are part of the Product & Engineering team at Intigriti. In your day-to-day activities, you ensure the continuous availability of our development pipeline and cloud infrastructure. In a proactive way, you safeguard our cloud environment by analyzing, implementing, and delivering qualitative ‘cloud...


  • Town of Texas, United States InfStones Full time

    Blockchain Site Reliability Engineer Location: Dallas, TX, USA (Remote Acceptable - USA Applicants Only) Company: InfStones (https://infstones.com/) Contact: recruiter-usa@infstones.com About Company InfStones is an advanced, enterprise-grade Platform as a Service (PaaS) blockchain infrastructure provider trusted by the top blockchain companies in the world....