Site Reliability Engineer

3 weeks ago

California, United States PayNearMe Full time

At PayNearMe, we’re on a mission to make paying and getting paid as simple as possible. We build innovative technology that transforms the way businesses and their customers experience payments. Our industry-leading platform, PayXM™, is the first of its kind, designed to manage the entire payment experience from start to finish. Every click, swipe, or tap is seamless, fast, and secure, helping non-commerce businesses boost customer satisfaction, accelerate payments, and reduce costs. Our single platform handles it all: cards, ACH, digital wallets such as PayPal, Venmo, Cash App Pay, Apple Pay, and Google Pay, and even cash at more than 62,000 retail locations nationwide. Today, thousands of businesses across consumer lending, iGaming and online sports betting, property management, and toll… In September 2025, we raised a $50 million Series E positi… We’re a team of 200+ employees across 41 states, headquartered in Silicon Valley with satellite offices in Dallas, TX and Holmd… Join us and be part of a team that’s shaping the future of payments, one experience at a time.

Job Description

As our Site Reliability Engineer, you will design, build, and maintain the systems and infrastructure that power our applications, ensuring their reliability, scalability, and performance. You will bring a software engineering approach to operations, automating processes and continuously improving the infrastructure and tools that support our business needs.

Responsibilities

Infrastructure Management: Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code, ensuring high availability and performance.
Kubernetes and Containers: Deploy, manage, and optimize Kubernetes clusters and containerized applications using Docker. Implement best practices for container orchestration and management.
Systems and Application Monitoring/Observability: Develop and maintain comprehensive monitoring and observability solutions using Datadog. Ensure detailed visibility into system performance and application health.
SLO and SLA Management: Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure reliable and consistent service delivery.
Incident Response and Troubleshooting: Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence. Participate in post-incident reviews and contribute to blameless postmortems.
Reliability and Production Environment Management: Ensure the reliability and stability of our production environments. Continuously assess and improve system reliability, identifying and addressing potential points of failure.
Automation and Scripting: Develop automation scripts and tools in Python, Bash, or Go to reduce manual intervention and improve system reliability. Implement and improve CI/CD pipelines.
CI/CD Pipeline Management: Enhance and maintain continuous integration and continuous deployment pipelines using GitLab CI. Ensure seamless and reliable deployment processes.
Capacity Planning and Scaling: Assist in capacity planning and ensure that systems can scale to meet future demands. Implement auto-scaling strategies where applicable.
Security and Compliance: Implement security best practices and ensure compliance with industry standards. Regularly review and update security policies and procedures.
Collaboration and Support: Work closely with development teams to ensure the reliability and scalability of new features and services. Provide technical support and guidance on infrastructure-related issues.
Software Engineering for Operations: Develop and maintain internal tools and services that enhance the efficiency and reliability of our operations.
On-Call Rotation: Participate in an on-call rotation to address production issues and collaborate in incident response efforts.

Qualifications

3+ years of experience in SRE, DevOps, or similar role.
Cloud Platform Experience: Proficient with cloud platforms such as AWS, GCP…
Kubernetes and Containers: Strong experience with Kubernetes and Docker, including deployment, scaling, and management of containerized applications.
Infrastructure as Code: Expert in using Terraform for infrastructure as code. Proficient with configuration management tools such as Ansible, Puppet, or Chef.
Monitoring and Observability: Extensive experience with monitoring and observability tools like Datadog, Prometheus, Grafana, ELK stack, or Splunk. Skilled in setting up detailed monitoring and logging systems.
SLO and SLA Management: Proven ability to define, monitor, and maintain SLOs and SLAs to ensure reliable service delivery.
Scripting and Automation: Strong skills in scripting languages like Python, Bash, or Go. Experience automating repetitive tasks and processes.
CI/CD Practices: Familiarity with GitLab CI or a similar tool for continuous integration and deployment. Experience in setting up and managing pipelines.
Production Environments: Experience supporting production environments running Go or Ruby/Rails applications.
Tool Development: Ability to write and update…
DevOps Best Practices: Deep understanding of DevOps principles, practices, and tools to drive continuous improvement in the software development lifecycle.
Soft Skills: Strong organizational skills, attention to detail, and the ability to work collaboratively in a team environment. Excellent documentation skills to ensure accurate and detailed records.
Problem-Solving Ability: Excellent analytical and problem-solving skills to diagnose and resolve complex system issues quickly and effectively.

Additional Information

Location: Flexible/Remote (within the US)
Salary Range: $175,000 - $195,000
100% remote (must be in US)
Base salary per year (paid semi-monthly)
Fast-paced and professional work culture
Stock options with standard startup vesting: 1-year cliff, 4 years total
$50 monthly communication expense stipend to go towards your phone/internet bill
$250 stipend to enhance your WFH setup
Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
Premium medical benefits including vision and dental (100% coverage for employees)
Company-sponsored life and disability insurance
Paid parental bonding leave
Paid sick leave, jury duty, bereavement
401k plan
Flexible Time Off (our team members typically take off ~3-4 weeks per year)
2x / year in-person team meet-ups (2-3 days, company paid)

PayNearMe strives to create a workplace where all employees thrive. Our core values represent who we are today, and we take pride in the way we work with each other as well as with our stakeholders. We’re in this together to do the right thing. We deliver real results we are proud of while remaining respectful, transparent, and flexible.

PayNearMe is an equal opportunity employer. We are diligently and thoughtfully working towards cultivating a diverse workforce, which in turn enhances our products and services for the communities we serve. Applicants who represent all backgrounds are strongly encouraged to apply. Alternative formats of this Notice are available to individuals with a disability. Please let us know if you need assistance. All your information will be kept confidential according to EEO guidelines.

Site Reliability Engineer

3 weeks ago

California, United States Booz Allen Hamilton Full time

Site Reliability Engineer. Engineering to make a system more...
Site Reliability Engineer

3 weeks ago

California, United States Phase2 Technology Full time

Job Number: R The Opportunity: Engineering to make a system more resilient and efficient frees up time and money to build more capabilities. Whether you come from a background in network engineering, systems administration, or software development—if you have a passion for making systems better, we need you! As a Site Reliability Engineer on our team,...
Site Reliability Engineer

29 minutes ago

California, MD, United States Booz Allen Hamilton Full time

Job Number: R0230215 Site Reliability Engineer The Opportunity: Engineering to make a system more resilient and efficient frees up time and money to build more capabilities. Whether you come from a background in network engineering, systems administration, or software development, if you have a passion for making systems better, we need you! As a Site...
Site Reliability Engineer

3 weeks ago

California, United States Longbridge Singapore Full time

Site Reliability Engineer (Remote - United States) Longbridge is a fast-growing online brokerage platform on a mission to make investing smarter, simpler, and more accessible for everyone. As part of our global expansion, we’re looking for a hands-on Site Reliability Engineer (SRE) to design, scale, and safeguard the reliability of our next-generation...
Remote Site Reliability Engineer

3 weeks ago

California, United States PayNearMe Full time

A leading fintech company in California is seeking an experienced Site Reliability Engineer to design, build, and maintain systems and infrastructure for enhanced reliability and performance. You'll manage Kubernetes clusters, implement monitoring solutions, and automate processes using Terraform and scripting. This remote role offers competitive benefits,...
Staff Site Reliability Engineer

3 weeks ago

California, United States Motive Full time

Who we are: Motive empowers the people who run physical operations with tools to make their work safer, more productive, and more profitable. For the first time ever, safety, operations and finance teams can manage their drivers, vehicles, equipment, and fleet related spend in a single system. Combined with industry leading AI, the Motive platform gives you...
Director, Site Reliability Engineering

2 hours ago

US - California (San Diego - Office) Insulet Corporation Full time

The Director of Site Reliability Engineering (SRE) will provide strategic leadership and technical direction for the reliability, scalability, and performance of our mission‑critical systems and services. This role combines deep SRE expertise with strong engineering leadership, driving organizational transformation toward reliability-first principles. The...
Site Reliability Engineer — Build Resilient, Automated Infra

3 weeks ago

California, United States Booz Allen Hamilton Full time

A leading technology and consulting firm is seeking a Site Reliability Engineer to lead the development of robust systems and enhance operational efficiencies. The ideal candidate will have over 7 years of experience maintaining highly scalable systems and a passion for automation and infrastructure management. With a focus on improving system resilience,...
Senior Site Reliability Engineer — On-Prem

3 weeks ago

California, United States Phase2 Technology Full time

A leading tech firm in Maryland is seeking a Site Reliability Engineer to enhance system resilience and efficiency. This role involves leading the development of robust systems, implementing monitoring tools, and automating repairs. The ideal candidate will have over 7 years of experience and a Bachelor's degree. Top Secret clearance is mandatory. Join us to...
RELIABILITY ENGINEER

3 weeks ago

California, United States StaffWorthy Full time

Preferred Skills/Qualifications: Bachelor's degree in Industrial, Mechanical, or Electrical Engineering. 5 years of experience in Maintenance, Reliability, Production Management, Engineering, or Operations. Preferred Skills/Qualifications: Previous experience/education in the aluminum industry. Strong knowledge of preventative and predictive maintenance...
