Senior Site Reliability Engineer

17 hours ago


Atlanta, Georgia, United States STORD Full time
About Stord

Stord is a leading commerce enablement provider of fulfillment services and technology that powers seamless checkout and delivery experiences for high-volume mid-market and enterprise brands across all channels. With a strong presence in the market, Stord manages over $5 billion of commerce annually through its fulfillment, warehousing, transportation, and operator-built software suite including OMS, Pre- and Post-Purchase, and WMS platforms.

Job Description

We are seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will be responsible for designing and implementing a resilient, efficient, and secure infrastructure and platform that meets the needs of our customers. You will work closely with cross-functional teams to deliver high-quality products and services at scale, while ensuring operational excellence and continuous improvement.

Key Responsibilities
  • Collaborate with cross-functional teams to design and implement CI/CD pipelines that automate fast and safe delivery of software to our customers.
  • Lead efforts in automating deployment, monitoring, and infrastructure management.
  • Proactively identify and resolve performance bottlenecks, system failures, and security vulnerabilities.
  • Develop SLOs and SLIs to manage risk through continuous monitoring and measurement of system performance.
  • Build, manage, and deploy highly available, self-healing, customer-facing production infrastructure and applications using Docker, Kubernetes, Helm, and Terraform.
  • Apply the Twelve-Factor App methodology when building and deploying all our services and systems.
  • Implement best-practice infrastructure-as-code (IaC) principles for configuration management and deployment of infrastructure.
  • Enhance operational efficiency by identifying repetitive tasks and developing automation to eliminate toil.
  • Implement robust metrics, monitoring and alerting for proactive issue identification and resolution.
  • Participate in incident response, on-call rotation and post-incident reviews to ensure 24/7 availability of critical systems and to learn from failures and continuously improve system reliability.
  • Implement and enforce security best practices for infrastructure and applications.
  • Collaborate with security teams to ensure compliance with industry standards and regulations.
Requirements
  • Proven experience as a Senior DevOps Engineer or Senior Site Reliability Engineer.
  • Strong expertise in cloud platforms such as AWS, GCP or Azure.
  • Strong experience with CI/CD tools (GitHub Actions, GitLab CI, CircleCI) and version control systems (Git).
  • Proficiency with infrastructure-as-code tools (e.g., Terraform, Ansible, CloudFormation).
  • Hands-on experience with containerization and orchestration tools such as Docker and Kubernetes.
  • Solid understanding of networking, security, and system engineering.
  • Experience with monitoring and logging tools (e.g., Datadog, Prometheus, Grafana, ELK stack).
  • Strong scripting skills in languages such as Python, Shell or similar.
  • Familiarity with security best practices and compliance requirements.
  • Excellent problem-solving and troubleshooting skills.
  • Ability to work collaboratively in a fast-paced, agile environment.
  • Passion for building high-quality, long-term solutions that delight both internal and external customers.
  • Automation-first mindset.
  • High degree of ownership and pride in your work.
What We Offer
  • Competitive salary and bonus.
  • Friendly, Passionate, and Intelligent Employee Base.
  • Creative Problem Solving and Entrepreneurial Thinking.
  • Fast-Paced Environment.
  • Low-Ego, Solution-Driven Culture.
  • Community Involvement and Volunteer Opportunities.
  • Employee Resource Groups: Women of Stord, JEDI (Justice, Equity, Diversity, & Inclusion), Stord-Serves, & More.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Stord participates in E-Verify and will provide the federal government with your Form I-9 information to confirm that you are authorized to work in the U.S.


