Senior Site Reliability Engineer

3 months ago


Washington, United States Sparibis Full time
Location: 100% remote

Years' Experience: 10+ years

Education: Bachelor's degree

Work Authorization: United States Citizenship is required as part of the eligibility criteria to be able to obtain a security clearance.

Clearance: Applicants must be able to obtain and maintain a Public Trust security clearance.

Key Skills:
  • Must have experience serving as an SRE
  • Prior leadership experience, including leading a team
  • Deep understanding of SRE principles for highly scalable and reliable systems.
  • Configuration Management and Infrastructure as Code expertise
Responsibilities
  • Responsible for incident response, monitoring, alerting, triaging and closing of real problems
  • Ensure platform stability and availability
  • Responsible for metrics reporting and tracking, evaluating proper function, and supporting teams to enhance performance
  • Design and implement end-to-end continuous delivery pipelines.
  • Leverage extensive AWS cloud experience in a production environment (e.g., network, security, deployment, automation, serverless technologies).
  • Utilize a deep understanding of SRE principles for highly scalable and reliable systems.
  • Leverage extensive experience with Configuration Management and Infrastructure as Code.
  • Works with application teams to document application internal/external interface requirements for Development, Testing, Staging and Production environments
  • Works with application teams to ensure compliance with High Availability and Disaster Recovery concepts of operations.
  • Build service level requirements for SLAs
  • Implements middleware and application-specific requirements as needed
  • Implements migration efforts with application teams, including data migration
  • Serve as a thought leader for agile development teams.
  • Establish clarity of direction and a shared vision of success that is championed by team members, stakeholders, and product owners.
  • Build relationships, and work in collaboration with team members, stakeholders, product owners, and technical team leads.
  • Help enhance processes, communication, and delivery through new norms that improve how work is done - from discovery to delivery.
  • Provides technical guidance to application teams to take advantage of cloud technologies, and implement cloud infrastructure, as needed.
Qualifications
  • 10+ years of software engineering and DevOps experience
  • Bachelor's degree or higher required
  • Must be able to obtain and maintain a Public Trust security clearance
  • Must have experience supporting highly scalable and reliable systems by implementing and maintaining processes and tools
  • Incident response, monitoring performance and releases, alerting, and triaging expertise
  • ServiceNow, AWS Insight, Splunk, VictorOps, CloudWatch, New Relic, and Confluence expertise preferred
  • Experience in designing and implementing end-to-end continuous delivery pipelines.
  • Deep AWS cloud experience in a production environment (e.g., network, security, deployment, automation, serverless technologies).
  • Experience with and understanding of SRE principles for highly scalable and reliable systems.
  • Strong experience with Configuration Management and Infrastructure as Code.
  • Experience designing and implementing end-to-end CI/CD pipelines
  • AWS cloud experience in a production environment (e.g., network, security, deployment, automation, serverless technologies)
  • Experience designing and building web application environments on AWS including services such as EC2, S3, Lambda, ELB, ECS etc.
  • Experience deploying cloud resources using IaC tools such as Terraform.
  • Experience with monitoring and logging tools such as CloudWatch, AppDynamics, and Splunk, including creating CloudWatch rules to capture application alerts and send notifications
  • Previous experience migrating application teams from on-prem to cloud infrastructure (AWS, Azure) preferred.
  • Experience with CI/CD frameworks (e.g., Jenkins, Docker, Ansible, Chef, Puppet, Git)
  • Experience with at least one automation or scripting language (e.g., Bash, Python, Shell, Perl)
  • Experience designing and building migrations of on-premises CIFS and NFS file shares to AWS storage services (S3, EFS, or FSx) using AWS DataSync and VPC endpoints.
  • Experience creating build plans for AWS deployments that list compute resources, security groups, load balancers, target groups, NACLs, and all other components for each environment (Dev, TQA, Prod, etc.)
  • Experience maintaining and administering configuration management systems such as Enterprise GitHub.
  • Experience maintaining and administering software build systems such as Jenkins.
  • Experience maintaining and administering artifact repository systems such as Artifactory.
  • Ability to automate workflows through scripting or other technologies such as Ansible or Puppet.
  • Expertise in Agile and DevSecOps approaches


About Sparibis

Sparibis LLC is a professional solutions firm that clients rely on to access the best talent to drive their business success.

Sparibis is an equal opportunity employer that values diversity at all levels. All individuals, regardless of personal characteristics, are encouraged to apply.