Site Reliability Engineer

4 weeks ago


Reston, Virginia, United States | Red Gate Group | Full-time

Company Description

At RED GATE we do everything we can to serve our clients:
Using the right technical skills, unique methodologies, best practices, and integrated technology, we help clients implement bold solutions. New approaches to emerging and evolving threats. Non-traditional ways to overcome entrenched obstacles. Advantage through opportunity. If you have a serious challenge or problem, we can help you solve it. The job description below explains how this role helps us serve our clients.

Job Description

The Red Gate Group is seeking a Site Reliability Engineer to support DTRA. This hybrid role combines on-site and remote work. You'll enhance system resilience and efficiency for the DoD by building robust infrastructure: drawing on your expertise in Kubernetes, Ansible, AWS cloud migration, and Cloudera, you'll build in redundancy, implement monitoring tools, and automate processes to reduce toil. The position also offers the opportunity to guide junior engineers and broaden your own knowledge while contributing to innovative cloud migration solutions.

Responsibilities:

  • Develop resilient infrastructure for the DoD.
  • Implement monitoring tools and automate routine tasks.
  • Build or modify Ansible playbooks with Bash scripts.
  • Troubleshoot and resolve issues related to CI/CD pipeline failures.
  • Collaborate with application development teams across the software development life cycle.

Qualifications

Required Skills & Qualifications

  • Active TS/SCI
  • 5+ years of experience working in Linux environments
  • 5+ years of experience troubleshooting, triaging, and resolving CI/CD pipeline failures or slowness in production enterprise environments
  • Experience developing enterprise cloud-native solutions involving Kubernetes, Docker, Cloudera, AWS, Jenkins, or RHEL systems
  • Experience working with application development teams across the software development life cycle and creating solutions to complex problems in a collaborative team environment
  • Ability to build or modify Ansible playbooks with Bash scripts
  • Active DoD 8570 Level II Security Certification, including Security+

Desired Skills & Qualifications

  • Experience with Python and Go, microservices, serverless, MLOps, AIOps, Cloudera, and Kubernetes
  • Experience with a big data stack using Hadoop, Spark, Accumulo or MongoDB, and Solr or Elasticsearch
  • Experience with software development processes and code management tools and processes
  • Experience with declarative Infrastructure as Code tools, including Puppet, Terraform, and Ansible
  • Experience with GitOps and CI/CD tools, including Argo CD, GitLab CI, or Jenkins
  • Excellent verbal and written communication skills

Additional Information

The Red Gate Group, Ltd. is an Equal Opportunity/Affirmative Action Employer. The Red Gate Group, Ltd. considers applicants without regard to race, color, religion, age, national origin, ancestry, ethnicity, gender, gender identity, gender expression, sexual orientation, marital status, veteran status, disability, genetic information, citizenship status, or membership in any other group protected by federal, state, or local law. EEO is the Law


