Cloud Site Reliability Engineer

3 weeks ago


Fort Bragg, United States Venatore LLC Full time
Job Description

What You'll Get to Do:

As a Cloud Site Reliability Engineer (SRE), you’ll help ensure that the mission is never interrupted, that today is safe, and that tomorrow is smarter. Our work depends on talented people joining our team to help transition legacy technologies to cloud infrastructure efficiently and securely.

  • Run the production environment by monitoring availability and taking a holistic view of system health.
  • Build software and systems to manage platform infrastructure and applications.
  • Improve reliability, quality, and performance for cloud-hosted applications.
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
  • Provide primary operational support and engineering for multiple large, distributed software applications.
  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Participate in system design consulting, platform management, and capacity planning.
  • Create sustainable systems and services through automation and uplifts.
  • Balance feature development speed and reliability using well-defined service-level objectives.


What You’ll Need to Succeed:
Education:

  • Bachelor’s Degree in a STEM field.
  • DoD 8570 Level II certification (Security+).


Required Experience: 8+ years of related experience

Required Technical Skills:

  • Ability to program (structured and object-oriented) in one or more high-level languages, such as Python, Java, C/C++, Ruby, or JavaScript.
  • Adept at shell/Bash scripting.
  • Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3.
  • 2+ years of experience working with container orchestration technologies, specifically Kubernetes.

Security Clearance Level: Secret to start; must be able to obtain TS/SCI.


Required Skills and Abilities:

  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks, along with the ability to propose and implement solutions.
  • Experience creating service-health dashboards that appeal to both technical and non-technical audiences, preferably with Splunk.
  • Excellent written and verbal communication skills, with strong attention to detail and a head for problem solving.
  • Skilled at working in tandem with a team, or unsupervised as required.

Preferred Skills:

  • Experience working with identity and access management technologies and solutions.
  • Experience with Agile development methodologies and collaboration tools such as Jira and Confluence.
  • Experience with monitoring and logging solutions, specifically Splunk.
  • Any of the following certifications: AWS Certified SysOps Administrator Associate, AWS Certified Solutions Architect Associate, or a Professional-level version of either, where applicable.
  • 1+ years of experience working with GitLab.
  • Skilled at creating Ansible playbooks and working with AWX/Ansible Tower.

Location: On Customer Site


US Citizenship Required


