Cloud Site Reliability Engineer
3 weeks ago
What You'll Get to Do:
As a Cloud Site Reliability Engineer (SRE), you’ll help ensure the mission is never interrupted, that today is safe, and that tomorrow is smarter. Our work depends on talented people joining our team to help transition legacy technologies to cloud infrastructure efficiently and securely.
- Run the production environment by monitoring availability and taking a holistic view of system health.
- Build software and systems to manage platform infrastructure and applications.
- Improve the reliability, quality, and performance of cloud-hosted applications.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Provide primary operational support and engineering for multiple large, distributed software applications.
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
- Partner with development teams to improve services through rigorous testing and release procedures.
- Participate in system design consulting, platform management, and capacity planning.
- Create sustainable systems and services through automation and uplifts.
- Balance feature development speed and reliability with well-defined service level objectives.
What You’ll Need to Succeed:
Education:
- Bachelor’s Degree in a STEM field.
- DoD 8570 Level II (Security+).
Required Experience: 8+ years of related experience
Required Technical Skills:
- Ability to program (structured and OO) with one or more high level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
- Adept at shell/Bash scripting.
- Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3.
- 2+ years of experience working with container orchestration technologies, specifically Kubernetes.
Security Clearance Level: Secret to start; must be able to obtain TS/SCI.
Required Skills and Abilities:
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks, along with the ability to offer and implement solutions to them.
- Experience creating service-health dashboards that appeal to both technical and non-technical audiences, preferably with Splunk.
- Excellent written and verbal communication skills, with a strong attention to detail and a head for problem solving.
- Skilled at working in tandem with a team or unsupervised, as required.
Preferred Skills:
- Experience working with identity and access management technologies and solutions.
- Experience with Agile development methodologies and collaboration tools such as Jira and Confluence.
- Experience with monitoring and logging solutions, specifically Splunk.
- Any of the following: AWS Certified SysOps Administrator Associate, AWS Certified Solutions Architect Associate, or any Professional level of these certifications, where applicable.
- 1+ years’ experience working with GitLab.
- Skilled at creating Ansible playbooks and working with AWX/Ansible Tower.
Location: On Customer Site
US Citizenship Required
-
Site Reliability Engineer
5 days ago
Fort Bragg, United States Booz Allen Hamilton Full time
Site Reliability Engineer
The Opportunity: Everyone is trying to “harness the power of the cloud,” but not everyone knows how. As a Site Reliability Engineer, you know how to build resilient platforms that meet customer needs and take advantage of the power of containerization both in the cloud and on premises. What if you could use your Kubernetes...
-
Senior Site Reliability engineer
4 weeks ago
Fort Worth, United States M2S Tech Solutions Full time
Role: Senior Site Reliability Engineer
Location: Fort Worth, TX (Hybrid)
Visa: USC
Mandatory Skills: Azure DevOps, Azure Cloud Native Security, SRE, GitHub, Infra CI/CD Pipelines, Dynatrace
Senior Site Reliability Engineer Requirements & Skills:
- Hands-on experience as SRE
- Experience with Azure cloud
- Experience with APM tools Dynatrace SaaS, Mezmo (LogDNA) and...
-
Site Reliability Engineer
3 weeks ago
Fort Washington, United States JR Technologies Full time
At JR Technologies, our vision is to create the new customer-centric distribution landscape of tomorrow. Working with us offers many opportunities to experienced professionals who are interested in joining a strong team, learning and mentoring in a dynamic environment, honing professional and technical abilities, and who thrive on new challenges. We provide...
-
Site Reliability Engineer
6 days ago
Fort Liberty, United States Booz Allen Hamilton Full time
Job Description
Location: Fort Bragg, NC, US
Remote Work: No
Job Number: R0188764
Site Reliability Engineer
The Opportunity: Everyone is trying to “harness the power of the cloud,” but not everyone knows how. As a Kubernetes platform engineer, you know how to build resilient platforms that meet customer needs and take advantage of the power of...
-
Senior Site Reliability Engineer
4 weeks ago
Fort Worth, United States Cynet Systems Full time
Job Description
Responsibilities: Oversee the design and maintenance of geospatial databases to ensure optimal performance and reliability. Implement and manage Azure DevOps practices for geospatial project lifecycle, enhancing collaboration and efficiency. Utilize GitHub for version control and collaboration across geospatial development...
-
Senior Site Reliability Engineer
2 weeks ago
Fort Worth, United States Cynet Systems Full time
Job Description
Responsibilities: Oversee the design and maintenance of geospatial databases to ensure optimal performance and reliability. Implement and manage Azure DevOps practices for geospatial project lifecycle, enhancing collaboration and efficiency. Utilize GitHub for version control and collaboration across geospatial development projects. ...
-
Site Reliability Engineer
3 weeks ago
Fort Wayne, United States Sentara Healthcare Full time
Role Overview
Site reliability engineers (SREs) are responsible for improving system reliability and resilience to make it faster and easier to develop and deploy new software capabilities. SREs focus especially on building automation to reduce manual effort and prevent operations incidents.
Key Responsibilities
Work with stakeholders such as product owners...
-
Cloud Service Reliability Lead
3 weeks ago
Fort Washington, United States Devstyler Full time
You’ll know A1 Bulgaria is the right place for you if you are driven by:
- Opportunities to learn and build your career;
- Meaningful work in a stable and fast-paced company;
- Diversity of people, projects, and platforms;
- A supportive, fun, and inspiring place to work.
For our team we are looking for: Cloud Service Reliability Lead
We are seeking a highly...
-
Senior Cloud Engineer
3 weeks ago
Fort Bragg, United States Kaimetrix, L.L.C. Full time
Job Description
Kaimetrix (KMX) is a highly disciplined IT Services organization serving the commercial sector and federal agencies across the civilian, defense, and intelligence communities. Formed in 2013, our mission is to build a better way to solve the federal government’s most urgent and challenging IT issues. Instead of delivering status...
-
Senior Cloud Engineer
2 weeks ago
Fort Bragg, United States Kaimetrix, L.L.C. Full time
Job Description
Kaimetrix (KMX) is a highly disciplined IT Services organization serving the commercial sector and federal agencies across the civilian, defense, and intelligence communities. Formed in 2013, our mission is to build a better way to solve the federal government’s most urgent and challenging IT issues. Instead of delivering status...
-
Site Reliability Engineer II
3 days ago
Fort Lauderdale, United States Chewy Full time
Our Opportunity: We are looking for a Site Reliability Engineer II at our facility in Plantation, FL to focus on service stability and reliability by working with application owners to set SLOs, "Error Budget" and backup and DR strategies.
What You'll Do: Define application monitoring and alerting strategy. Perform capacity planning and production readiness...
-
Site Reliability Engineer II
3 weeks ago
Fort Lauderdale, United States Chewy Full time
Our Opportunity: We are looking for a Site Reliability Engineer II at our facility in Plantation, FL to focus on service stability and reliability by working with application owners to set SLOs, "Error Budget" and backup and DR strategies.
What You'll Do: Define application monitoring and alerting strategy. Perform capacity planning and production readiness...
-
Site Reliability Engineer II
2 weeks ago
Fort Lauderdale, United States Chewy Full time
Our Opportunity: We are looking for a Site Reliability Engineer II at our facility in Plantation, FL to focus on service stability and reliability by working with application owners to set SLOs, "Error Budget" and backup and DR strategies.
What You'll Do: Define application monitoring and alerting strategy. Perform capacity planning and production readiness...
-
Cloud DevOps Engineer
3 weeks ago
Fort Liberty, United States General Dynamics Information Technology Full time
Type of Requisition: Pipeline
Clearance Level Must Currently Possess: Top Secret/SCI
Clearance Level Must Be Able to Obtain: Top Secret/SCI
Suitability: Public Trust/Other Required:
Job Family: Cloud
Job Qualifications:
Skills: Amazon Web Services (AWS), Cloud DevOps, Kubernetes
Certifications:
Experience: 1+ years of related experience
US Citizenship...
-
Cloud Engineer
4 weeks ago
Fort Meade, United States SAIC Full time
Description
SAIC has a new opportunity for a Cloud Engineer either at Fort Meade, MD or Offutt AFB, NE providing support to the Nuclear Command, Control, and Communications (NC3) Enterprise Center (NEC) Systems Engineering & Integration (SE&I) Division for an Amazon Web Services (AWS) based environment.
Duties Include: Design and support deployment of...
-
Regional Support Engineer Site Lead
5 days ago
Fort Bragg, United States PVM Inc Full time
Job Description
Salary:
Regional Support Engineer Site Lead
Fort Liberty, North Carolina
ACTIVE TS clearance with SCI eligibility required
PVM is actively seeking a Regional Support Engineer Site Lead located in Fort Liberty, North Carolina. In this role, you will oversee and manage technical support operations within a designated region. This...
-
Reliability Engineer
2 months ago
Fort Wayne, United States Lozier Full time
Job Description - Reliability Engineer (24000264)
COMPANY OVERVIEW
Lozier Corporation is an industry leader in providing store fixtures to major retailers across the U.S. and around the world. Headquartered in Omaha, Nebraska, Lozier began manufacturing fixtures in 1956, and originated the basics of today’s shelving...
-
Cloud DevOps Security Engineer
3 weeks ago
Fort Lauderdale, United States Venture Tech Solutions, Inc. Full time
Role: GCP DevOps Security Engineer
This position requires employees to be in the Ft Lauderdale, FL office weekly. Qualified candidates must be legally authorized to work in the US and not require sponsorship now or in the future.
Position Overview: We are seeking a highly skilled and motivated Cloud DevOps Engineer to join our dynamic team. As a Cloud DevOps...
-
Regional Support Engineer Site Lead
1 day ago
Fort Bragg, United States iGov Full time
Job Description
Title: Regional Support Engineer Site Lead
Locations: Fort Liberty, NC
iGov is seeking experienced Support Site Engineer Leads to join our USSOCOM team and drive mission success for some of the most demanding Special Operations missions in the world. The Support Engineer may provide onsite support to SOCOM units in the...
-
Cloud DevOps Security Engineer
4 weeks ago
Fort Lauderdale, United States VentureTech Solutions Full time
Job Description
Role: GCP DevOps Security Engineer
This position requires employees to be in the Ft Lauderdale, FL office weekly. Qualified candidates must be legally authorized to work in the US and not require sponsorship now or in the future.
Position Overview: We are seeking a highly skilled and motivated Cloud DevOps Engineer to join our...