Site Reliability Engineer SME

4 weeks ago


San Antonio, United States Rackner Full time

Title: Site Reliability Engineer SME

Location: San Antonio, TX (Top Secret required personnel may be remote but may be required to be onsite to access the SCIF facilities in San Antonio up to one week, 2 times per month)

Clearance: Top Secret (SCI eligible)

About this role:

Rackner is seeking a Site Reliability Engineer SME to work with the DSO Platform Product Line Manager and Infrastructure as Code SME, as well as other team members to ensure that the overall platform design and implementation is able to meet or exceed the service level objectives and agreements relating to uptime, availability, and mean-time-to-recover.

We are seeking professionals with:

BS; Engineering, Computer Science, or technical degree or industry experience equivalent

10+ years; Contributing on a technical team for a software or IT project.

4+ years; use of DevSecOps in support of system integrity, availability, and security

3+ years; Developing containerized applications and delivering them to a containerized platform

3+ years; Providing technical guidance to more junior team members

DoD 8570/8140 IASAE Level II within 30 days of start:

CASP+ CE

CISSP (or Associate)

CSSLP

CISSP-ISSAP

CISSP-ISSEP

CCSP

What will make you successful:

Manage and maintain availability and reliability of critical platform services and applications, ensuring they meet requirements of internal and external users

Collaborate with business leaders in building and running sustainable production systems, which can evolve and adapt to changes in a global business environment

Evaluate performance results and recommend major changes affecting short-term project growth and success

Run infrastructure with Chef, Ansible, Terraform, GitLab, CI/CD and Kubernetes

Build monitoring that alerts on symptoms rather than on outages

Document every action so your findings turn into repeatable actions and then into automation

Use the GitLab product to run GitLab.com as a first resort and improve the product as much as possible

Improve operational processes

Design, build, and maintain core infrastructure that enables GitLab scaling to support hundreds of thousands of concurrent users

Debug production issues across services and levels of the stack

Nice to have:

M.S.; Computer Science or Engineering field

10+ years; Contributing on a technical team

5+ years; Providing technical guidance to more junior team members

DoD 8570/8140 IASAE Level II at start

Who We Are:

Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector.

We are an energetic, growing consultancy with a passion for solving big problems for both startups and enterprises.

We enable digital transformation for large organizations through the newest distributed technologies. We are laser-focused on end-to-end application development, DevSecOps, AI/ML, and systems architecture, and our methodology centers on cloud-first, cost-effective innovation.

Our customers hail from a diverse, ever-growing list of industries.

Benefits/Additional Info:

Rackner embraces and promotes employee development and training and covers the cost of certifications relevant to a position and the technologies/services provided. Fitness/gym membership eligibility, a weekly pay schedule, and employee swag, snacks, and events are offered as well.

401K with 100% matching up to 6%

Highly competitive PTO

Great health insurance with large network of providers

Medical/Dental/Vision

Life Insurance, and short & long term disability

Industry-Leading Weekly Pay Schedule

Home office & equipment plan
