Lead Azure Site Reliability Engineer

4 weeks ago


Washington DC, United States Mechanicode.io Full time

We are looking for a Lead Azure Site Reliability Engineer (SRE) to enable efficient monitoring and observability of the CDC Azure infrastructure and applications.

The SRE will lead operations of the cloud environment using observability, infrastructure-as-code (IaC), and cloud-native best practices.

The engineer will be part of a larger effort to modernize the CDC DevOps enterprise framework, joining a team of 20 made up of data scientists, software engineers, product owners, and DevOps engineers.

Mechanicode is a remote-first company, and this role will be 100% remote.

W2 Salary: 140-160k

Required

  • Must be a U.S. citizen or green-card holder
  • 8+ years of professional experience
  • Proven leadership track record
  • Ability to pass a background check and obtain a public trust security clearance

Essential Skills, Experience, and Competencies:

  • Proficient with observability in the cloud, including building monitoring and alerting frameworks (Grafana, Datadog, New Relic, etc.)
  • Has built alert escalation plans and disaster recovery infrastructure, and has set up on-call rotations
  • Proficient with implementing cloud infrastructure on Azure.
  • Proficient with Terraform
  • Experience with Linux and Bash scripting.
  • Experience with Kubernetes (AKS)
  • Substantial experience with programming languages like Python
  • Experience with containerization technologies (e.g., Docker, containerd)
  • Ability to develop the architecture for continuous integration and deployment as well as continuous monitoring
  • Experience supporting scalable and elastic applications on distributed architectures.
  • Strong understanding of, and ability to secure, systems at the application, network, and infrastructure layers.
  • Experience managing network/compute/database infrastructure with infrastructure-as-code.
  • Expert in basic Git operations such as cloning, creating branches, switching between branches, staging changes, committing, resetting, and merging.
  • Ability to mentor & support junior members
  • Proven ability to work under pressure and in fast-paced environments.
  • Ability to operate and manage work, reason strategically, build relationships, and influence others.

Nice to Have

  • Azure Certifications

Interview Steps

  • Preliminary Screen
  • CoderByte Assessment
  • Technical review
  • Client Review

Why Mechanicode?

Mechanicode’s vision is to bring peace of mind with technology.

We do so by building self-healing cloud infrastructure, resilient enough to withstand failures and sufficiently predictable to resolve issues without human intervention.

We do that by making automation the cornerstone of our cloud solutions, significantly reducing workforce attrition, and introducing agile rapid-development conventions that improve the developer experience.

About Mechanicode

Mechanicode is a cloud digital services firm providing comprehensive DevSecOps, Cloud Native Engineering, IT Modernization & Automation services.

Founded by a former USDS engineer, Mechanicode has 13 years of experience developing innovative automation solutions that improve the feedback loop in the developer experience, using AWS/Azure Certified best practices for clients.

Mechanicode has experience in both the public and private sectors, providing modernization services that engage Agile best practices, scalable cloud architectures, and continuous integration & deployment standards.


  • Washington, United States Mount Indie Full time

    Mount Indie is on the search for a Lead Site Reliability Engineering (SRE) to work remotely, focusing on delivering mission critical services that empower end users. The role will involve designing and implementing end to end CI/CD pipelines using AI/ML tooling. Responsibilities: • Design and implement end-to-end CI/CD pipelines. • Employ extensive...


  • Washington, Washington, D.C., United States SAIC Career Site Full time

    Description SAIC is seeking a motivated, experienced individual to act as an integral part of a client's program. As a member of the engineering team, the Azure DevOps Server Administrator is a critical contributor to the team's mission. We specialize in leveraging Microsoft Azure DevOps to streamline our development processes and enhance collaboration...


  • Washington, United States MetroStar Systems Full time

    As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal to make an impact across the federal government. We know that you can't have great technology services without...


  • Washington, United States SAIC Full time

    Description SAIC has an opening for a Cloud Engineer with experience designing and testing Azure Hyperconverged Infrastructure (HCI) to include integration with the Enterprise Azure Cloud Services (EACS). The Vanguard 2.2.1 program provides transparent, interconnected systems and security support for the Department of State (DOS) Bureau of Information...


  • Washington, United States Sparibis Full time

    Location: 100% remote Years' Experience: 10+ years of experience Education: Bachelor's degree Work Authorization: United States Citizenship is required as part of the eligibility criteria to be able to obtain a security clearance. Clearance: Applicants must be able to obtain and maintain a Public Trust security clearance. Key Skills: Must experience...


  • Washington, United States SAIC Full time

    Description SAIC is seeking a motivated, experienced individual to act as an integral part of a client's program. As a member of the engineering team, the Azure DevOps Server Administrator is a critical contributor to the team's mission. We specialize in leveraging Microsoft Azure DevOps to streamline our development processes and enhance collaboration...


  • Washington, United States Engility Corporation Full time

    SAIC is seeking a motivated, experienced individual to act as an integral part of a client's program. As a member of the engineering team, the Azure DevOps Server Administrator is a critical contributor to the team's mission. We specialize in leveraging Microsoft Azure DevOps to streamline our development processes and enhance collaboration across teams. As...


  • Washington, DC, United States Exelon Full time

    Description: We're powering a cleaner, brighter future. Exelon is leading the energy transformation, and we're calling all problem solvers, innovators, community builders and change makers. Work with us to deliver solutions that make our diverse cities and communities stronger, healthier and more resilient. We're powered by purpose-driven people like you who...


  • Washington, United States Palantir Technologies Full time

    Site Reliability Engineer - Security Infrastructure Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role Our products support...

  • Azure Data Engineer

    1 week ago


    Washington, United States Ampcus Full time

    The Azure Data Engineer will be responsible for designing, developing, and maintaining data pipelines, data models, and data storage solutions using Azure services. The candidate will work closely with cross-functional teams to ensure successful delivery of projects and support business objectives. The ideal candidate should have a strong technical...

  • Azure Data Engineer

    1 week ago


    Washington, United States AKT Corp Full time

    Job Description Title: Azure Data Engineer Location: Washington DC Responsibilities: Design, develop, and deploy data pipelines and ETL processes on Azure using services such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics. Implement data storage solutions on Azure, including Azure SQL Database, Azure Cosmos DB, and Azure Blob...


  • Washington, Washington, D.C., United States SAIC Career Site Full time

    Description SAIC is looking for an Azure Dynamics 365 Power Platform Engineer to join our Pension Benefit Guaranty Corporation (PBGC) team in Washington, DC. The Azure Dynamics CRM Engineer's primary responsibilities include the architecture (design/planning), management, support, and oversight of the Azure Dynamics CRM environment. Job Duties: Conceptualize,...

  • Azure Lead

    1 week ago


    Washington, United States Omni Inclusive Full time

    Technical Expertise: A strong background in software development is essential, typically with at least 9 years of experience in software development roles. The technical lead should have expertise in React, Node, and .NET Core programming languages and frameworks. They should have hands-on experience in designing and developing complex software systems....


  • Washington, United States Magnus Technology Solutions Full time

    Job Description We are seeking a Site Reliability Engineer Entry Level to join our team! You will develop and implement unique web-based applications. Responsibilities: Design, create, and modify websites. Convert written, graphic, audio, and video components to compatible web formats. Create back-end code and interfaces for new web...

  • Azure Cloud Architect

    4 weeks ago


    Washington, United States Super Systems Inc Full time

    Hybrid: the first 2-3 weeks will be onsite for training; once acclimated, the schedule moves to 2 days a week onsite and 3 days a week remote. Onsite days are Tuesdays and Thursdays. Summary: Responsibilities include: Architect, design, and deploy enterprise-level cloud-based solutions for hosting, backup, off-site data replication and service level capacity solutions...


  • Washington, Washington, D.C., United States SAIC Career Site Full time

    Description SAIC is looking for a System Administrator (Microsoft System Center Operation Manager/Cloud Monitoring Engineer) to join our team. The SCOM Engineer will provide support for our data centers that host core applications and infrastructure. This position is part of a team that tracks system performance using in-place monitoring tools such as SCOM...


  • Washington, Washington, D.C., United States SAIC Career Site Full time

    Description SAIC is seeking a motivated, experienced Active Directory (AD) Senior System Engineer with advanced PowerShell scripting capability, responsible for analysis, design, and implementation coordination and Tier-3 level support for complex, enterprise level AD and Cloud solutions in Washington, DC. As a senior member of the engineering team, will...