Principal Site Reliability Engineer
Found in: Jooble US O C2 - 2 weeks ago
- With a passion for reliability and performance, you will own uptime and support all customer-facing services and products
- Own and drive improvements to observability of service performance metrics, monitors, and alerting
- Provision, manage and automate our SaaS platform across multiple production and test environments
- Support and enhance build and release pipelines using process and tooling to provide self-service automations
- Collaborate with development teams on software and platform, helping to identify and remove potential performance bottlenecks
- Help our engineering partners establish SLIs and SLOs for their services
- Participate in the on-call rotation with the team
- Resolve incidents, perform root cause analysis, and grow our library of runbooks
- Implement and automate security controls, governance processes, and compliance validation
- Actively participate in and drive infrastructure architecture decisions
- Mentor junior members of the team
- Occasional domestic travel required for in-person team, department, and company meetings
- Minimum of 10 years of build automation and release management experience in a SaaS production environment
- Hands-on experience with Linux system administration and engineering
- Comfortable in a containerized world of Kubernetes (EKS), Helm, and ArgoCD
- Proficiency with configuration management tools such as Ansible, Chef, Salt
- Production experience in operations for an always-up, always-available mission-critical service
- Strong knowledge of ephemeral infrastructure, horizontal scaling, self-healing architectures, service discovery, logging, monitoring and alerting
- Expert-level experience with AWS and hybrid cloud systems/designs
- Proficiency with IaC tools such as Terraform and AWS CloudFormation
- Expert understanding of, and ability to troubleshoot, systems at the protocol layer: TCP/IP, UDP, SSL/TLS, and DNS
- Proficient with multiple scripting languages such as Bash, Python, or Go
- Experience developing CI/CD pipelines using Jenkins or Bitbucket Pipelines
- Knowledge of best-practice security, performance, and networking techniques for high-traffic customer-facing systems
- Experience with monitoring and logging tools such as New Relic or AWS CloudWatch
- Experience with relational and NoSQL databases, including Microsoft SQL Server, Postgres, and MongoDB
- Excellent troubleshooting and testing skills
- A passion for learning new technologies
- Experience with Agile methodology and passion for software development best practices
- Strong sense of collaboration, teamwork, and accountability
- Bonus: Experience working for a B2B SaaS company
- Competitive compensation package, including Employee Equity Appreciation Program
- Health insurance benefits
- 401k with employer match
- 100% remote work environment
- Unlimited paid time off (which includes paid holidays and Winter Break)
- Paid parental leave
- Tuition assistance and professional development and growth opportunities
- 100% paid life insurance and short- and long-term disability insurance
- Pre-tax medical and dependent care flexible spending accounts (FSA)
- Voluntary life and critical illness insurance
-
Principal Site Reliability Engineer
7 days ago
Bethesda, United States Teaching Strategies Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate all of the essential elements...
-
Principal Site Reliability Engineer
6 days ago
Bethesda, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate...
-
Principal Site Reliability Engineer
1 week ago
Bethesda, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate...
-
Sr. Site Reliability Engineer
6 days ago
Bethesda, United States Marriott Full time
Job Summary: Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis of applications and infrastructure in support of incident and problem investigation and application release management. Develops solutions to automate IT operations tasks. Develops custom monitoring solutions when...
-
Sr. Site Reliability Engineer – Monitoring
Found in: beBee S US - 6 days ago
Bethesda, United States Bethesda Marriott Full time
Job Number: 24059351
Job Category: Information Technology
Location: Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States
Schedule: Full-Time
Located Remotely? Y
Relocation? N
Position Type: Management
Job Summary: Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis...
-
Sr. Site Reliability Engineer – Monitoring
Found in: beBee jobs US - 6 days ago
Bethesda, Maryland, United States Bethesda Marriott Full time
Job Number:
Job Category: Information Technology
Location: Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States
Schedule: Full-Time
Located Remotely? Y
Relocation? N
Position Type: Management
Job Summary: Lead role in the Monitoring and Performance Management function at Marriott. Performs detailed performance analysis of...
-
Senior Site Reliability Engineer
6 days ago
Bethesda, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate...
-
Senior Site Reliability Engineer
1 week ago
Bethesda, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate...
-
Site Reliability Engineer with Security Clearance
Found in: Careerbuilder One Red US C2 - 2 days ago
Annapolis Junction, MD, United States 2HB Incorporated Full time
This is a full-time position and requires a TS/SCI/Full Scope Polygraph Clearance. 2HB Incorporated is seeking a Site Reliability Engineer to support its government customer in Annapolis Junction, MD. The Site Reliability Engineer provides support in software development/engineering, including requirements analysis, software development, installation,...
-
Database Reliability Engineer
Found in: Jooble US O C2 - 2 weeks ago
Bethesda, MD, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate all of the essential elements...
-
Principal Engineer, Network DevOps
Found in: beBee S US - 1 day ago
Bethesda, United States Bethesda Marriott Full time
Job Number: 24073054
Job Category: Information Technology
Location: Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States
Schedule: Full-Time
Located Remotely? Y
Relocation? N
Position Type: Management
Job Summary: The Principal Engineer, Network DevOps will lead in technical direction and innovation strategy for Cloud and On Prem...
-
Principal Engineer, Network DevOps
Found in: beBee jobs US - 15 hours ago
Bethesda, Maryland, United States Bethesda Marriott Full time
Job Number:
Job Category: Information Technology
Location: Marriott International HQ, 7750 Wisconsin Avenue, Bethesda, Maryland, United States
Schedule: Full-Time
Located Remotely? Y
Relocation? N
Position Type: Management
Job Summary: The Principal Engineer, Network DevOps will lead in technical direction and innovation strategy for Cloud and On Prem journey...
-
DevOps Engineer Assoc. Principal
Found in: Jooble US O C2 - 2 weeks ago
Bethesda, MD, United States Bigbear.ai Full time
Overview: BigBear.ai is seeking a DevOps Engineer to support the National Media Exploitation Center. This position will be based out of Bethesda, MD and offers some hybrid/remote work flexibility. Candidates must possess and maintain an active TS/SCI clearance and be eligible for a CI Poly. Duties include examining customer problems and implementing...
-
Principal Machine Learning Engineer
Found in: Jooble US O C2 - 5 days ago
Bethesda, MD, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate all of the essential elements...
-
Database Reliability Engineer
6 days ago
Bethesda, United States Teaching Strategies, LLC Full time
Be a Part of our Team! Join a working family that is dedicated to the mission of the work we do! Teaching Strategies is an innovative edtech organization focused on connecting teachers, children, and families. As front runners in the early childhood education market, we build dynamic, top-quality digital products that integrate...
-
Bethesda, United States Restore360 Full time
Overview: Black Canyon Consulting (BCC) is searching for a Technical Lead - Program Principal Engineer for SeqPlus to support our work for the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM), an institute of the National Institutes of Health. This opportunity is full time and onsite at the NIH-NCBI in...
-
Principal Readiness Consultant
Found in: Jooble US O C2 - 2 weeks ago
Columbia, MD, United States Bigbear.ai Full time
Overview: BigBear.ai is seeking a Principal Readiness Consultant to assist our customer in assessing organizational ability to achieve mission outcomes. This position will be based out of Ft. Meade and will report on-site daily (no remote flexibility). Candidates for this position will provide independent analytic advice and contribute directly to...
-
Principal / Sr. Principal Logistics Engineer with Security Clearance
Found in: Careerbuilder One Red US C2 - 2 weeks ago
Linthicum Heights, MD, United States Associates Systems LLC Full time
The Advanced Technology Lab of Northrop Grumman Mission Systems is seeking Logistics Engineers. The qualified candidate will become part of Northrop Grumman's Transformational Technology Business Area. Transformational Technology is focused on developing state-of-the-art next generation superconducting electronics (SCE) and advanced computing...
-
1457 Site Reliability Engineer Skill Level 3 with Security Clearance
Found in: Careerbuilder One Red US C2 - 2 weeks ago
Laurel, MD, United States Momentum Engineering Full time
This opening is for a Site Reliability Engineer 3 with development and system administration experience on large systems who can use that experience to formulate and implement automation solutions to support our monitoring and system administration teams in tasks that are either risky to the system, prone to mistakes, labor intensive, time consuming...