Lead Site Reliability Engineer
1 week ago
At Tenth Mountain, we're committed to helping veterans transition into rewarding civilian careers. As a Lead Site Reliability Engineer, you'll play a critical role in ensuring the reliability and availability of our Payments infrastructure.
Key Responsibilities:
- Provide 24/5 round-the-clock support for the Payments team, covering multiple regions.
- Manage and resolve incidents related to the Payments infrastructure, ensuring minimal downtime.
- Participate in weekend on-call rotations, acknowledging incidents within 10 minutes and responding appropriately.
- Maintain and manage the Payments IKP infrastructure, ensuring high availability and performance.
- Implement and enhance monitoring, alerting, and incident response processes.
- Automate manual tasks to boost efficiency and reduce human error.
Qualifications:
- Proven experience as a Senior DevOps Engineer or Site Reliability Engineer in an Agile environment.
- Strong background in Linux/Unix systems and Shell Scripting (BASH).
- Experience with Kubernetes, preferably GKE on-premises.
- Proficiency in programming with one or more high-level languages, such as Python or Go.
- Experience with building and managing automated CI/CD pipelines and related tools (GitLab CI/CD, Jenkins).
- Familiarity with VMware and other virtualization platform technologies.
- Knowledge of Istio and Anthos Service Mesh.
- Familiarity with monitoring and logging tools (Splunk, Prometheus, Datadog, Kiali).
- Kubernetes certification is a plus.
- Experience with load balancers, reverse proxies (Nginx Controller/Seesaw), and containerization technologies (Docker).
- Exposure to infrastructure-as-code tools (Terraform) and technologies like OpenTelemetry, OpenMetrics, and Kafka.
We believe in your potential and are committed to helping you transition smoothly into a rewarding civilian career. Join us and be part of a company that values your skills, experience, and dedication.
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Alchemy Full time
About the Role
Alchemy is seeking a highly skilled Site Reliability Engineer to join our Infrastructure team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our globally used developer platform.
Key Responsibilities
Design, deploy, and continuously improve the infrastructure supporting...
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States Apollo Solutions Full time
Site Reliability Engineer
Apollo Solutions is partnering with a pioneering artificial intelligence business that is revolutionizing the use of AI/ML in gaming and security.
The company is working closely with government contracts and gaming console companies and is seeking a Site Reliability Engineer to join their growing team.
The Site Reliability Engineer...
-
Site Reliability Engineer
2 weeks ago
New York, New York, United States Cynet Systems Full time
Job Title: Site Reliability Engineer
Job Summary:
Cynet Systems is seeking a highly skilled Site Reliability Engineer to lead the development and implementation of geospatial application performance monitoring strategies. The ideal candidate will have a strong background in Site Reliability Engineering (SRE) and proven experience in using Dynatrace for...
-
Site Reliability Engineer
1 month ago
New York, New York, United States Braze Full time
About the Role
We're seeking a highly skilled Site Reliability Engineer to join our team at Braze. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our internal-facing services and platforms.
Key Responsibilities
Partner with Braze's engineering teams to architect products that effectively utilize...
-
Site Reliability Engineering Manager
4 weeks ago
New York, New York, United States Intuit Inc Full time
Job Title: Site Reliability Engineering Manager
At Intuit Inc, we're seeking an experienced Site Reliability Engineering Manager to lead our Site Reliability Engineering Team. As a key member of our Engineering organization, you will be responsible for ensuring the reliability, scalability, and performance of our application used by both internal engineers...
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States Alchemy Full time
About the Role
Alchemy is seeking a highly skilled Site Reliability Engineer to join our Infrastructure team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our globally used developer platform.
Our mission is to empower builders with the tools they need to create exceptional on-chain products....
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States Fourier Ltd Full time
Site Reliability Engineer
Fourier Ltd is seeking a skilled Site Reliability Engineer to join our technical operations team. As a Site Reliability Engineer, you will play a critical role in ensuring the superior performance and availability of our production applications throughout the development cycle.
Key Responsibilities:
Configure and manage multiple...
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States Tik Tok Full time
About TikTok U.S. Data Security
TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to millions of users worldwide.
Our mission is to provide a secure and reliable platform for users to express themselves, learn, and be entertained.
Role Overview
We are seeking a skilled Site Reliability Engineer to join our U.S....
-
Site Reliability Engineering Manager
2 weeks ago
New York, New York, United States Intuit Inc Full time
Job Overview
Mailchimp is a leading marketing platform for small businesses, empowering millions of customers worldwide to build their brands and grow their companies with a suite of marketing automation, multichannel campaigns, CRM, and analytics tools.
Job Description
We are seeking an experienced Engineering Leader to lead our Site Reliability Engineering...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Tik Tok Full time
About TikTok U.S. Data Security
TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to millions of users worldwide.
Our mission is to provide a secure and reliable platform for users to express themselves, learn, and be entertained.
Site Reliability Engineering at TikTok
As a Site Reliability Engineer at TikTok, you...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Lorven Technologies Full time
Job Title: Site Reliability Engineer
Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
Design, implement, and maintain scalable and reliable cloud...
-
Site Reliability Engineer
2 weeks ago
New York, New York, United States CapB InfoteK Full time
Job Title: Site Reliability Engineer
About the Role:
At CapB InfoteK, we're seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
• Develop and build low-level component...
-
Site Reliability Engineer
2 weeks ago
New York, New York, United States Lorven Technologies Full time
Job Title: Site Reliability Engineer
Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
Design, implement, and maintain scalable and highly available...
-
Site Reliability Engineer
2 weeks ago
New York, New York, United States Lorven Technologies Full time
Job Title: Site Reliability Engineer
Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
Key Responsibilities:
Design, implement, and maintain infrastructure automation...
-
Site Reliability Engineer
1 month ago
New York, New York, United States FLOAT LLC Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our team at Float LLC. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud infrastructure, enabling our engineering teams to focus on delivering high-quality software to our customers.
Key...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Unreal Gigs Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at Unreal Gigs. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our systems.
Key Responsibilities:
Design, implement, and maintain scalable infrastructure solutions to support...
-
Senior IT Site Reliability Engineer
4 weeks ago
New York, New York, United States Hudson River Trading Full time
Job Title: Senior IT Site Reliability Engineer
Hudson River Trading (HRT) is a leading financial services company that utilizes a scientific approach to trading. We are seeking a highly skilled Senior IT Site Reliability Engineer to join our team.
Job Summary:
The Senior IT Site Reliability Engineer will be responsible for ensuring the availability and...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Tik Tok Full time
About the Role
TikTok is seeking a highly skilled Site Reliability Engineer to join our AML team. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining highly available, scalable, and fault-tolerant systems.
Responsibilities
Design and implement large-scale systems to ensure high availability and scalability.
Monitor...
-
Site Reliability Engineer
1 month ago
New York, New York, United States Radar Full time
About the Role
We're seeking a skilled Site Reliability Engineer to join our team at Radar, a leading provider of location infrastructure for every product and service. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and performance of our production infrastructure.
Key Responsibilities
Design, implement, and...
-
Site Reliability Engineer
1 week ago
New York, New York, United States Insight Global Full time
Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a Site Reliability Engineer, you will be responsible for ensuring the uptime and reliability of our production and non-production environments. You will work closely with our development teams to build and maintain the infrastructure and applications...