Site Reliability Engineer
1 day ago
We're seeking a skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in building, operating, and maintaining high-performance, scalable, and reliable services for our production infrastructure.
Key Responsibilities
- Maintain the availability of cloud and physical Linux servers that power the Palantir platform in air-gapped production environments.
- Design, deploy, and operate infrastructure to support customer and product requirements via modern orchestration and monitoring platforms.
- Collaborate closely with product teams on requirements and SLOs for deploying software into air-gapped environments.
- Identify, troubleshoot, and solve network and systems issues.
- Script to automate away routine operational tasks.
- Active US Security clearance, or eligibility and willingness to obtain a US Security clearance.
- Confidence in troubleshooting complex systems issues independently using stack traces and observability and systems tools.
- Comfort with managing large-scale production systems and technologies with configuration management, load balancing, monitoring and alerting infrastructure, and container orchestration.
- Demonstrated ability to continuously learn and work independently, making decisions with minimal supervision while working in secure facilities.
- DoD 8570 IAT Level II or greater (CISSP, Sec+), Unix/Linux Computing Environment (e.g., Linux+, RHCE).
- Proficiency with scripting in Python or Go is a plus.
- 5+ years of experience with Linux system administration (RHEL or equivalent preferred).
- Experience with cloud-based hosting platforms like AWS, Azure, or GCP and/or experience with hardware-based environments.
- Familiarity with monitoring systems using tools like Prometheus and writing health checks.
We offer a comprehensive benefits package, including medical, dental, and vision insurance; life and disability coverage; paid leave for new parents and emergency back-up care for all parents; family planning support, including fertility, adoption, and surrogacy assistance; a stipend to help with expenses that come with a new child; commuter benefits; relocation assistance; unlimited paid time off; and 2 weeks of paid time off built into the end of each year.
Salary
The estimated salary range for this position is $125,000 - $185,000/year. Total compensation for this position may also include Restricted Stock Units, a sign-on bonus, and other potential future incentives.
-
Site Reliability Engineer
3 days ago
Washington, Washington, D.C., United States MetroStar Systems Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at MetroStar Systems. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and scalability of our systems.
Key Responsibilities:
Monitor and analyze platform and containerized applications to identify...
-
Site Reliability Engineer
2 days ago
Washington, Washington, D.C., United States Alldus Full time
Site Reliability Engineer
Alldus is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems.
Key Responsibilities:
Perform root cause analysis to identify and resolve system or application issues in a timely and...
-
Site Reliability Engineer
2 days ago
Washington, Washington, D.C., United States Tik Tok Full time
About the Role
TikTok is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our software systems.
Responsibilities
Work with infrastructure, product, and platform engineering teams to operate and deploy software platforms, capacity planning,...
-
Site Reliability Engineer
7 hours ago
Washington, Washington, D.C., United States CloudFit Software Full time
Job Title: Site Reliability Engineer
CloudFit Software is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the quality, performance, and reliability of our CloudFit Managed Applications and Services systems.
Key Responsibilities:
Collaborate with cross-functional teams...
-
Site Reliability Engineer
2 days ago
Washington, Washington, D.C., United States Cinder LLC Full time
About Cinder LLC
Cinder LLC provides a cutting-edge investigation platform to protect the internet. Our software helps Trust and Safety teams at the world's most influential companies innovate and adapt quickly to emerging threats.
Job Title: Site Reliability Engineer
We're seeking an experienced Site Reliability Engineer to lead the development and deployment...
-
Site Reliability Engineer
3 days ago
Washington, Washington, D.C., United States Palantir Technologies Full time
Job Summary
Palantir Technologies is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications.
Key Responsibilities
Collaborate with cross-functional teams...
-
Site Reliability Engineer
1 day ago
Washington, Washington, D.C., United States MetroStar Systems Full time
Transforming Government Services with Reliability and Performance
As a Site Reliability Engineer at MetroStar Systems, you will play a pivotal role in driving improvements in observability, performance, and reliability across high-level government platforms. Your expertise will be instrumental in making a lasting impact.
Key Responsibilities:
Monitor and...
-
Site Reliability Engineer
3 days ago
Washington, Washington, D.C., United States MetroStar Corporation Full time
MetroStar Corporation is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our organization, you will play a critical role in driving improvements in observability, performance, and reliability across our systems.
Key Responsibilities:
- Monitor and analyze platform and containerized applications to identify...
-
Site Reliability Engineer
1 week ago
Washington, Washington, D.C., United States MetroStar Systems Full time
Transforming Government Services with Reliability and Performance
As a Site Reliability Engineer at MetroStar Systems, you will play a pivotal role in driving improvements in observability, performance, and reliability across high-level government platforms. Your expertise will be instrumental in making a lasting impact.
Key Responsibilities:
Monitor and...
-
Principal Site Reliability Engineer
3 days ago
Washington, Washington, D.C., United States Kansas Action for Children Full time
Transforming System Reliability
We're seeking a seasoned Principal Site Reliability Engineer to spearhead the improvement of system reliability and resilience at T-Mobile USA, Inc. in Overland Park, Kansas, United States.
About the Role
As a key member of our team, you'll apply your expertise to minimize manual effort and prevent operational incidents. Your...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Mount Indie Full time
About the Role
Mount Indie is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for ensuring the reliability, performance, and scalability of our cloud-based systems.
Key Responsibilities
Monitor and analyze platform and containerized applications to identify performance and...
-
Site Reliability Engineer
2 days ago
Washington, Washington, D.C., United States Palantir Technologies Full time
Job Title: Site Reliability Engineer
We are seeking a skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications.
Key Responsibilities:
Collaborate with cross-functional teams to design,...
-
Site Reliability Manager
6 days ago
Washington, Washington, D.C., United States Karsun Solutions Full time
Job Title: Site Reliability Manager
We are seeking a highly skilled and experienced Site Reliability Manager to join our team at Karsun Solutions. As a key member of our organization, you will be responsible for ensuring the reliability, scalability, and performance of our systems and services.
Key Responsibilities:
Lead a team of engineers in designing,...
-
Site Reliability Engineer
3 days ago
Washington, Washington, D.C., United States Splunk Full time
About the Role
Splunk is seeking a highly skilled Site Reliability Engineer to join our Cloud Traffic Engineering team. As a Site Reliability Engineer, you will play a critical role in ensuring the availability, performance, efficiency, and security of our Cloud SaaS platform.
Key Responsibilities
Develop and deploy software to improve the scalability and...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Tik Tok Full time
About TikTok U.S. Data Security
TikTok U.S. Data Security is a subsidiary of TikTok in the U.S., dedicated to providing heightened focus and governance to our data protection policies and content assurance protocols. Our mission is to keep U.S. users safe, allowing millions of Americans to continue using TikTok to learn, earn, express themselves creatively,...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Cape Full time
About Cape
Cape is a pioneering company in the field of privacy-centric telecommunications. Founded in 2022 by a team of experts from Palantir and Anduril, we aim to revolutionize the way people think about data privacy and national security.
The Role
We are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Radius Networks Inc Full time
About Radius Networks Inc
Radius Networks Inc is the global leader in location technology solutions, powering some of the world's largest restaurant, grocery, retail, and hospitality brands with its Flybuy platform.
Job Summary
We're seeking a talented Site Reliability Engineer to join our team and contribute to the development of our cloud-based location and...
-
Reliability Engineer
2 weeks ago
Washington, Washington, D.C., United States MetroStar Systems Full time
Job Summary:
As a Site Reliability Engineer at MetroStar Systems, you will play a critical role in driving improvements in observability, performance, and reliability across our high-impact government projects.
Key Responsibilities:
Monitor and analyze platform and containerized applications to identify performance and availability risks and issues. Collaborate...
-
Site Reliability Engineer
6 days ago
Washington, Washington, D.C., United States Mission Box Solutions, Inc. Full time
Job Summary
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team at Mission Box Solutions, Inc. As a key member of our team, you will play a vital role in driving improvements in observability, performance, and reliability across the federal government.
Responsibilities
Monitor and optimize platform and containerized applications for...
-
Site Reliability Engineer
7 days ago
Washington, Washington, D.C., United States Palantir Technologies Full time
About the Role
We are seeking a highly skilled Site Reliability Engineer to join our Security Infrastructure team at Palantir Technologies. As a key member of our team, you will be responsible for architecting and operating multiple, geographically distributed Kubernetes clusters supporting our mission software. You will also be responsible for operating...