Site Reliability Engineer

3 weeks ago


New York, New York, United States Palantir Technologies Full time
About the Role

We're seeking a highly skilled Site Reliability Engineer to join our team at Palantir Technologies. As a Site Reliability Engineer, you will play a critical role in designing, deploying, and operating high-performance, scalable, and reliable services for our production infrastructure.

Key Responsibilities
  • Maintain the availability of cloud and physical Linux servers that power the Palantir platform in air-gapped production environments.
  • Design, deploy, and operate infrastructure to support customer and product requirements via modern orchestration and monitoring platforms.
  • Collaborate closely with product teams on requirements and SLOs for deploying software into air-gapped environments.
  • Identify, troubleshoot, and solve network and systems issues.
  • Write scripts to automate routine operational tasks.
What We Value
  • Active U.S. security clearance, or eligibility and willingness to obtain one.
  • Confidence in troubleshooting complex systems issues independently using stack traces, observability tooling, and systems tools.
  • Comfort with managing large-scale production systems and technologies with configuration management, load balancing, monitoring and alerting infrastructure, and container orchestration.
  • Demonstrated ability to continuously learn and work independently, making decisions with minimal supervision while working in secure facilities.
  • Experience with containers (Docker/Podman) and orchestration (OpenShift/Kubernetes) at scale is a plus.
  • Preferred certifications: DoD 8570 IAT Level II or greater (CISSP, Security+), Unix/Linux Computing Environment (e.g., Linux+, RHCE).
  • Proficiency with scripting in Python or Go is a plus.
Requirements
  • 5+ years of experience with Linux system administration (RHEL or equivalent preferred).
  • Experience with cloud-based hosting platforms like AWS, Azure, or GCP and/or experience with hardware-based environments.
  • Familiarity with monitoring systems using tools like Prometheus, and with writing health checks.

We offer a competitive salary range of $125,000 - $185,000/year, as well as a comprehensive benefits package, including medical, dental, and vision insurance, commuter benefits, relocation assistance, and paid time off.

Palantir Technologies is an Equal Employment Opportunity and Affirmative Action employer, committed to promoting a culture of diversity, equity, and inclusion. We welcome candidates from a wide range of backgrounds, perspectives, and lived experiences to join our team in solving the world's hardest problems.



  • New York, New York, United States CapB InfoteK Full time

    Job Title: Site Reliability Engineer
    About the Role: At CapB InfoteK, we're seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
    Key Responsibilities:
    • Develop and build low-level component...


  • New York, New York, United States Lorven Technologies Full time

    Job Title: Site Reliability Engineer
    Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
    Key Responsibilities: Design, implement, and maintain scalable and highly available...


  • New York, New York, United States Lorven Technologies Full time

    Job Title: Site Reliability Engineer
    Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
    Key Responsibilities: Design, implement, and maintain infrastructure automation...


  • New York, New York, United States Insight Global Full time

    Job Summary
    We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a Site Reliability Engineer, you will be responsible for ensuring the uptime and reliability of our production and non-production environments. You will work closely with our development teams to build and maintain the infrastructure and applications...


  • New York, New York, United States Insight Global Full time

    Job Title: Site Reliability Engineer
    We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a Site Reliability Engineer, you will be responsible for ensuring the uptime and reliability of our production and non-production environments.
    Key Responsibilities: Monitor availability and system health to ensure optimal...


  • New York, New York, United States Cynet Systems Full time

    Job Title: Site Reliability Engineer
    Job Summary: Cynet Systems is seeking a highly skilled Site Reliability Engineer to lead the development and implementation of geospatial application performance monitoring strategies. The ideal candidate will have a strong background in Site Reliability Engineering (SRE) and proven experience in using Dynatrace for...


  • New York, New York, United States Phaxis Full time

    Site Reliability Engineer
    We are seeking an experienced Site Reliability Engineer to join our team at Phaxis. As a Site Reliability Engineer, you will be responsible for designing and building scalable and resilient systems, collaborating with engineering teams to advocate for optimal system use, and managing our centralized development infrastructure.
    Key...


  • New York, New York, United States Diverse Lynx Full time

    Job Title: SRE - Site Reliability Engineer
    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud-based systems.
    Key Responsibilities: Design and implement automated workflows to reduce toil and...


  • New York, New York, United States Apollo Solutions Full time

    Site Reliability Engineer
    Apollo Solutions is partnering with a pioneering artificial intelligence business that is revolutionizing the use of AI/ML in gaming and security. The company is working closely with government contracts and gaming console companies and is seeking a Site Reliability Engineer to join their growing team. The Site Reliability Engineer...


  • New York, New York, United States Grafbase, Inc. Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our Engineering team at Grafbase, Inc. As a Site Reliability Engineer, you will play a crucial role in ensuring the reliability, availability, and performance of our systems and services. You will collaborate with cross-functional teams to design, implement, and maintain...


  • New York, New York, United States Grafbase, Inc. Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our Engineering team at Grafbase, Inc. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our systems and services.
    Key Responsibilities
    Collaborate with cross-functional teams to develop and deploy software...


  • New York, New York, United States Alchemy Full time

    About the Role
    Alchemy is seeking a highly skilled Site Reliability Engineer to join our Infrastructure team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our globally used developer platform. Our mission is to empower builders with the tools they need to create exceptional on-chain products....


  • New York, New York, United States TikTok Full time

    About TikTok U.S. Data Security
    TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to millions of users worldwide. Our mission is to provide a secure and reliable platform for users to express themselves, learn, and be entertained.
    Role Overview
    We are seeking a skilled Site Reliability Engineer to join our U.S....


  • New York, New York, United States City National Bank Full time

    Job Title: Site Reliability Engineer
    We are seeking a highly skilled Site Reliability Engineer to join our team at City National Bank. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and maximum uptime of our systems in the Data Center or Cloud Platform.
    Key Responsibilities: Design and implement solutions...


  • New York, New York, United States Grafbase, Inc. Full time

    About the Role
    We are seeking a highly skilled Site Reliability Engineer to join our Engineering team at Grafbase, Inc. As an SRE, you will play a critical role in ensuring the reliability, availability, and performance of our systems and services.
    Key Responsibilities
    Collaborate with cross-functional teams to ensure software is developed and deployed for...


  • New York, New York, United States Valstro Full time

    About Valstro
    Valstro is a FinTech company that is revolutionizing the trading industry with its cloud-first, next-gen trading solutions. As a people-first company, we prioritize collaboration, motivation, and support to deliver exceptional value to our clients.
    Job Description
    We are seeking a highly skilled Site Reliability Engineer to join our team. As a key...


  • New York, New York, United States Peloton Full time

    About the Role
    Peloton is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our platform.
    Your Daily Impact
    Design and implement automated infrastructure provisioning and deployment processes using Terraform and...


  • New York, New York, United States Lorven Technologies Full time

    Job Title: Site Reliability Engineer
    Lorven Technologies is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure.
    Key Responsibilities: Design and implement infrastructure automation using Ansible...


  • New York, New York, United States Insight Global Full time

    Job Summary
    We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a key member of our infrastructure team, you will be responsible for ensuring the uptime and reliability of our production and non-production environments.
    Key Responsibilities
    Monitor availability and system health to ensure optimal performance
    Design...