Site Reliability Engineer

3 days ago


San Francisco, California, United States Infactory Full time
About Infactory

We're a cutting-edge technology company dedicated to revolutionizing the field of artificial intelligence through fact-based innovation. Our mission is to empower businesses with accurate and trustworthy AI solutions.

Our Values

  • Facts First: We prioritize accuracy and definitiveness in everything we build.
  • Trust Builders: Our technology is designed to establish and maintain trust with our customers.
  • AI Visionaries: We're not just following the AI trend; we're shaping its future.
  • Product Excellence: We strive to create user-friendly, reliable, and valuable products.

Job Responsibilities

  • Cloud Capacity Management: You'll oversee our cloud infrastructure, ensuring optimal performance and cost-effectiveness.
  • Reliability Champion: You'll be responsible for maintaining the reliability and availability of our services, setting up monitoring, creating alerts, and implementing automation to ensure 24/7 uptime.
  • Deployment Collaboration: You'll work closely with our CTO and VP of AI to successfully deploy new features from development to production, ensuring seamless transitions.
  • Performance Optimization: You'll analyze system performance and implement improvements, fine-tuning databases, network configurations, and application settings to maximize efficiency.
  • Incident Management: When incidents occur, you'll lead the resolution process, conducting post-mortems to ensure we learn from our experiences.
  • Documentation and Knowledge Sharing: You'll document our systems, processes, and incidents, sharing expertise with the team to foster a culture of knowledge and collaboration.

Requirements

  • 5+ years of experience in a development-focused SRE role; experience in fast-paced, scale-up environments is a plus.
  • Experience with cloud platforms (AWS, Google Cloud, Azure).
  • Experience with monitoring and observability tools.
  • Knowledge of infrastructure as code and configuration management tools.
  • Familiarity with container technologies (Docker, Kubernetes).
  • Knowledge of database systems and their optimization.
  • Strong problem-solving skills and the ability to remain calm under pressure.

Compensation and Benefits

  • San Francisco Bay Area/Hybrid preferred, remote considered.
  • $120k-$150k plus equity in an early-stage startup.
  • Competitive benefits.
  • 20 days PTO + paid holidays + unlimited sick leave.


  • San Francisco, California, United States Diverse Lynx Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a key member of our organization, you will play a critical role in ensuring the reliability and efficiency of our digital infrastructure. Key Responsibilities: Design and implement reliable digital infrastructure solutions. Collaborate with...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS company recognized as the premier provider of Salesforce DevSecOps solutions tailored for regulated sectors such as finance, insurance, and healthcare. Our platform empowers developers to streamline their workflows, enhancing productivity and accelerating release cycles while adhering to...


  • San Francisco, California, United States Instabase Full time

    About Instabase: Instabase is a cutting-edge technology company that specializes in democratizing access to AI innovation. Our mission is to empower organizations to solve complex unstructured data problems and unlock new business opportunities. Our Team: We are a team of passionate and innovative professionals who are dedicated to building scalable and reliable...


  • San Jose, California, United States Adobe Full time

    Adobe's Reliability Engineering team is looking for a Site Reliability Engineer (SRE) to help build and operate services like Adobe Sign. Adobe Sign is the fastest and easiest way to get contracts signed and filed. You have a track record as a site reliability engineer in large-scale SaaS businesses, and a strong...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS provider and a prominent leader in the Salesforce DevSecOps platform tailored for regulated sectors such as finance, insurance, and healthcare. Our solutions empower developers to streamline their daily operations, enhancing productivity and accelerating release cycles while adhering to...


  • San Francisco, California, United States Outdefine Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Outdefine. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our ecommerce systems. Key Responsibilities: Design, implement, and maintain scalable and highly...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS company recognized as the premier provider of Salesforce DevSecOps solutions tailored for regulated sectors such as finance, insurance, and healthcare. Our offerings empower developers to streamline their daily operations, enhancing productivity and accelerating release cycles while adhering...


  • San Francisco, California, United States Orb Full time

    About Orb: Orb is a pioneering company that provides cutting-edge infrastructure solutions to businesses, empowering them to unlock their revenue potential. Our mission is to revolutionize the way companies approach billing and invoicing, making it a seamless and efficient process. Role & Impact: As a Site Reliability Engineer at Orb, you will play a critical...


  • San Francisco, California, United States Forsyth Barnes Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Forsyth Barnes. As a key member of our infrastructure team, you will be responsible for ensuring the reliability and availability of our cloud-based services. Key Responsibilities: Cloud Capacity Management: Monitor and optimize our cloud capacity to ensure...


  • San Francisco, California, United States Aircon Engineering Inc Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Aircon Engineering Inc. As a Site Reliability Engineer, you will be responsible for designing, building, and operating large-scale cloud infrastructure platforms that power our business applications. Key Responsibilities: Design and implement highly available and...


  • San Francisco, California, United States Cisco Full time

    Principal Site Reliability Engineer, Datastores (ThousandEyes). Location: San Francisco, California, US. Area of Interest: Engineer - Software. Compensation Range: 219700 USD USD. Job Type: Professional. Technology Interest: Networking. Job ID: 1422674. Who We Are: The name ThousandEyes was born from two big ideas: the power to see things not ordinarily possible and the ability to...


  • San Francisco, California, United States Crusoe Full time

    About This Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Crusoe Energy Systems. As a Site Reliability Engineer, you will play a pivotal role in ensuring the reliability and performance of our infrastructure. Key Responsibilities: Collaborate with the SRE team to detect, analyze, and prevent issues to maintain high Service...


  • San Francisco, California, United States Operant AI Full time

    Job Overview: Senior Site Reliability Engineer. As the inaugural SRE within our organization, we are looking for an individual to establish Operant's SRE strategy and operations aimed at ensuring the resilience and security of our platforms and services. If you are enthusiastic about the prospect of being an early engineer at a startup ready to revolutionize...


  • San Francisco, California, United States Dice Full time

    Company Overview: Dice is recognized as a premier career platform for technology professionals at all levels. We are collaborating with ZEN3 INFOSOLUTIONS AMERICA INC to find a suitable candidate for an important role. Position: Site Reliability Engineer with Oracle Applications Expertise. Location: Remote. Contract Duration: Long-Term Engagement. Role Summary: We are...


  • San Francisco, California, United States Centene Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Centene. As a key member of our technology organization, you will play a critical role in ensuring the reliability, performance, and security of our platform infrastructure. Key Responsibilities: Lead Projects and Initiatives: Help lead projects focused on...


  • San Francisco, California, United States RevenueCat Full time

    About RevenueCat: We are a leading provider of mobile subscription infrastructure, handling over $3 billion in in-app purchases annually across thousands of apps. Our mission is to build a standard for mobile subscription infrastructure, and we're looking for a Senior Site Reliability Engineer to help us achieve this goal. About the Role: We're seeking a highly...


  • San Francisco, California, United States Outdefine Full time

    About the Job: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Outdefine. As a key member of our Infrastructure team, you will be responsible for ensuring the reliability and scalability of our blockchain-based systems. Key Responsibilities: Run internal Chainlink and Blockchain nodes to ensure seamless connectivity and data...


  • San Francisco, California, United States Cisco Full time

    Position Overview: We are seeking experienced engineers to become part of our Federal region's Site Reliability Engineering (SRE) team at Cisco, a leader in Internet and cloud intelligence solutions. In this role, you will be instrumental in designing and sustaining the infrastructure and systems vital for the operations within the Federal sector. Your...


  • San Francisco, California, United States Okta, Inc. Full time

    Senior Site Reliability Engineer, Security. About Okta: Okta is recognized as The World's Identity Company, dedicated to empowering individuals to securely access any technology across various devices and applications. Our Workforce and Customer Identity Clouds facilitate secure yet adaptable access, authentication, and automation, fundamentally transforming...