Software Engineer, Site Reliability

3 weeks ago


San Francisco CA, United States Sierra Business Solution Full time

Software Engineer, Site Reliability (SRE) at Sierra Business Solution.

About Us

We are an in-person company based in San Francisco with growing offices in Atlanta, New York, and London, building a platform that helps businesses create better, more human customer experiences with AI. Our core values are Trust, Customer Obsession, Craftsmanship, Intensity, and Family. Company founders: Bret Taylor, former Salesforce and Facebook executive; Clay Bavor, former Google Labs leader.

What You'll Do

  • Own Sierra's observability stack (monitoring, alerting, logging, and tracing) to give engineers clear visibility into system health and performance.
  • Partner with product and platform engineers to design reliable, scalable systems from day one.
  • Design and implement scalable, secure cloud infrastructure (AWS) using Terraform and modern DevOps tooling.
  • Improve reliability and scalability of LLM deployments, ensuring robust, cost-effective operation.
  • Lead improvements to deployment pipelines, CI/CD tooling, and incident-management processes.
  • Define the foundation of SRE practices at Sierra, influencing culture, tooling, and best practices.

What You'll Bring

  • 5+ years of hands-on experience in Site Reliability or infrastructure engineering for complex SaaS or cloud-based systems.
  • Experience designing for availability, scalability, and reliability at both infrastructure and application layers.
  • Deep experience with Terraform, AWS services, container orchestration, and cloud networking (IAM, VPC).
  • Strong background in observability systems (Prometheus, Grafana, Datadog, or similar).
  • Experience working with enterprise customers and familiarity with compliance and networking needs.
  • Comfortable working in fast-moving environments and collaborating across teams.
  • Degree in Computer Science or equivalent professional experience.

Even Better

  • Experience with LLM infrastructure: optimizing inference, managing fine-tuned models, or large-scale deployment.
  • Early-stage startup experience defining SRE culture and tooling from scratch.
  • Familiarity with incident-management automation or self-healing infrastructure patterns.

Benefits

  • Unlimited Paid Time Off
  • Medical, Dental, and Vision benefits
  • Life Insurance and Disability Benefits
  • 401(k) retirement plan with company match
  • Parental Leave and fertility benefits via Carrot
  • Lunch, snacks, coffee, and discretionary stipend
  • Equity plans per applicable policies

Equality & Diversity

We actively encourage applicants of all backgrounds to apply. We strive to evaluate all applicants consistently without regard to race, color, religion, gender, sexual orientation, age, disability, veteran status, or any other protected characteristic.


  • Software Engineering

    3 weeks ago


    San Francisco, CA, United States Jobright.ai Full time

    Site Reliability Engineer - Inference at Jobright.ai. Jobright is an AI-powered career platform that helps job seekers discover the top...


  • San Francisco, United States DevOps projects Full time

    Software Engineer, Site Reliability. Why Harvey: Harvey is a secure AI platform for legal and professional services that augments productivity and automates complex workflows. Harvey uses algorithms with...


  • San Francisco, CA, United States Google Full time

    Software Engineering Manager II, Site Reliability Engineering. Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Sunnyvale,...


  • San Francisco, United States Alchemy Full time

    Site Reliability Engineer at Alchemy. Our Mission: Our mission is to bring web3 to a billion people by providing builders with the tools they need to build exceptional onchain products. Alchemy is the only complete developer platform that offers the powerful APIs, SDKs,...


  • San Francisco, CA, United States Google Full time

    Overview: Software Engineering Manager II, Site Reliability Engineering at Google. Lead a team and provide technical leadership on key projects, empowering and developing teams to ensure reliability and scalability of Google's services. Responsibilities: Lead a team of Software/Systems Engineers on projects for users and be directly responsible for...


  • San Francisco, CA, United States Writemed Full time

    About Us Would you like to join one of the fastest-growing organizations with a goal of using the latest AI, GenAI, LLM, Cloud, and Digital Technologies to advance drug development and improve patient care pathways? WriteMed.AI helps Biopharma and Life Sciences companies reduce time to write medical publications and regulatory paperwork. Submit your CV and...


  • San Francisco, CA, United States Writemed Full time

    About Us Would you like to join one of the fastest-growing organizations with a goal of using the latest AI, GenAI, LLM, Cloud, and Digital Technologies to advance drug development and improve patient care pathways? WriteMed.AI helps Biopharma and Life Sciences companies reduce time to write medical publications and regulatory paperwork. Want to make an...


  • San Francisco, CA, United States Sierra Full time

    About us: At Sierra, we’re creating a platform to help businesses build better, more human customer experiences with AI. We are primarily an in-person company based in San Francisco, with growing offices in Atlanta, New York, and London. We are guided by a...

