Site Reliability Engineer

3 days ago


San Francisco, California, United States Swish Analytics Full time
About the Role

We are seeking an experienced Site Reliability Engineer to join our DevSecOps and Infrastructure team in Europe. As a key member of our team, you will be responsible for supporting our enterprise infrastructure during non-US hours and assisting in optimizing incident response, observability, and workload resiliency.

Responsibilities
  • Support production systems and help triage issues during live sporting events
  • Monitor systems and respond to incidents to maintain SLOs/SLAs; review and follow up on production incidents
  • Write and review code, develop documentation, and debug live problems on complex distributed systems
  • Optimize and facilitate incident response; conduct root cause analysis and blameless retrospectives
  • Work closely with technical teams to implement, optimize, maintain, scale, and debug workloads on Kubernetes, using CI/CD, automation tools, and scripting languages to deliver tools and software that improve the reliability and scalability of services
Qualifications
  • 3+ years of experience in an SRE-leaning DevOps role or a full SRE role
  • 3+ years building CI/CD pipelines with GitHub Actions, GitLab CI/CD, or similar
  • Extensive experience with Kubernetes
  • Experience managing customer-facing systems in a 24/7 environment, including escalations
  • Experience with triage and escalation policies/protocols
  • Strong communication and documentation skills
  • Comfortable with scripting languages like Bash, Python, or similar
Preferred Qualifications
  • Networking and routing experience
  • Terraform in AWS to support global-scale services
  • Improving observability in an engineering organization
  • Past experience with PagerDuty or similar tools

Swish Analytics is an Equal Opportunity Employer. All candidates who meet the qualifications will be considered without regard to race, color, religion, sex, national origin, age, disability, sexual orientation, pregnancy status, genetic, military, veteran status, marital status, or any other characteristic protected by law.



  • San Francisco, California, United States Wasmer Full time

    About the Role: We are seeking an exceptional Site Reliability Engineer to join our team at Wasmer. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our Edge computing platform. Key Responsibilities: Design, implement, and maintain scalable and reliable infrastructure solutions for our Edge computing...


  • San Francisco, California, United States Instabase Full time

    About Instabase: At Instabase, we're passionate about democratizing access to cutting-edge AI innovation to enable any organization to solve previously unsolvable unstructured data problems in their industry. With customers representing some of the largest and most complex organizations in the world, and investors like Greylock, Andreessen Horowitz, and Index...


  • San Francisco, California, United States Diverse Lynx Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a key member of our organization, you will play a critical role in ensuring the reliability and efficiency of our digital infrastructure. Key Responsibilities: Design and implement reliable digital infrastructure solutions. Collaborate with...


  • San Francisco, California, United States Diverse Lynx Full time

    Role Overview: We are seeking a highly skilled Reliability Engineer to join our team at Diverse Lynx LLC. As a key member of our organization, you will be responsible for ensuring the reliability and resilience of our digital systems. Key Responsibilities: Design and implement reliable digital systems and processes. Collaborate with cross-functional teams to...


  • San Francisco, California, United States GRNET Full time

    About GRNET: GRNET is a leading provider of advanced network and cloud computing services to academic and research institutions, educational entities, and public sector agencies in Greece. Our Approach: We adopt a Site Reliability Engineering approach to ensure the reliability, scalability, and efficiency of our infrastructure. Our team is divided into three...


  • San Francisco, California, United States Diverse Lynx Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer with 7+ years of experience in Java SRE to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our systems. Key Responsibilities: Design and implement monitoring and alerting systems to ensure prompt...


  • San Francisco, California, United States SpeedCast Full time

    Site Reliability Engineer at Speedcast: At Speedcast, we're looking for a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a key role in ensuring the reliability and performance of our communication products. You will collaborate with our team of professionals to innovate and enhance our...


  • San Francisco, California, United States Instabase Full time

    About Instabase: At Instabase, we're passionate about harnessing the power of AI to democratize access to cutting-edge innovation and empower organizations to solve complex unstructured data problems. With a global presence and a customer-centric approach, we're committed to delivering top-tier solutions that drive business success. Job Summary: We're seeking a...


  • San Francisco, California, United States SpeedCast Full time

    Job Title: Site Reliability Engineer. At Speedcast, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our communication products. Key Responsibilities: Analyze and design continuous integration/continuous delivery...


  • San Francisco, California, United States Xero Full time

    About the Role: Xero is a leading cloud-based accounting platform that empowers small businesses and their advisors to thrive. As a Site Reliability Engineer on our Reliability Enablement team, you'll play a critical role in ensuring the reliability and performance of our systems. Key Responsibilities: Investigate operational surprises and support teams in...


  • San Francisco, California, United States Autodesk Full time

    Responsibilities: As a Senior Site Reliability Engineer at Autodesk, you will be responsible for leading the development and maintenance of robust cloud infrastructure to support millions of daily users. You will automate processes to improve system reliability and introduce best practices in continuous integration and deployment. You will also lead...


  • San Francisco, California, United States Best Secret Full time

    About BestSecretGroup: We are a leading European members-only online destination for premium and luxury off-price fashion, driven by a tech-focused mindset and strong commitment to sustainability. With a rich history and a major tech transformation underway, BestSecret is scaling at pace to become one of Europe's most exciting ecommerce players. We are proud to...


  • San Francisco, California, United States Pager Full time

    About the Role: PagerDuty is seeking a highly skilled Senior Site Reliability Engineer to join our SRE-Platform team. As a key contributor, you will play a crucial role in building, maintaining, and scaling our Kubernetes platform. Key Responsibilities: Maintain the overall health of the platform, including triaging and troubleshooting production issues,...


  • San Francisco, California, United States AutoRABIT Holding Inc. Full time

    Job Overview. About AutoRABIT: AutoRABIT is a rapidly expanding SaaS company recognized as the premier provider of Salesforce DevSecOps solutions tailored for regulated sectors such as finance, insurance, and healthcare. Our platform empowers developers to streamline their workflows, enhancing productivity and accelerating release cycles while adhering to...


  • San Francisco, California, United States DataRobot Full time

    Job Title: Director of Site Reliability Engineering. DataRobot is the leader in Value-Driven AI, a unique and collaborative approach to generative and predictive AI that combines an open platform, deep expertise, and broad use-case experience to improve how organizations run, grow, and optimize their business. The DataRobot AI Platform is the only complete AI...


  • San Francisco, California, United States PicnicHealth Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at PicnicHealth. As a key member of our engineering team, you will be responsible for ensuring the reliability, efficiency, and architecture of our cloud, developer, and security operations. As a Senior SRE, you will take the lead in identifying and resolving...


  • San Francisco, California, United States GRNET Full time

    About GRNET: GRNET is a leading entity in the Greek Government, providing advanced network and cloud computing services to academic and research institutions, educational entities, and public sector agencies. The company offers a wide range of services, including: Unified Portal for Government Digital Services; Cloud Services for Research and Education; Networking...


  • San Francisco, California, United States Instabase Full time

    About Instabase: Instabase is a cutting-edge technology company that specializes in democratizing access to AI innovation. Our mission is to empower organizations to solve complex unstructured data problems and unlock new business opportunities. Our Team: We are a team of passionate and innovative professionals who are dedicated to building scalable and reliable...


  • San Francisco, California, United States Gusto Full time

    About Gusto: Gusto is a pioneering online platform that empowers small businesses to manage their teams effectively. Our comprehensive suite of services includes full-service payroll, health insurance, 401(k)s, expert HR, and team management tools. With offices in Denver, San Francisco, and New York, we serve over 300,000 businesses nationwide. Our Mission: We...


  • San Francisco, California, United States Gusto Full time

    About Gusto: Gusto is a leading provider of modern, online people platforms that empower small businesses to manage their teams effectively. Our comprehensive suite of tools includes full-service payroll, health insurance, 401(k)s, expert HR, and team management solutions. With offices in Denver, San Francisco, and New York, we serve over 300,000 businesses...