Cloud Infrastructure Engineer

1 week ago


San Francisco, California, United States StratITech Services LLC Full time
Job Title: Sr. SRE/Kubernetes Engineer

Location: San Francisco, CA (Hybrid – 2 days in-office)

About the Role:

StratITech Services LLC is seeking a highly skilled Sr. Site Reliability Engineer/Kubernetes Engineer to join our team in San Francisco, CA. This is a full-time position offering competitive pay and stock options. We are only accepting applicants who are US Citizens or US Permanent Residents/Green Card Holders. Applicants must be local to the SF Bay Area; we do not offer relocation.

In this hybrid role, you will be working two days a week in-office, as part of a dynamic team responsible for deploying, managing, optimizing, and upgrading the systems that support innovative software solutions.

You must be excited about working in an interrupt-driven startup environment. The ideal candidate will be passionate about learning new technologies, solving complex problems, and embracing Infrastructure as Code (IaC) to automate infrastructure processes. You will collaborate closely with team members to address architectural challenges and ensure the reliability and efficiency of the client's cloud infrastructure.

Implementation is key in this role, as you'll be directly responsible for turning ideas into reliable and scalable solutions.

Key Responsibilities:

  • Cloud Operations: Apply DevOps principles to provide technical and production operational support for cloud infrastructure serving internal and external customers.
  • Tool Development & CI/CD: Write CI/CD pipelines from scratch and build tools that support internal platforms, improving stability, reliability, and efficiency.
  • Feature Flags & Modifications: Implement and manage feature flags, enabling or modifying features as necessary to support platform flexibility and customer requirements.
  • Troubleshooting: Diagnose and resolve complex system problems across the entire technology stack, including CI/CD pipelines, container-based systems, networking, operating systems, cloud resources, and databases. Must have very strong troubleshooting skills.
  • Monitoring & Alerting: Implement and manage monitoring and alerting infrastructure for critical services, ensuring stability and performance across all platform components.
  • Automation & Runbooks: Create, revise, and test operational runbooks and automation scripts to maintain infrastructure efficiently and securely.
  • Operational Innovation: Proactively seek opportunities for innovation to enhance operational processes, increasing reliability, availability, and performance while promoting a security-first culture.
  • On-Call Support: Participate in an on-call rotation (7am-7pm, 7 days a week, rotating every three weeks) to support 24/7 operations and ensure system availability.
  • Documentation: Author technical documentation for designs, workflows, processes, and best practices.
  • External Customer Focus: Provide direct support for external customer requirements, ensuring that solutions align with customer needs and expectations.
  • Quality & Security: Embody a Quality-first & Security-first culture in all that you do.

Must-Have Requirements:

  • 5+ years of experience with Azure (or AWS/GCP) for cloud infrastructure.
  • Strong experience with Terraform for infrastructure automation.
  • Strong experience with Kubernetes in production.
  • Proficiency in Helm for managing Kubernetes applications.
  • 5+ years of coding experience in Python.
  • Experience using Infrastructure as Code (IaC) and CI/CD tools like FluxCD, Jenkins, Terraform, or GitHub.
  • Strong experience with Linux operating systems.
  • Solid working knowledge of networking (TCP/IP, DNS) and cloud infrastructure performance.
  • Operational experience with monitoring/alerting systems such as Sentry, Opsgenie, or Prometheus.
  • Must have production operations and client-facing experience.
  • Willingness to mentor junior team members and contribute to technical documentation for workflows and best practices.
  • Hands-on problem-solver with the ability to balance risk and impact to customers.

These skills are a plus:

  • Experience with elements of the current tech stack: FluxCD, Prometheus, Elasticsearch, Java, Kafka, Postgres, and Jenkins.
  • Previous experience or a keen interest in industrial IoT, analytics, or manufacturing.


  • San Francisco, California, United States Twelve Labs Full time

    About the Role: At Twelve Labs, we are seeking a highly skilled Cloud Infrastructure Engineer to join our team. As a key member of our engineering team, you will play a pivotal role in designing, implementing, and maintaining our cloud infrastructure platform. This is an exciting opportunity to be part of a dynamic team and contribute to the growth and success...


  • San Francisco, California, United States ClassDojo Full time

    About the Role: We are seeking a highly skilled Cloud Infrastructure Engineer to join our team at ClassDojo. As a key member of our engineering team, you will be responsible for designing, building, and maintaining our cloud-based infrastructure. Key Responsibilities: Infrastructure Development: Design, develop, and deploy scalable and secure cloud-based...


  • San Francisco, California, United States HashiCorp Full time

    About the Role: We are seeking a skilled Cloud Infrastructure Engineer to join our Terraform Enterprise team at HashiCorp. As a key member of our team, you will be responsible for designing, implementing, and maintaining our cloud infrastructure, ensuring seamless user experiences for our customers. Key Responsibilities: Design and implement scalable cloud...


  • San Francisco, California, United States Odaseva Full time

    At Odaseva, we are seeking a skilled Cloud Infrastructure Engineer to join our Infrastructure team. As a Cloud Infrastructure Engineer, you will play a pivotal role in bridging the gap between development, operations, and security. Key Responsibilities: Design, provision, and manage cloud resources on Azure and AWS, ensuring optimal performance, scalability,...


  • San Francisco, California, United States Twelve Labs Full time

    About the Role: We are seeking a highly skilled Cloud Infrastructure Engineer to join our team at Twelve Labs. As a key member of our engineering team, you will play a pivotal role in designing, implementing, and maintaining our cloud infrastructure platform. This is an exciting opportunity to be part of a dynamic team and contribute to the growth and success...


  • San Francisco, California, United States Akraya Full time

    Job Summary: We are seeking a highly skilled and experienced professional to apply their engineering skills across various technology solutions and build upon those skills in product delivery from inception through design, build, and deployment. The role involves working multifunctionally with diverse teams in a rapidly evolving environment and developing...


  • San Francisco, California, United States Morph Labs Full time

    Company Overview: Morph Labs is a pioneering company that develops infrastructure to bring the next billion artificially intelligent software engineers online, faster. Backed by top investors, we work on some of the world's hardest problems in developing, evaluating, training, and deploying autonomous AI SWE agents. Job Summary: We are seeking seasoned...


  • San Francisco, California, United States Diverse Lynx Full time

    We are seeking a skilled Cloud Infrastructure Engineer to join our team at Diverse Lynx LLC. The ideal candidate will have a strong background in developing and deploying cloud-based infrastructure using code. Proficiency in Python programming and experience with Docker, Kubernetes, and containerization are essential. Additionally, the candidate should have...


  • San Francisco, California, United States Cisco Meraki Full time

    Job Description: Cisco Meraki is seeking a highly skilled Senior Cloud Infrastructure Engineer to join our team. As a key member of our Infrastructure SRE team, you will be responsible for designing, building, and maintaining our cloud infrastructure to support our growing customer base. Our cloud infrastructure is built on a highly scalable and secure...


  • San Francisco, California, United States Sight Machine, Inc. Full time

    About Sight Machine, Inc.: Sight Machine strengthens manufacturers by providing the industry's only standard data model and system-level visualization capabilities. By integrating all crucial data into a single innovative platform, everyone involved in the fabrication process can visualize, contextualize and examine data in one intuitive interface. Job...


  • San Francisco, California, United States Sight Machine, Inc. Full time

    About the Role: We are seeking a highly skilled Senior Cloud Infrastructure Engineer to join our team at Sight Machine, Inc. As a key member of our Cloud Infrastructure Team, you will be responsible for designing, deploying, and managing our cloud-based infrastructure to ensure high availability, scalability, and security. As a Senior Cloud Infrastructure...


  • San Francisco, California, United States Conduit Full time

    About Conduit: Conduit is a pioneering company in the crypto space, revolutionizing onchain compute with its rollup-native cloud platform. Our mission is to unlock the next wave of innovation in the crypto ecosystem by making infrastructure simple and onchain compute abundant. The Role: We're seeking a talented Cloud Infrastructure Engineer to join our elite...


  • San Francisco, California, United States Sight Machine, Inc. Full time

    About the Role: Sight Machine, Inc. is seeking a highly skilled Cloud Infrastructure Engineer to join our team. As a Cloud Infrastructure Engineer, you will be responsible for designing, deploying, and managing our cloud infrastructure to ensure high availability, scalability, and security. Key Responsibilities: Design and implement cloud infrastructure...


  • San Francisco, California, United States Humane USA Full time

    About the Role: At Humane USA, we're seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, building, and maintaining our cloud infrastructure to ensure high availability, scalability, and reliability. Key Responsibilities: Architect and implement cloud...


  • San Francisco, California, United States Hive Full time

    About Hive: Hive is a pioneering cloud-based AI solutions provider, trusted by hundreds of the world's largest and most innovative organizations. Our mission is to empower developers with cutting-edge AI models, serving billions of customer API requests every month. We offer turnkey software applications powered by proprietary AI models and datasets, enabling...


  • San Francisco, California, United States Dashworks Inc Full time

    About the Role: We're seeking an experienced Cloud Infrastructure Engineer to join our team at Dashworks Inc. As an early engineer, you'll have the opportunity to materially impact and scale not just the company's engineering but also the team and the culture. You'll drive key engineering initiatives end-to-end, working alongside exceptionally strong co-workers...


  • San Francisco, California, United States Waabi Full time

    Transforming the Future of Transportation: Waabi is a pioneering AI company revolutionizing the self-driving technology landscape. As a key member of our Infrastructure team, you will play a crucial role in designing, implementing, and troubleshooting cloud systems that enable safe and efficient self-driving at scale. With a strong background in computer...


  • San Francisco, California, United States Conduit Full time

    About Conduit: Conduit is a pioneering company in the crypto space, building the rollup-native cloud platform that will scale Ethereum. Our mission is to unlock the next wave of innovation in the crypto ecosystem by making infrastructure simple and onchain compute abundant. The Role: We're seeking a talented Cloud Infrastructure Engineer to join our elite team....


  • San Francisco, California, United States Provable Full time

    Job Title: Senior Cloud Infrastructure Engineer. Job Summary: We are seeking an experienced Senior Cloud Infrastructure Engineer to join our dynamic Infrastructure Team at Provable. The ideal candidate will have a strong background in AWS cloud infrastructure and be adept at managing and optimizing cloud services, implementing monitoring solutions, and ensuring...


  • San Francisco, California, United States ClassDojo Full time

    About ClassDojo: ClassDojo is a pioneering education technology company dedicated to accelerating humanity's progress by providing every child on Earth with an education they love. Our mission is to connect teachers, children, and families through a network that fosters a love of learning and empowers educators to create a better future for all. Job...