Site Reliability Engineer

1 week ago


San Jose, United States Altimetrik Full time

We are looking to hire a Site Reliability Engineer.


Educational Background: Holds a bachelor’s or master’s degree in computer science, information technology, or a related technical field. Alternatively, significant work experience in DevOps or cloud infrastructure management could offset a formal degree requirement.

Cloud Infrastructure Expertise: Has at least 4 years of hands-on experience specifically with AWS, including launching, maintaining, and scaling applications on this platform. The candidate should also demonstrate familiarity with core AWS services, security practices, and cost management.

Performance Monitoring and Troubleshooting: Proven track record in monitoring and troubleshooting performance issues across cloud environments. This includes experience tuning for optimal performance and ensuring high availability in production environments.

DevOps Automation Skills: Proficient in scripting languages like Python, Ruby, or Bash for automation and pipeline development within a DevOps framework. The candidate should also be able to use these skills for tasks like CI/CD, configuration management, and infrastructure as code (IaC).

Programming Proficiency: Comfortable working in at least one of the following languages: Java, Python, or Ruby. This experience helps the candidate understand and work with software applications from both an operational and developmental perspective.

Containerization and Orchestration: Hands-on knowledge of Docker and Kubernetes, along with tools like ArgoCD for GitOps workflows. Experience in this area is essential to manage and scale containerized applications effectively.

Monitoring and Observability Tools: Proficient with monitoring and observability tools such as Splunk, Wavefront, AppDynamics, Prometheus, and other tracing tools. Experience in configuring and maintaining these tools is necessary to ensure visibility across the infrastructure.



  • San Jose, California, United States HireIO Inc Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at HireIO Inc. As a Site Reliability Engineer, you will be responsible for designing and developing solutions to automate the technical operations of large-scale systems, working closely with teams to improve stability from a Software Development Lifecycle...


  • San Jose, United States NInfo Systems, Inc. Full time

    Company Description: NInfo Systems Inc. is a certified minority-owned national IT recruiting and solutions provider with two decades of experience. It works with Fortune 500 corporations, mid-sized companies, boutique consulting companies, startups, SME-level organizations, federal/state agencies, and tier-one vendors. Role: Senior Reliability Engineer, Hybrid...


  • San Jose, United States PDSSOFT INC. Full time

    Site Reliability Engineer (SRE) / AWS DevOps Engineer. Location: San Jose, CA. Duration: Long Term. Job Description: We are seeking a highly skilled Site Reliability Engineer (SRE) with expertise in GitHub Actions, AWS DevOps, Helm Charts, and YAML configuration. The ideal candidate will be responsible for ensuring the reliability, scalability, and efficiency of our...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer, AI Platform Training. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and security of our AI Platform. About the Role: Identify and implement methodologies and solutions to...


  • San Leandro, California, United States Omni Inclusive Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Omni Inclusive. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our Digital Sales & Marketing platforms. Key Responsibilities: Design, implement, and maintain scalable and efficient systems to...


  • San Francisco, United States WEX Full time

    The WEX Site Reliability Engineering (SRE) team is seeking an entry-level Site Reliability Engineer Level 1 who is passionate about learning and growing in the field of software development and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Benefits...


  • San Francisco, California, United States Outdefine Full time

    About the Job: We are seeking a highly skilled Site Reliability Engineer to join our team at Outdefine. As a key member of our engineering team, you will be responsible for ensuring the reliability, scalability, and performance of our ecommerce platform. Key Responsibilities: Design and implement scalable and highly available cloud infrastructure using Kubernetes...


  • San Jose, California, United States TikTok Full time

    Job Summary: TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. As a Site Reliability Engineer on our Compute Platform team, you will play a critical role in ensuring the reliability of all Big Data services and products across the company. Key Responsibilities: Responsible for the reliability of...


  • San Francisco, California, United States Roman Health Pharmacy LLC Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Xero. As a key member of our Reliability Enablement team, you will play a critical role in ensuring the reliability and performance of our systems. Key Responsibilities: Investigate operational surprises and support teams in post-incident activities. Conduct in-depth...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineering Manager, AI Platform. About the Role: We are seeking an experienced Site Reliability Engineering Manager to lead our AI Inference Platform team at Adobe. As a key member of our Engineering organization, you will be responsible for developing and implementing strategies to ensure the reliability, scalability, and security...


  • San Francisco, California, United States Swish Analytics Full time

    Site Reliability Engineer at Swish Analytics. Swish Analytics is a sports analytics and betting startup that's revolutionizing the industry with cutting-edge predictive data products. We're on a mission to make oddsmaking a challenge rooted in engineering, mathematics, and sports betting expertise, not intuition. We're looking for a team-oriented...


  • San Francisco, United States Ellation, Inc. Full time

    Who We Are: We’re a cast of characters working to shine a spotlight on anime. Crunchyroll is an international business focused on creating both online and offline experiences for fans through content (licensed, co-produced, originals, distribution), merchandise, events, gaming, news, and more. Visit our About Us pages for more information about our...


  • San Jose, United States Triune Infomatics Inc Full time

    Role: Senior Site Reliability Manager. Full-Time - Hybrid. Local to San Jose, CA. The Client is a simple and scalable cloud-based IoT edge orchestration solution that delivers visibility, control, and security for the distributed edge. Their platform allows customers to seamlessly manage and deploy any compute node, unlocking the value of IoT data, enabling...


  • San Jose, United States EVONA Full time

    Site Reliability Engineer (SRE). Location: San Francisco Bay Area. Role Overview: We are seeking a highly skilled Site Reliability Engineer (SRE) to join a dynamic team at a rapidly growing technology company. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of mission-critical systems, while implementing automation...


  • San Francisco, California, United States WEX Full time

    Job Summary: The WEX Site Reliability Engineering team is seeking a highly motivated and quick-learning individual to join our team as a Site Reliability Engineer Level 1. As a key member of our team, you will be responsible for ensuring the reliability, performance, and security of our systems. Key Responsibilities: Actively participate in training and...


  • San Francisco, California, United States SpeedCast Full time

    Job Summary: Speedcast is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining the reliability of our cloud-based infrastructure. Key Responsibilities: Analyze and design continuous integration/continuous delivery pipelines to ensure seamless...