Site Reliability Engineer 1
2 days ago
About the Team/Role
The WEX Site Reliability Engineering (SRE) team is looking for a motivated and quick-learning Level 1 Site Reliability Engineer to join our growing team. We are passionate about developing software and solutions for observability, incident response, reliability, performance, operational excellence, and compliance. As a member of the SRE organization, you will support internal stakeholders and Engineering teams, tackling complex challenges and enhancing our engineering teams' and customers' experience. You will have the opportunity to work alongside experienced SREs and gain valuable hands-on experience in a dynamic and supportive environment.
As a Level 1 SRE at WEX, you will:
- Learn the fundamentals of SRE: Gain a solid understanding of core SRE principles, including monitoring, incident management, and automation.
- Develop basic automation scripts: Use scripting languages like Python or Bash to automate simple tasks and improve operational efficiency.
- Triage and resolve incidents: Participate in on-call rotations, assisting with the identification and resolution of incidents under the guidance of senior SREs.
- Monitor system health: Utilize monitoring tools to identify and escalate potential issues, ensuring the stability and performance of our systems.
- Collaborate with development teams: Work closely with software engineers to understand their systems and provide operational support.
- Contribute to documentation: Help maintain and improve internal documentation, including runbooks, knowledge base articles, and playbooks.
- Continuously learn and grow: Expand your knowledge of cloud technologies, DevOps practices, and SRE tools through internal and external training opportunities.
How you'll make an impact
- Develop a basic understanding of code, networking, operating systems, and storage solutions: You'll be able to identify and troubleshoot common issues related to these areas.
- Assist in developing automation and utilizing monitoring tools to ensure system reliability: You'll learn how to use tools to automate tasks and monitor system health.
- Participate in incident response and troubleshooting alongside senior SREs: You'll gain experience in identifying, escalating, and resolving incidents.
- Participate in 24x7 Site Reliability rotations and escalation workflows with guidance from senior team members: You'll learn how to respond to incidents and escalate issues appropriately.
- Learn to identify and address basic performance bottlenecks: This will include understanding code optimization, configuration changes, and infrastructure upgrade recommendations.
- Collaborate with development teams to ensure software design meets operational requirements: You'll learn how to communicate effectively with developers and advocate for operational best practices.
- Assist with support requests from other engineering teams so that their operational needs are met: You'll gain experience in providing support and collaborating with different teams.
- Contribute to the continuous improvement of processes and procedures to increase system reliability and efficiency: You'll participate in team discussions and contribute ideas for improvement.
- Stay up-to-date with the latest industry trends and technologies: You'll be encouraged to learn new technologies and share your knowledge with the team.
Experience you'll bring
- Basic understanding of at least one major programming language, such as C#, Java, Go, or Python: You should be able to read and understand code, and write scripts.
- Familiarity with a Cloud Computing platform (AWS, Azure, or GCP): You should have a basic understanding of cloud concepts and services.
- Strong communication and collaboration skills: You'll be working closely with different teams, so effective communication is essential.
- BA/BS degree in Computer Science or a related technical field, or equivalent job experience: A strong foundation in computer science principles is important.
Nice to have
- Basic understanding of infrastructure as code, preferably Terraform: Familiarity with IaC concepts and tools is a plus.
- Working knowledge of RESTful APIs: Understanding how APIs work is beneficial.
- Exposure to observability and logging technologies: Any experience with monitoring and logging tools is helpful.
- Experience with at least one major RDBMS and one NoSQL data store: Familiarity with databases is a plus.
- Exposure to containerization technologies such as Docker or Kubernetes: Basic knowledge of containers and orchestration is beneficial.
- Familiarity with GitOps: Understanding of GitOps principles is helpful.
-
Site Reliability Engineer
1 month ago
Boston, United States Space Executive Full time
My client, a Series C hyper-growth cybersecurity scale-up collaborating with leading brands across the automotive industry, is seeking a Senior Site Reliability Engineer to join their team. This is a permanent, remote role. The salary will be market leading, and additional benefits will include equity and a bonus scheme. This is a...
-
Site Reliability Engineering Manager
4 weeks ago
Boston, Massachusetts, United States Klaviyo Full time
Klaviyo is committed to empowering creators to own their destiny by making first-party data accessible and actionable like never before. To achieve this goal, we need a talented Site Reliability Engineering Manager to join our team. The Site Reliability Engineering Manager will be responsible for leading a team of Site Reliability Engineers in Klaviyo's...
-
Lead Site Reliability Engineer
1 month ago
Boston, Massachusetts, United States Klaviyo Full time
About the Role
We're seeking a skilled Site Reliability Engineer to join our team at Klaviyo. As a Site Reliability Engineer, you will be responsible for ensuring the availability and scalability of our systems, as well as collaborating with product teams to deliver high-quality software.
Key Responsibilities
Design and develop systems and processes to enable...
-
Senior Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Klaviyo Full time
At Klaviyo, we value the unique backgrounds, experiences, and perspectives each team member brings to our workplace every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond traditional job requirements. Want to learn more about life at Klaviyo? Visit our website to see how we empower creators to...
-
Senior Site Reliability Engineer
1 month ago
Boston, Massachusetts, United States Klaviyo Full time
Unlock Your Potential as a Senior Site Reliability Engineer at Klaviyo
We're on a mission to empower creators to own their destiny, and we need talented individuals like you to help us achieve it. As a Senior Site Reliability Engineer at Klaviyo, you'll play a critical role in ensuring the reliability, scalability, and security of our platform.
Key...
-
Senior Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Veradigm Full time
Transforming Healthcare with Veradigm
Welcome to Veradigm, where our mission is to harness the power of research, analytics, and artificial intelligence to develop scalable data-driven solutions that bring significant value to all healthcare stakeholders. As a Senior Site Reliability Engineer, you will be part of a dynamic team that is dedicated to delivering...
-
Staff Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Zscaler Full time
About Zscaler
Zscaler is a leading cloud security company that provides a secure platform for enterprises to connect users, devices, and applications. As a Staff Site Reliability Engineer - Federal, you will play a critical role in ensuring the security and reliability of our cloud infrastructure.
Key Responsibilities
Oversee operational tasks for FedRAMP cloud...
-
Site Reliability Engineer
1 month ago
Boston, Massachusetts, United States Oracle Full time
Job Title: Senior Site Reliability Engineer
Oracle is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our cloud infrastructure team, you will be responsible for designing, deploying, and optimizing Oracle Health applications.
About the Role
This is a unique opportunity to work with a world-leading cloud solutions...
-
Staff Site Reliability Engineer
2 weeks ago
Boston, United States Zscaler Full time
We're looking for an experienced Staff Site Reliability Engineer (Federal) to join our ZPA team, reporting to the Senior Manager SRE. This role requires Secret Security Clearance that you must maintain throughout employment. An Information Assurance Technician Level 2 Certification is also required, but you can obtain that within your first few weeks of...
-
Staff Site Reliability Engineer
2 weeks ago
Boston, United States Zscaler Full time
Our Engineering team built the world's largest cloud security platform from the ground up, and we keep building. With more than 100 patents and big plans for enhancing services and increasing our global footprint, the team has made us and our multitenant architecture today's cloud security leader, with more than 15 million users in 185 countries. Bring your...
-
Staff Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Zscaler Full time
About Zscaler
Zscaler is a leading cloud security company that serves thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, Zscaler's mission is to make the cloud a safe place to do business and provide a seamless experience for enterprise users. As the operator of the world's largest security cloud, Zscaler...
-
Staff Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Zscaler Full time
About Zscaler
Zscaler is a leading provider of cloud-based security solutions, serving thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Founded in 2007, our mission is to make the cloud a safe and secure place for businesses to operate. As the operator of the world's largest security cloud, Zscaler accelerates digital...
-
Senior Software Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Klaviyo Full time
We're looking for a skilled Senior Site Reliability Engineer to join our team at Klaviyo. As a key member of our Site Reliability Engineering team, you will be responsible for designing, building, and delivering software to improve the availability, scalability, and efficiency of our services.
Key responsibilities include: Designing and developing systems and...
-
Reliability Engineer
2 weeks ago
Boston, United States MENTOR Technical Group Corporation Full time
Mentor Technical Group (MTG) provides a comprehensive portfolio of technical support and solutions for the FDA-regulated industry. As a world leader in life science engineering and technical solutions, MTG has the knowledge and experience to ensure compliance with pharmaceutical, biotechnology, and medical device safety and efficacy guidelines. With offices...
-
Reliability Engineer
3 days ago
Boston, United States ZipRecruiter Full time
Mentor Technical Group (MTG) provides a comprehensive portfolio of technical support and solutions for the FDA-regulated industry. As a world leader in life science engineering and technical solutions, MTG has the knowledge and experience to ensure compliance with pharmaceutical, biotechnology, and medical device safety and efficacy guidelines. With offices...
-
Site Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States Air Space Intelligence Full time
About Air Space Intelligence
Air Space Intelligence is a software-first aerospace company that develops AI-powered mission control systems to ensure the world's most complex air operations succeed. We serve major U.S. airlines as well as U.S. and allied government organizations. Our software is used in mission-critical operations to provide our partners with a...
-
Data Reliability Engineer
4 weeks ago
Boston, Massachusetts, United States WHOOP Full time
At WHOOP, we're on a mission to unlock human performance. Our innovative data platforms are the game-changing connective tissue flowing vital resources to teams, applications, and insightful solutions that power real-time AI, cutting-edge science, and bold visionary decision-making. As a Data Reliability Engineer at WHOOP, you will play a crucial role in...