Site Reliability Engineer 1
2 months ago
About the Team/Role
The WEX Site Reliability Engineering (SRE) team is looking for a motivated and quick-learning Level 1 Site Reliability Engineer to join our growing team. We are passionate about developing software and solutions for observability, incident response, reliability, performance, operational excellence, and compliance. As a member of the SRE organization, you will support internal stakeholders and Engineering teams, tackling complex challenges and enhancing our engineering teams' and customers' experience. You will have the opportunity to work alongside experienced SREs and gain valuable hands-on experience in a dynamic and supportive environment.
As a Level 1 SRE at WEX, you will:
- Learn the fundamentals of SRE: Gain a solid understanding of core SRE principles, including monitoring, incident management, and automation.
- Develop basic automation scripts: Use scripting languages like Python or Bash to automate simple tasks and improve operational efficiency.
- Triage and resolve incidents: Participate in on-call rotations, assisting with the identification and resolution of incidents under the guidance of senior SREs.
- Monitor system health: Utilize monitoring tools to identify and escalate potential issues, ensuring the stability and performance of our systems.
- Collaborate with development teams: Work closely with software engineers to understand their systems and provide operational support.
- Contribute to documentation: Help maintain and improve internal documentation, including runbooks, knowledge base articles, and playbooks.
- Continuously learn and grow: Expand your knowledge of cloud technologies, DevOps practices, and SRE tools through internal and external training opportunities.
How you'll make an impact
- Develop a basic understanding of code, networking, operating systems, and storage solutions: You'll be able to identify and troubleshoot common issues related to these areas.
- Assist in developing automation and utilizing monitoring tools to ensure system reliability: You'll learn how to use tools to automate tasks and monitor system health.
- Participate in incident response and troubleshooting alongside senior SREs: You'll gain experience in identifying, escalating, and resolving incidents.
- Participate in 24x7 Site Reliability rotations and escalation workflows with guidance from senior team members: You'll learn how to respond to incidents and escalate issues appropriately.
- Learn to identify and address basic performance bottlenecks: This will include understanding code optimization, configuration changes, and infrastructure upgrade recommendations.
- Collaborate with development teams to ensure software design meets operational requirements: You'll learn how to communicate effectively with developers and advocate for operational best practices.
- Assist with support requests from other engineering teams so that their operational needs are met: You'll gain experience in providing support and collaborating with different teams.
- Contribute to the continuous improvement of processes and procedures to increase system reliability and efficiency: You'll participate in team discussions and contribute ideas for improvement.
- Stay up-to-date with the latest industry trends and technologies: You'll be encouraged to learn new technologies and share your knowledge with the team.
Experience you'll bring
- Basic understanding of at least one major programming language (C#, Java, Go, or Python): You should be able to read and understand code and write scripts.
- Familiarity with a Cloud Computing platform (AWS, Azure, or GCP): You should have a basic understanding of cloud concepts and services.
- Strong communication and collaboration skills: You'll be working closely with different teams, so effective communication is essential.
- BA/BS degree in Computer Science or related technical field or equivalent job experience: A strong foundation in computer science principles is important.
Nice to have
- Basic understanding of infrastructure as code, preferably Terraform: Familiarity with IaC concepts and tools is a plus.
- Working knowledge of RESTful APIs: Understanding how APIs work is beneficial.
- Exposure to observability and logging technologies: Any experience with monitoring and logging tools is helpful.
- Experience with at least one major RDBMS and NoSQL data store: Familiarity with databases is a plus.
- Exposure to containerization technologies such as Docker or Kubernetes: Basic knowledge of containers and orchestration is beneficial.
- Familiarity with GitOps: Understanding of GitOps principles is helpful.
-
Site Reliability Engineer 2
1 week ago
Boston, United States WEX Full time (*) This is a remote position; however, the candidate must reside within 30 miles of one of the following locations: Boston, MA and Portland, ME About the Team/Role The WEX Site Reliability Engineering (SRE) team seeks individuals passionate about developing software and solutions for observability, incident response, reliability, performance, operational...
-
Principal Site Reliability Engineer
2 months ago
Boston, United States Global InfoTek Full time Clearance Level: Clearance Eligible US Citizenship: Required Job Classification: Full Time Location: Remote Years of Experience: 7-10 years Education Level: Bachelor's degree in computer science, Mathematics, or equivalent technical degree; or equivalent industry experience. Position Description: The Site Reliability Engineer (SRE) must be able to build...
-
Site Reliability Engineer 2
1 week ago
Boston, United States WEX, Inc. Full time (*) This is a remote position; however, the candidate must reside within 30 miles of one of the following locations: Boston, MA and Portland, ME About the Team/Role The WEX Site Reliability Engineering (SRE) team seeks individuals passionate about developing software and solutions for observability, incident response, reliability, performance, operational...
-
Site Reliability Engineer 2
4 days ago
Boston, United States WEX Full time (*) This is a remote position; however, the candidate must reside within 30 miles of one of the following locations: Boston, MA and Portland, ME About the Team/Role The WEX Site Reliability Engineering (SRE) team seeks individuals passionate about developing software and solutions for observability, incident response, reliability, performance, operational...
-
Boston, United States Global InfoTek, Inc Full time Clearance Level: Clearance Eligible US Citizenship: Required Job Classification: Full Time Location: Remote Years of Experience: 7-10 years Education Level: Bachelor’s degree in computer science, Mathematics, or equivalent technical degree; or equivalent industry experience. Position Description: The Site Reliability Engineer (SRE) must be able to build...
-
Lead Site Reliability Engineer
6 days ago
Boston, United States Klaviyo Full time At Klaviyo, we value the unique backgrounds, experiences and perspectives each Klaviyo (we call ourselves Klaviyos) brings to our workplace each and every day. We believe everyone deserves a fair shot at success and appreciate the experiences each person brings beyond the traditional job requirements. If you're a close but not exact match with the...
-
Senior Site Reliability Engineer
2 weeks ago
Boston, United States The Hollister Group Full time Senior Site Reliability Engineer Our client is a financial services company looking to hire a Senior Site Reliability Engineer at their Boston offices. This individual will develop proactive processes for monitoring key infrastructural/application components and managing risk. The ideal candidate has strong experience with Azure cloud, Kubernetes, and...
-
Senior Site Reliability Engineer
1 month ago
Boston, United States The Hollister Group Full time Senior Site Reliability Engineer Our client is a financial services company looking to hire a Senior Site Reliability Engineer at their Boston offices. This individual will develop proactive processes for monitoring key infrastructural/application components and managing risk. The ideal candidate has strong experience with Azure cloud, Kubernetes, and...
-
Senior Site Reliability Engineer
6 days ago
Boston, United States Motional Full time We are seeking a capable and highly motivated Senior Engineer for our Site Reliability Engineering (SRE) team to enhance the reliability, performance, and scalability of our infrastructure platforms, effectively manage complex business-critical systems and deliver high-quality service to internal customers. You will join a talented team of SRE engineers and...
-
Senior Site Reliability Engineer
7 days ago
Boston, MA, United States Motional Full time We are seeking a capable and highly motivated Senior Engineer for our Site Reliability Engineering (SRE) team to enhance the reliability, performance, and scalability of our infrastructure platforms, effectively manage complex business-critical systems and deliver high-quality service to internal customers. You will join a talented team of SRE engineers and...
-
Senior Site Reliability Engineer
6 days ago
Boston, United States Red Hat Full time Red Hat is seeking a Site Reliability Engineer (SRE) to develop, scale, and operate our OpenShift managed cloud services. OpenShift is Red Hat’s enterprise Kubernetes distribution. As an SRE you will contribute to running OpenShift at scale by enabling customer self-service, making our monitoring system more sustainable, and eliminating work through...
-
Boston, United States The Hollister Group Full time Senior Site Reliability Engineer Our client is a financial services company looking to hire a Senior Site Reliability Engineer at their Boston offices. This individual will develop proactive processes for monitoring key infrastructural/application components and managing risk. The ideal candidate has strong experience with Azure cloud, Kubernetes, and...
-
Site Reliability Engineer
4 days ago
Boston, United States RANE Full time RANE Network is seeking a highly motivated person to join our dynamic team, dedicated to supporting the company’s users, applications and web based product offerings. In this role, the SRE will play a key role in maintaining resources at peak efficiency to guarantee staff are able to perform their functions efficiently and ensuring a good customer...
-
Site Reliability Engineer
5 days ago
Boston, United States RANE Full time RANE Network is seeking a highly motivated person to join our dynamic team, dedicated to supporting the company’s users, applications and web based product offerings. In this role, the SRE will play a key role in maintaining resources at peak efficiency to guarantee staff are able to perform their functions efficiently and ensuring a good customer...
-
Associate Site Reliability Engineer
7 hours ago
Boston, United States Red Hat Full time The Software Production Resilience team is seeking an Associate Site Reliability Engineer (ASRE) with passion for maintaining highly reliable cloud-based services. In this role, you will support Red Hat’s software manufacturing services on our hybrid cloud infrastructure. You will partner with development, quality engineering and release engineering...
-
Site Reliability Developer 4
3 days ago
Boston, United States Oracle Full time Job Description: A unique opportunity to join a rapidly growing world-class team to engineer cutting edge Oracle Cloud technologies and infrastructure that make up the Oracle Cloud solutions. As part of the SRE team, you will be continually challenged and have an opportunity to contribute to the Oracle Cloud Object Storage success every day. As a Site...
-
Jr Site Reliability Engineering
1 month ago
Boston, United States SCRAM Systems Full time Job Summary: We are seeking a motivated and detail-oriented Junior Site Reliability Engineer (SRE) to join our team. This position blends SRE principles with DevOps practices, making the team a key player in enabling seamless application delivery, scalability, and reliability. You will support both Azure Cloud environments and co-located data centers,...
-
RANE | Site Reliability Engineer
4 days ago
Boston, United States RANE Full time RANE Network is seeking a highly motivated person to join our dynamic team, dedicated to supporting the company’s users, applications and web based product offerings. In this role, the SRE will play a key role in maintaining resources at peak efficiency to guarantee staff are able to perform their functions efficiently and ensuring a good customer...
-
Site Reliability Engineer
4 days ago
Boston, United States StartUs GmbH Full time WHAT YOU’LL DO: As a member of a small cross functional squad, you’ll own a particular infrastructure challenge at Spotify. Design and document systems, including writing and reviewing code, to automate away problems within your squad’s domain. Undertake measured, methodical troubleshooting of complicated systems under pressure. Partake in an on-call...