Site Reliability Engineer
4 weeks ago
--
An SRE is responsible for maintaining reliability. That means facilitating automated, streamlined, and efficient error responses and reducing human error at scale. SREs spend a lot of time removing pain points, configuring internal tools, and setting and testing system benchmarks. They also develop and monitor robust engineering pipelines for everyday product operability.
Site Reliability Engineers at Material Bank are responsible for performance, availability, reliability, efficiency, change management, monitoring, and emergency response of a system. Other core tasks of SREs include:
Establishing and monitoring Service-Level Indicators (SLIs) and setting Service-Level Objectives (SLOs)
– SREs facilitate proper SLIs for efficient performance through proper resource utilization, with minimal errors. They also set SLOs for reviewing internal targets, such as high availability.
Risk assessments and error budgeting
– SREs are responsible for establishing the reliability target for systems, even taking measured risks with subsequent product launches.
Monitoring outputs
– Ticketing, logging, and alerts (signifying different levels of needed human actions) are critical tasks for an SRE.
Demand forecasting and capacity planning
– Projects require careful assessments to plan for future demand, outages, and emergencies. An SRE works in conjunction with product heads to perform these tasks.
Collaboration
– SREs must collaborate with many diverse teams, sharing best practices and reviewing reliability decisions to improve cross-departmental product development.
Writing Postmortems
– Postmortem reports help the team learn from incidents to prevent their recurrence.
What you'll do:
Lead the handling of the ticket queue (Material Bank production issues) for AWS and GCP corporate infrastructure requests from team members. This ranges from simple IAM and Route 53 DNS requests to designing and deploying new scalable application infrastructure.
Be on an on-call (Opsgenie) rotation to respond to incidents that impact GitLab.com availability, and provide support for service engineers with customer incidents.
Use your on-call shift to prevent incidents from ever happening.
Run our infrastructure with Ansible, Terraform, ArgoCD, Bitbucket CI/CD, and Kubernetes.
Build monitoring with New Relic that alerts on symptoms rather than on outages.
Document every action so your findings turn into repeatable actions and then into automation.
Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
Design, build, and maintain core infrastructure that enables Material Bank to scale to support thousands of concurrent users.
Debug production issues across services and levels of the stack.
Plan the growth of Material Bank’s infrastructure.
What you’ll get from us:
Our people
: If you thrive in an inclusive, innovative, and fast-paced organization, look no further. You will get to work alongside some of the brightest minds. Join a genuinely fun and supportive workplace where we keep our employees consistently engaged through internal communication and corporate events.
Relaxation and Celebrations
: Generous PTO, sick days, paid national holidays, and even more (ask us about this when we connect).
Health Benefits
: We contribute to your medical, dental, vision, and short-term/long-term disability plans and have a strong employee assistance program.
Plan for your Retirement
: 401(k) eligible after your first 90 days of employment.
Giving Back
: We sponsor multiple events throughout the year to help out our communities. You will receive time off to give back as well.
Growth
: We’ll help you take your career to the next level. We want you to be creative and take initiative, which will allow you to grow and create within the company. Most importantly, be the best at what matters.
Flexible Work Schedules
: With business units and employees across the globe, Material Technologies has embraced a hybrid working model, allowing department leaders to decide on the best approach for their respective teams, whether that be remote, in person, or a little of both.