Lead Site Reliability Engineer
4 weeks ago
Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight distribution centers. BJ’s Wholesale Club offers a collaborative and inclusive environment where all team members can learn, grow and be their authentic selves. Together, we’re committed to providing outstanding service and convenience to our members, helping them save on the products and services they need for their families and homes.
The Benefits of working at BJ’s
• BJ’s pays weekly
• Eligible for free BJ's Inner Circle and Supplemental membership(s)*
• Generous time off programs to support busy lifestyles*
o Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
• Benefit plans for your changing needs*
o Three medical plans**, Health Savings Account (HSA), two dental plans, vision plan, flexible spending
• 401(k) plan with company match (must be at least 18 years old)
*eligibility requirements vary by position
**medical plans vary by location
As a Lead Site Reliability Engineer, you will be responsible for designing, building, monitoring, and continuously improving our ecommerce platform's infrastructure and processes. Leveraging your expertise in observability tools such as New Relic and Scalyr/Splunk, along with bash and Python scripting, you will play a pivotal role in ensuring the reliability and performance of our Java microservices-based architecture.
Key Responsibilities:
- Design and manage Java-based microservices, bash scripts, Redis, and high-availability designs while strictly adhering to Site Reliability Engineering (SRE) principles.
- Thrive in high-pressure environments, working swiftly and reliably to maintain system integrity and meet service level objectives (SLOs), as measured by service level indicators (SLIs).
- Proactively identify and address potential issues before they impact operations, utilizing observability tools like New Relic and Scalyr/Splunk together with bash and Python scripts.
- Lead initiatives to enhance current systems and implement innovative solutions in collaboration with a fast-paced, mission-driven team, focusing on the implementation of SRE best practices.
- Conduct thorough root-cause analyses for production incidents and generate high-quality RCA reports, leveraging SRE methodologies to prevent recurrence.
- Apply software engineering principles to rectify operational challenges and optimize system performance, with a specific focus on implementing SRE-driven solutions.
- Ensure the availability, latency, performance, efficiency, and security of our infrastructure, adhering rigorously to SRE principles and best practices.
- Design and maintain robust production monitoring systems to ensure timely detection and resolution of issues, following SRE guidelines for effective monitoring and alerting.
- Utilize a diverse array of tools to troubleshoot performance and stability issues effectively, employing SRE methodologies to identify and mitigate bottlenecks.
- Evaluate and enhance application and environment security measures, integrating SRE-driven security practices into the development and deployment pipelines.
- Provide support for globally distributed, multi-cloud (public and/or private) environments, implementing SRE strategies for resilience and fault tolerance.
- Automate repetitive tasks at scale to streamline operational workflows and enhance efficiency, focusing on the implementation of SRE-driven automation solutions.
- Adhere to change management processes during implementations and utilize version control for application infrastructure, following SRE principles for reliable and auditable change management.
- Foster an SRE mindset throughout the organization, promoting collaboration and shared responsibility for reliability and performance.
Qualifications:
- Bachelor's Degree in Computer Science or related field, or foreign equivalent.
- Demonstrated curiosity and self-drive to tackle complex challenges and drive change in a diverse organizational landscape.
- Excellent written and verbal communication skills, with the ability to effectively communicate with engineering management, developers, and leadership.
- Proven ability to adapt to new technologies and learn quickly.
- Minimum of 5 years of experience in Site Reliability Engineering (SRE) or related roles.
Job Conditions:
- Collaborate within a diverse and global team environment.
- Participate in cross-training with other team members across different regions.
- Rotate in an on-call schedule as required to ensure 24/7 availability and support for critical systems.
- Full-time position based in Marlborough, United States.