Lead Site Reliability Engineer
2 weeks ago
Location: BJ's Club Support Center Marlborough, MA #5997
Time type: Full time
Posted: 2 Days Ago
Job requisition ID: R147855
Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight distribution centers. BJ’s Wholesale Club offers a collaborative and inclusive environment where all team members can learn, grow and be their authentic selves. Together, we’re committed to providing outstanding service and convenience to our members, helping them save on the products and services they need for their families and homes.
The Benefits of working at BJ’s
• BJ’s pays weekly
• Generous time off programs to support busy lifestyles*
  o Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
• Benefit plans for your changing needs*
  o Three medical plans**, Health Reimbursement Account (HRA), Health Savings Account (HSA), two dental plans, flexible spending
*eligibility requirements vary by position
**medical plans vary by location
As a Lead Site Reliability Engineer, you will be responsible for designing, building, monitoring, and continuously improving our ecommerce platform's infrastructure and processes. Leveraging your expertise in observability tools such as New Relic and Scalyr/Splunk, along with Bash and Python scripting, you will play a pivotal role in ensuring the reliability and performance of our Java microservices-based architecture.
Key Responsibilities:
Design and manage Java-based microservices, Bash scripts, Redis, and high-availability designs while strictly adhering to Site Reliability Engineering (SRE) principles.
Thrive in high-pressure environments, working swiftly and reliably to maintain system integrity and meet service level objectives (SLOs) as measured by service level indicators (SLIs).
Proactively identify and address potential issues before they impact operations, utilizing observability tools like New Relic and Scalyr/Splunk, as well as Bash and Python scripts.
Lead initiatives to enhance current systems and implement innovative solutions in collaboration with a fast-paced, mission-driven team, focusing on the implementation of SRE best practices.
Conduct thorough root-cause analyses for production incidents and generate high-quality RCA reports, leveraging SRE methodologies to prevent recurrence.
Apply software engineering principles to rectify operational challenges and optimize system performance, with a specific focus on implementing SRE-driven solutions.
Ensure the availability, latency, performance, efficiency, and security of our infrastructure, adhering rigorously to SRE principles and best practices.
Design and maintain robust production monitoring systems to ensure timely detection and resolution of issues, following SRE guidelines for effective monitoring and alerting.
Utilize a diverse array of tools to troubleshoot performance and stability issues effectively, employing SRE methodologies to identify and mitigate bottlenecks.
Evaluate and enhance application and environment security measures, integrating SRE-driven security practices into the development and deployment pipelines.
Provide support for globally distributed, multi-cloud (public and/or private) environments, implementing SRE strategies for resilience and fault tolerance.
Automate repetitive tasks at scale to streamline operational workflows and enhance efficiency, focusing on the implementation of SRE-driven automation solutions.
Adhere to change management processes during implementations and utilize version control for application infrastructure, following SRE principles for reliable and auditable change management.
Foster an SRE mindset throughout the organization, promoting collaboration and shared responsibility for reliability and performance.
Qualifications:
Bachelor's Degree in Computer Science or related field, or foreign equivalent.
Demonstrated curiosity and self-drive to tackle complex challenges and drive change in a diverse organizational landscape.
Excellent written and verbal communication skills, with the ability to effectively communicate with engineering management, developers, and leadership.
Proven ability to adapt to new technologies and learn quickly.
Minimum of 5 years of experience in Site Reliability Engineering (SRE) or related roles.
Job Conditions:
Collaborate within a diverse and global team environment.
Participate in cross-training with other team members across different regions.
Rotate in an on-call schedule as required to ensure 24/7 availability and support for critical systems.
In accordance with the Pay Transparency requirements, the following represents a good faith estimate of the compensation range for this position. At BJ’s Wholesale Club, we carefully consider a wide range of non-discriminatory factors when determining salary. Actual salaries will vary depending on factors including but not limited to location, education, experience, and qualifications. The pay range for this position starts at $109,000.00.
About Us
At BJ's Wholesale Club, we’re focused on delivering unbeatable value and outstanding service to our members and communities.
Headquartered in Marlborough, Massachusetts, BJ's Wholesale Club is a leading operator of membership warehouse clubs in the Eastern United States. Currently operating more than 235 clubs, over 165 BJ's Gas locations and eight distribution centers, we were the first retailer to introduce the warehouse club concept in the northeastern United States. Providing a curated assortment of grocery, general merchandise, gasoline and ancillary services, BJ's offers a differentiated shopping experience that is further enhanced by our omnichannel capabilities.
-
Site Reliability Engineer
3 days ago
Jersey City, United States BCforward Full time Job Title: Site Reliability Engineer (AWS) (SRE) Type: W2 (Strictly No C2C and no sponsorship available) Location: Jersey City or Plano or Delaware (Hybrid) Duration: 9 Months Contract to hire Hybrid Model: 3 Days onsite, 2 days remote a. Skillset AWS, Big Data, Spark, Python, Shell / Perl Scripting, Control-M, Autosys. Grafana, AppDynamics, APICA b. Experience•...
-
Site Reliability Engineer
3 weeks ago
Jersey City, United States Pinnacle Group, Inc. Full time W2 only - Preferred Citizen or Green Card Holder Contract to Hire Must Have: AWS Certification 7-8 years of experience and 2 years of AWS exp Tools: Grafana, DataDog Database: MySQL or Oracle -Unix, Linux, Shell Scripting, LAN, NFS -Python, Go Lang, Terraform, Jenkins -Docker, Kubernetes Site Reliability Engineer (AWS) (SRE) Roles and Responsibilities: • Design,...
-
Site Reliability Engineer
3 weeks ago
Redwood City, California, United States C3 Full time We are looking for a Site Reliability Engineer to join our team at our HQ in Redwood City, CA. Responsibilities: Maximize system uptime and availability, ensuring functional and performance SLAs. Establish end-to-end monitoring and alerting on all critical aspects. Solve complex problems for critical services and build automation to prevent problem...
-
Site Reliability Engineer
5 days ago
Salt Lake City, United States Global Channel Management Full time Site Reliability Engineer (SRE) Specialist. 4+ years of experience in a similar support engineering or site reliability engineering (SRE) role. Proficiency in standard RPE with excellent written and verbal communication skills. Strong knowledge of Linux operating systems. Hands-on experience with Python for automating...
-
Aumni - Site Reliability Engineer III
1 week ago
Jersey City, New Jersey, United States tapwage Full time There's nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Digital Private Markets /Aumni (A JP Morgan Chase Company), you will solve complex...
-
Site Reliability Engineer
2 weeks ago
Foster City, United States Zoox Full time Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service from designing systems that are easy to maintain and fault-tolerant through...
-
Site Reliability Engineer
7 days ago
Kansas City, United States Gorilla Logic Full time Gorilla Logic: Mid-Level Site Reliability Engineer (SRE) Gorilla Logic provides nearshore Agile teams to Fortune 500 and SMB companies, bringing unparalleled expertise in the delivery of full-stack web, mobile, and enterprise applications. Our highly collaborative Agile Gorillas are uniquely qualified to implement complex software initiatives. With offices...
-
Site Reliability Engineer
3 days ago
Salt Lake City, United States Goldman Sachs Full time MORE ABOUT THIS JOB: Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the...
-
Site Reliability Engineer
2 weeks ago
Jersey City, New Jersey, United States Devexperts Full time Company Description: Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you'll become a part of a company that fosters self-improvement and actively seeks...
-
Senior Site Reliability Engineer
7 days ago
Kansas City, United States Gorilla Logic Full time Gorilla Logic Overview: Gorilla Logic provides nearshore Agile teams to Fortune 500 and SMB companies, bringing unparalleled expertise in the delivery of full-stack web, mobile, and enterprise applications. Our highly collaborative Agile Gorillas are uniquely qualified to implement complex software initiatives. With offices in the United States, Costa Rica,...
-
Safety and Reliability Engineer
4 days ago
Oklahoma City, United States Ajsearch Full time Safety & Reliability Engineer. About Us: Our client is a transatlantic cutting-edge aerospace company developing solar powered aircraft solutions capable of achieving perpetual flight with heavy, and powerful payload capacity. Utilizing technology based upon the longest continuous renewably powered flight program in history, this fast-growing startup is...
-
Reliability Specialist
3 weeks ago
Salt Lake City, United States bioMerieux SA Career Site - MULTI-LINGUAL Full time Description. Position Summary: The reliability specialist is responsible for overseeing data monitoring, data reporting, investigations, and action plans for product and process performance within the instrument department. This role will also lead cross-functional technical teams through business-critical failure investigations and resolutions. Primary...
-
SRE / Site Reliability Engineer// W2 Only
6 days ago
Arizona City, United States Brothers Consulting Full time Key Skills: • Experience with one or more Cloud Platforms (Azure, GCP) • Experience with Container technologies: Kubernetes, Docker, PKS, Azure Kubernetes Service (AKS) • 5+ years of experience in Site Reliability engineering • Experience setting up monitoring in applications and database. • Experience in ServiceNow, Jira,...
-
Site Reliability Engineer
7 days ago
Jersey City, United States DevExperts Full time Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming...
-
Site Reliability Engineer
3 days ago
Jersey City, United States DevExperts Full time Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you'll become a part of a company that fosters self-improvement and actively seeks out-of-the-box ideas. Our teams...
-
Site Reliability Engineer
2 weeks ago
Jersey City, United States Devexperts Full time Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you’ll become a part of a company that fosters self-improvement and actively seeks out-of-the-box ideas. Our...