Lead Site Reliability Engineer

2 weeks ago


Oklahoma City, United States BJ's Wholesale Club Full time

Locations: BJ's Club Support Center Marlborough, MA #5997

Time type: Full time

Posted on: Posted 2 Days Ago

Job requisition ID: R147855

Join our team of more than 34,000 team members, supporting our members and communities in our Club Support Center, 235+ clubs and eight distribution centers. BJ’s Wholesale Club offers a collaborative and inclusive environment where all team members can learn, grow and be their authentic selves. Together, we’re committed to providing outstanding service and convenience to our members, helping them save on the products and services they need for their families and homes.

The Benefits of working at BJ’s

• BJ’s pays weekly
• Generous time off programs to support busy lifestyles*
  o Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
• Benefit plans for your changing needs*
  o Three medical plans**, Health Reimbursement Account (HRA), Health Savings Account (HSA), two dental plans, flexible spending

*eligibility requirements vary by position
**medical plans vary by location

As a Lead Site Reliability Engineer, you will be responsible for designing, building, monitoring, and continuously improving our ecommerce platform's infrastructure and processes. Leveraging your expertise in observability tools such as New Relic, Scalyr/Splunk, bash scripts, and Python scripts, you will play a pivotal role in ensuring the reliability and performance of our Java microservices-based architecture.

Key Responsibilities:

• Design and manage Java-based microservices, bash scripts, Redis, and high-availability designs, while strictly adhering to Site Reliability Engineering (SRE) principles.
• Thrive in high-pressure environments, working swiftly and reliably to maintain system integrity and meet service level objectives (SLOs) and service level indicators (SLIs).
• Proactively identify and address potential issues before they impact operations, utilizing observability tools like New Relic, Scalyr/Splunk, bash scripts, and Python scripts.
• Lead initiatives to enhance current systems and implement innovative solutions in collaboration with a fast-paced, mission-driven team, focusing on the implementation of SRE best practices.
• Conduct thorough root-cause analyses for production incidents and generate high-quality RCA reports, leveraging SRE methodologies to prevent recurrence.
• Apply software engineering principles to rectify operational challenges and optimize system performance, with a specific focus on implementing SRE-driven solutions.
• Ensure the availability, latency, performance, efficiency, and security of our infrastructure, adhering rigorously to SRE principles and best practices.
• Design and maintain robust production monitoring systems to ensure timely detection and resolution of issues, following SRE guidelines for effective monitoring and alerting.
• Utilize a diverse array of tools to troubleshoot performance and stability issues effectively, employing SRE methodologies to identify and mitigate bottlenecks.
• Evaluate and enhance application and environment security measures, integrating SRE-driven security practices into the development and deployment pipelines.
• Provide support for globally distributed, multi-cloud (public and/or private) environments, implementing SRE strategies for resilience and fault tolerance.
• Automate repetitive tasks at scale to streamline operational workflows and enhance efficiency, focusing on the implementation of SRE-driven automation solutions.
• Adhere to change management processes during implementations and utilize version control for application infrastructure, following SRE principles for reliable and auditable change management.
• Foster an SRE mindset throughout the organization, promoting collaboration and shared responsibility for reliability and performance.

Qualifications:

• Bachelor's Degree in Computer Science or related field, or foreign equivalent.
• Demonstrated curiosity and self-drive to tackle complex challenges and drive change in a diverse organizational landscape.
• Excellent written and verbal communication skills, with the ability to effectively communicate with engineering management, developers, and leadership.
• Proven ability to adapt to new technologies and learn quickly.
• Minimum of 5 years of experience in Site Reliability Engineering (SRE) or related roles.

Job Conditions:

• Collaborate within a diverse and global team environment.
• Participate in cross-training with other team members across different regions.
• Rotate in an on-call schedule as required to ensure 24/7 availability and support for critical systems.
In accordance with the Pay Transparency requirements, the following represents a good faith estimate of the compensation range for this position. At BJ’s Wholesale Club, we carefully consider a wide range of non-discriminatory factors when determining salary. Actual salaries will vary depending on factors including but not limited to location, education, experience, and qualifications. The pay range for this position starts at $109,000.00.

About Us

At BJ's Wholesale Club, we’re focused on delivering unbeatable value and outstanding service to our members and communities. Headquartered in Marlborough, Massachusetts, BJ's Wholesale Club is a leading operator of membership warehouse clubs in the Eastern United States. Currently operating more than 235 clubs, over 165 BJ's Gas locations and eight distribution centers, we were the first retailer to introduce the warehouse club concept in the northeastern United States. Providing a curated assortment of grocery, general merchandise, gasoline and ancillary services, BJ's offers a differentiated shopping experience that is further enhanced by our omnichannel capabilities.




  • Jersey City, United States BCforward Full time

    Job Title: Site Reliability Engineer (AWS) (SRE) Type: W2 (Strictly No C2C and no sponsorship available) Location: Jersey City or Plano or Delaware (Hybrid) Duration: 9 Months Contract to hire Hybrid Model: 3 Days onsite 2 days remote a. Skillset: AWS, Big Data, Spark, Python, Shell / Perl Scripting, Control-M, Autosys, Grafana, AppDynamics, APICA b. Experience •...


  • Jersey City, United States Pinnacle Group, Inc. Full time

    W2 only - Preferred Citizen or Green Card Holder Contract to Hire Must Have: AWS Certification 7-8 years of experience and 2 years of AWS exp Tools: Grafana, DataDog Database: MySQL or Oracle - Unix, Linux, Shell Scripting, LAN, NFS - Python, Go Lang, Terraform, Jenkins - Docker, Kubernetes Site Reliability Engineer (AWS) (SRE) Roles and Responsibilities: • Design,...


  • Redwood City, California, United States C3 Full time

    We are looking for a Site Reliability Engineer to join our team at our HQ in Redwood City, CA. Responsibilities: Maximize system uptime and availability, ensuring functional and performance SLAs. Establish end-to-end monitoring and alerting on all critical aspects. Solve complex problems for critical services and build automation to prevent problem...


  • Salt Lake City, United States Global Channel Management Full time

    Site Reliability Engineer (SRE) Specialist 1 month ago Be among the first 25 applicants 4+ years of experience in a similar support engineering or site reliability engineering (SRE) role Proficiency in standard RPE with excellent written and verbal communication skills Strong knowledge of Linux operating systems Hands-on experience with Python for automating...


  • Jersey City, New Jersey, United States tapwage Full time

    There's nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Digital Private Markets / Aumni (A JP Morgan Chase Company), you will solve complex...


  • Foster City, United States Zoox Full time

    Zoox is looking for a site reliability engineer who will be responsible for measuring and maintaining the uptime of the many services critical to the development process for autonomous vehicles. In this role, you will be heavily involved in all phases of rolling out a service from designing systems that are easy to maintain and fault-tolerant through...


  • Kansas City, United States Gorilla Logic Full time

    Gorilla Logic: Mid-Level Site Reliability Engineer (SRE) Gorilla Logic provides nearshore Agile teams to Fortune 500 and SMB companies, bringing unparalleled expertise in the delivery of full-stack web, mobile, and enterprise applications. Our highly collaborative Agile Gorillas are uniquely qualified to implement complex software initiatives. With offices...


  • Salt Lake City, United States Goldman Sachs Full time

    MORE ABOUT THIS JOB: Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for the availability and reliability of our firm's most critical platform services, and ensures they meet the...


  • Jersey City, New Jersey, United States Devexperts Full time

    Company Description: Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you'll become a part of a company that fosters self-improvement and actively seeks...


  • Kansas City, United States Gorilla Logic Full time

    Gorilla Logic Overview Gorilla Logic provides nearshore Agile teams to Fortune 500 and SMB companies, bringing unparalleled expertise in the delivery of full-stack web, mobile, and enterprise applications. Our highly collaborative Agile Gorillas are uniquely qualified to implement complex software initiatives. With offices in the United States, Costa Rica,...


  • Oklahoma City, United States Ajsearch Full time

    Safety & Reliability Engineer About Us Our client is a transatlantic cutting-edge aerospace company developing solar powered aircraft solutions capable of achieving perpetual flight with heavy, and powerful payload capacity. Utilizing technology based upon the longest continuous renewably powered flight program in history, this fast-growing startup is...


  • Salt Lake City, United States bioMerieux SA Career Site - MULTI-LINGUAL Full time

    Description Position Summary: The reliability specialist is responsible for overseeing data monitoring, data reporting, investigations, and action plans for product and process performance within the instrument department. This role will also lead cross-functional technical teams through business-critical failure investigations and resolutions. Primary...


  • Arizona City, United States Brothers Consulting Full time

    Key Skills: • Experience with one or more Cloud Platforms (Azure, GCP) • Experience with Container technologies: Kubernetes, Docker, PKS, Azure Kubernetes Service (AKS) • 5+ years of experience in Site Reliability engineering • Experience setting up monitoring in applications and database. • Experience in ServiceNow, Jira,...


  • Jersey City, United States DevExperts Full time

    Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming...


  • Jersey City, United States DevExperts Full time

    Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you'll become a part of a company that fosters self-improvement and actively seeks out-of-the-box ideas. Our teams...


  • Jersey City, United States Devexperts Full time

    Devexperts has been working for nearly two decades consulting and developing for the financial industry. We solve complex technological challenges facing the most well-respected financial institutions worldwide. By becoming a part of Devexperts, you’ll become a part of a company that fosters self-improvement and actively seeks out-of-the-box ideas. Our...

