Site Reliability Engineer II

3 months ago


Plano, United States Bank of America Full time
Job Description:

About us:

At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. Responsible Growth is how we run our company and how we deliver for our clients, teammates, communities and shareholders every day.

One of the keys to driving Responsible Growth is being a great place to work for our teammates around the world. We're devoted to being a diverse and inclusive workplace for everyone. We hire individuals with a broad range of backgrounds and experiences and invest heavily in our teammates and their families by offering competitive benefits to support their physical, emotional, and financial well-being.

Bank of America believes both in the importance of working together and offering flexibility to our employees. We use a multi-faceted approach for flexibility, depending on the various roles in our organization.

Working at Bank of America will give you a great career with opportunities to learn, grow and make an impact, along with the power to make a difference. Join us!

Job Description:

This job is responsible for partnering with engineering and technology teams to implement measures as prescribed by lead/senior SRE engineers. Key responsibilities include ensuring appropriate instrumentation, tooling, ticketing, alerting and on call routines are in place for key services, identifying root causes of issues through production triage efforts, and suggesting code enhancements to technology teams to automate services and improve reliability and efficiency. Job expectations include using software development skills to improve efficiency and to address gaps in reliability.

Overview:

The Site Reliability Engineer II (Hadoop Admin) role supports NextGen Platforms built around Big Data technologies (AI/ML, Hadoop, Jupyter Notebook, Spark, Kafka, Impala, HBase, Docker containers, Ansible and many more). It requires experience in cluster management of vendor-based Hadoop and Data Science (AI/ML) products such as C3, Cloudera, Talend, Trifacta, Selerity, ELK, KPMG Ignite, etc. The analyst is involved in the full life cycle of an application and is part of an agile development process. The role requires the ability to interact, develop, engineer, and communicate collaboratively at the highest technical levels with clients, development teams, vendors and other partners. The following section is intended to serve as a general guideline for each relative dimension of project complexity, responsibility, and education/experience within this role.

Responsibilities:
  • Develops and maintains reliability scripts, tools and libraries, leverages them for common instrumentation, automation, and operational needs, and uses them when mentoring Site Reliability Engineer (SRE) resources on reliability practices and established tools/capabilities
  • Collaborates with Development and Infrastructure teams to understand technical solutions and implement monitoring capabilities outlined in the application and system monitoring designs put forward by the SRE Lead
  • Partners to implement code changes to make use of common reliability libraries and tools and helps Application Production Services and Application Development teammates understand how to use them
  • Identifies vulnerabilities and opportunities for reliability improvement, such as investigating low level error rates and 'noise' in monitoring, and defines solutions to reduce manual support effort and/or improve system reliability
  • Engages as a subject matter expert in major incident triage efforts, failure scenario modelling, and diagnosis of root causes with the Problem Manager for major incident/problem management investigations
  • Participates regularly in an on-call rotation with Production Support teammates to learn more about reliability issues affecting their portfolio
  • Works on complex, major or highly visible tasks in support of multiple projects that require multiple areas of expertise
  • Provides subject matter expertise in managing Hadoop and Data Science Platform operations, with a focus on Cloudera Hadoop, Jupyter Notebook, OpenShift, and Docker container cluster management and administration
  • Integrates solutions with other applications and platforms outside the framework
  • Manages platform operations across all environments, including upgrades, bug fixes, deployments, metrics/monitoring for resolution and forecasting, disaster recovery, and incident/problem/capacity management
  • Serves as a liaison between client partners and vendors in coordination with project managers to provide technical solutions that address user needs
Required Qualifications:
  • 5+ years of combined Technology experience in an Enterprise environment
  • Experience with Docker, OpenShift/Kubernetes, Database (SQL, Cassandra, Postgres), Jupyter Notebook
  • Strong technical knowledge: Unix/Linux; Database (Sybase/SQL/Oracle), Java, Python, Perl, Shell scripting, Infrastructure.
  • Experience in Monitoring & Alerting, and Job Scheduling Systems
  • Comfortable with frequent, incremental code testing and deployment
  • Strong grasp of automation / DevOps tools - Ansible, Jenkins, SVN, Bitbucket
Desired Qualifications:
  • Bachelor's degree or equivalent, preferably in a technical or engineering discipline
  • Cloudera Big Data Stack, Hadoop, Impala, Hive, Spark, Kafka, HBase
Skills:
  • Analytical Thinking
  • Automation
  • Collaboration
  • Production Support
  • Result Orientation
  • Application Development
  • Architecture
  • Influence
  • Project Management
  • Solution Design
  • Adaptability
  • DevOps Practices
  • Risk Management
  • Solution Delivery Process
  • Stakeholder Management


Shift:
1st shift (United States of America)

Hours Per Week:
40