Senior Site Reliability Engineer

2 weeks ago


Raleigh, United States Cisco Full time

Who We Are

Today’s business environment is more than results-oriented: it is a period of disruption driven by the pandemic, global business change, and internal process complexity. To focus on simplicity and the best customer experience, we need great talent and the right skill sets to be successful. This is now a mantra for our Cisco leadership team and for us. The Digital Enterprise Solutions team is changing the way we run Cisco’s operations by improving the power of technology, the best of business processes, and outstanding data insights. Together, we will reinvent the Cisco experience, show the world how to reinvent applications, and demonstrate the future of the Internet to showcase the power of Cisco: our people, products, processes, systems, and data. Please join us and make this journey together.

Be part of IT Cloud Strategy efforts, passionate about the modernization and re-engineering of on-prem file storage, block storage, and backup infrastructure with distributed storage, including distributed block storage, object storage, file storage, and SAN storage.

Who You'll Work With

You’ll be part of the Storage SRE team, which is passionate about automating and modernizing Cisco storage infrastructure (NetApp, EMC, Ceph, SolidFire) as part of the IT transformation to a cloud strategy. The team comprises architects, design engineers, SMEs, and software developers organized in agile teams with daily or weekly scrum meetings, using JIRA to track all activities with a Scrum or Kanban approach.

Who You Are

You are an articulate communicator with effective verbal and written skills, capable of engaging successfully with team members and stakeholders alike. Your strong analytical and problem-solving abilities set you apart, enabling you to offer creative alternatives, conduct in-depth root cause analyses, and present thoughtful proposals. With a customer solutions mindset, you bring a blend of strong interpersonal skills and leadership qualities to the table, always ready to guide and inspire those around you. As a Site Reliability Engineer, you have demonstrated leadership by spearheading projects that achieved significant infrastructure enhancements. Your extensive experience in cloud planning, migration, and implementation with platforms like AWS, GCP, and Azure is a testament to your expertise. In addition, you possess a robust background in DevOps practices as well as a deep understanding of object storage and archival solutions. You are a forward-thinking leader who has consistently delivered large-scale infrastructure improvements and is poised to bring your comprehensive skill set to our team.

What You’ll Do

As a Site Reliability Engineer, you will play a crucial role in managing and optimizing SRE operations for an array of storage solutions including distributed block storage, object storage, file storage, and SAN Storage. You will focus on automating processes, leveraging a DevOps and SRE mindset to foster user self-service capabilities and system self-recovery features, and will actively participate in agile infrastructure software development meetings. You have a solid understanding of Linux operations and expertise in storage systems running on both virtual and physical infrastructures. You are a subject matter expert for the implementation of continuous integration and continuous delivery (CI/CD) pipelines for storage infrastructure as code. You will manage revision control and automate testing, provisioning, and deprovisioning workflows within the storage domain, ensuring smooth operations across both public and private cloud platforms. You have the ability to handle complex provisioning tasks, advanced maintenance, data replication, disaster recovery, data migration, and the creation of comprehensive documentation for Storage and Backup environments.

You will interact with storage vendors to configure and fine-tune storage systems, troubleshoot issues, and determine the root cause to prevent future problems. In terms of Capacity Management, you will automate the monitoring and reporting of storage usage and trends, facilitating accurate capacity planning. You will take part in ongoing projects and evaluate new technologies related to storage solutions and migrations, code upgrades, and data protection and management activities. You will be responsible for establishing guidelines and best practices for storage standards on public cloud platforms, including AWS, Azure, and GCP Storage services such as S3, EBS, FSx, EFS, and other related offerings, ensuring a secure, efficient, and scalable storage infrastructure.

Minimum Qualifications:

  • 7+ years of experience supporting enterprise storage

  • Experience implementing, optimizing, and managing global enterprise-scale storage, including enterprise-scale SAN, NAS, and software-defined storage

  • Enterprise backup solutions and replication technologies, software-defined storage, and public cloud platforms and protocols

  • Experience with cloud and on-prem enterprise storage environments such as EMC VMAX/PowerMax, Data Domain, SRDF, SnapMirror, NetApp, Veritas NetBackup, VMware, EC2, EBS, FSx, or S3, S3 Glacier, or S3 Glacier Deep Archive

  • Experience with cloud-based software development tools and methodologies such as Git, CI/CD, CodeDeploy, CodePipeline, Jenkins, build automation, and testing

  • Experience in programming languages such as Python, PowerShell, Ruby, GoLang, or Bash

  • Experience with infrastructure and configuration management tools such as RedHat, CentOS/Windows, VMware, Puppet, or Ansible

Preferred Qualifications:

Bachelor’s degree in STEM

Why Cisco

#WeAreCisco. We are all unique, but collectively we bring our talents to work as a team, to develop innovative technology and power a more inclusive, digital future for everyone. How do we do it? Well, for starters – with people like you.

Nearly every internet connection around the world touches Cisco. We’re the Internet’s optimists. Our technology makes sure the data traveling at light speed across connections does so securely, yet it’s not what we make but what we make happen which marks us out. We’re helping those who work in the health service to connect with patients and each other; schools, colleges, and universities to teach in even the most challenging of times. We’re helping businesses of all shapes and sizes to connect with their employees and customers in new ways, providing people with access to the digital skills they need and connecting the most remote parts of the world – whether through 5G, or otherwise.

We tackle whatever challenges come our way. We have each other’s backs, we recognize our accomplishments, and we grow together. We celebrate and support one another – from big and small things in life to big career moments. And giving back is in our DNA (we get 10 days off each year to do just that).



  • Raleigh, United States Envestnet Asset Management, Inc Full time

    Senior Site Reliability Engineer. Location: Raleigh. Time type: Full time. Posted 30+ days ago. Job requisition ID: Req 20.357. It's fun to work in a company where people truly BELIEVE in what they're doing! We're committed to bringing passion and customer focus to the business. Job Description...


  • Raleigh, United States Veradigm® Full time

    Welcome to Veradigm! Our Mission is to be the most trusted provider of innovative solutions that empower all stakeholders across the healthcare continuum to deliver world-class outcomes. Our Vision is a Connected Community of Health that spans continents and borders. With the largest community of clients in healthcare, Veradigm is able to deliver an...


  • Raleigh, United States Bandwidth Full time

    Site Reliability Engineer (Raleigh, NC) Duties: Work closely with leadership and internal partners to ensure that software meets security, SLA, performance, and capacity requirements. Set up and maintain monitoring tools and systems to detect issues using Datadog Monitors and Alert using OpsGenie....


  • Raleigh, United States Bandwidth Inc. Full time

    Site Reliability Engineer (Raleigh, NC) Duties: Work closely with leadership and internal partners to ensure that software meets security, SLA, performance, and capacity requirements. Set up and maintain monitoring tools and systems to detect issues using Datadog Monitors and Alert using OpsGenie. Configure Datadog and Grafana alerts and Application Health...


  • Raleigh, United States Bandwidth Recruitment Full time

    Site Reliability Engineer (Raleigh, NC) Duties: Work closely with leadership and internal partners to ensure that software meets security, SLA, performance, and capacity requirements. Set up and maintain monitoring tools and systems to detect issues using Datadog Monitors and Alert using OpsGenie. Configure Datadog and Grafana alerts and Application Health...


  • Raleigh, North Carolina, United States Bandwidth Inc. Full time

    Site Reliability Engineer (Raleigh, NC) Duties: Work closely with leadership and internal partners to ensure that software meets security, SLA, performance, and capacity requirements. Set up and maintain monitoring tools and systems to detect issues using Datadog Monitors and Alert using OpsGenie. Configure Datadog and Grafana alerts and Application Health...


  • Raleigh, United States Bandwidth Full time

    Job Description: Site Reliability Engineer (Raleigh, NC) Duties: Work closely with leadership and internal partners to ensure that software meets security, SLA, performance, and capacity requirements. Set up and maintain monitoring tools and systems to detect issues using Datadog Monitors and Alert using OpsGenie. Configure Datadog and Grafana...


  • Raleigh, United States Allscripts Full time

    Welcome to Veradigm, where our Mission is transforming health, insightfully. Join the Veradigm team and help solve many of today’s healthcare challenges being addressed by biopharma, health plans, healthcare providers, health technology partners, and the patients they serve. At Veradigm, our primary focus is on harnessing the power of research, analytics,...


  • Raleigh, United States Veradigm Full time

    Welcome to Veradigm, where our Mission is transforming health, insightfully. Join the Veradigm team and help solve many of today's healthcare challenges being addressed by biopharma, health plans, healthcare providers, health technology partners, and the patients they serve. At Veradigm, our primary focus is on harnessing the power of research, analytics,...


  • Raleigh, United States Oracle Full time

    Oracle Senior Site Reliability Developer, Raleigh, North Carolina. Customers rely on Oracle Cloud Infrastructure (OCI) to power their business as they tackle some of the world’s biggest challenges. We’re looking for Senior Site Reliability Developers/Engineers who would be responsible for Advanced Operations (AO) and critical issues of...


  • Raleigh, United States Qualys Full time

    Come work at a place where innovation and teamwork come together to support the most exciting missions in the world! Site Reliability Engineer, Cloud Platform ** The successful applicant will be performing work in FedRAMP environments, and therefore, must be a U.S. Person (i.e. U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This...


  • Raleigh, United States Raleigh Founded Full time

    The Role Kaleido is growing rapidly and seeking a Site Reliability Engineer with a passion for enterprise blockchain software. The successful candidate will be a highly skilled DevOps Engineer who is also passionate about systems stability, security and operational efficiency. If this describes you and you are able to thrive in a fast paced, high-growth, and...


  • Raleigh, United States DSJ Global Full time

    Job Title: Reliability Engineer Industry: Chemicals/Food & Beverage Location : North Carolina DSJ Global is currently partnered with a Fortune 500 manufacturing company based out of North Carolina who are looking for their next Reliability Engineer. As the Reliability Engineer, you will be responsible for the development, direction, supervision, and day to...


  • Raleigh, United States Biogen Full time

    The Sr. Reliability Engineer I applies Reliability Engineering methodologies to optimize design requirements and performance of critical assets across the site. Originates and develops analysis methods for determining reliability of components, equipment and processes. Acquires data and analyzes the data. Prepares and communicates information to define...


  • Raleigh, United States Cisco Full time

    Location: RTP, North Carolina, US Area of Interest Job Type Professional Cloud and Data Center, Software Development Job Id 1421649 Who We Are Today’s business environment is more than that – it’s a period of disruption between the pandemic, global business change and internal process complexity. For us to focus on simplicity and the best customer...

  • Software Engineer

    6 days ago


    Raleigh, United States Celonis GmbH Full time

    The Team: Site Reliability Engineering. The Role: You will be part of a highly technical, collaborative and creative team, with a focus on SRE & Software Engineering. We design, write and deliver software, improve availability, scalability and efficiency of our product. We constantly improve our monitoring, metrics and KPIs as well as define and implement...

  • Technical Leader

    1 week ago


    Raleigh, United States Cisco Full time

    *** The successful applicant will be performing work on US Government classified environments, and therefore, must be a U.S. Person (i.e., U.S. citizen, U.S. national, lawful permanent resident, asylee, or refugee). This position may also perform work that the U.S. government has specified can only be performed by a U.S. citizen on U.S. soil. *** Who We Are...