Lead Site Reliability Engineer

2 weeks ago


San Diego, California, United States Kforce Full time

Kforce has a client that is seeking a Lead Site Reliability Engineer (AWS/Azure) in San Diego, CA. The Lead Site Reliability Engineer is responsible for driving the organizational reliability strategy and conducting resiliency design reviews to ensure that the reliability, scalability, and performance of the company's software systems and applications meet organizational service level objectives (SLOs) and error budgets. The role leads a team of Site Reliability Engineers in designing, implementing, and maintaining the infrastructure and tools that support the company's platforms, and in improving its monitoring, automation, and deployment processes. It involves strategic planning, technical leadership, and collaboration with stakeholders including the company's Product Delivery, Data Services, DevOps, DataOps, and Infrastructure teams to support organizational goals.
Responsibilities:
Drive the organizational reliability strategy.
Conduct resiliency design reviews.
Lead a team of Site Reliability Engineers.
Design, implement, and maintain infrastructure and tools.
Improve monitoring, automation, and deployment processes.
Collaborate with various stakeholders to support organizational goals.
Requirements:
Bachelor's degree, 8+ years of demonstrated work experience, or an equivalent combination of related training and experience, with at least three years in a leadership role.
Proven leadership experience and ability to manage a team.
Experienced in cloud-based hosting solutions (AWS, Azure).
Experienced with Cloud server environments (AWS, Azure).
Experienced in Agile software development best practices utilizing Continuous Integration & Delivery Pipelines.
Proven experience with large-scale software implementation.
Deep knowledge of software deployment, versioning (Git), and release management processes.
Deep knowledge of infrastructure design, implementation, and support.
Ability to collaborate with stakeholders to define RPO/RTO for the company's system footprint.
Expert in Cloud-based redundancy, high availability, and reliability strategies.
Expert in reliability, scalability, and performance optimization.
Expert in Linux/Unix and Windows systems administration.
Strong Linux and Windows Administration & scripting skills.
Solid Database Administration skills (MySQL, MariaDB, RDS, SQL Server, and Azure Storage services).
Deep knowledge of current methodologies in high performance operations.
Proficient at automated provisioning and configuration management.
Excellent written and verbal communication skills.
Highly adaptable and capable of working in a fast-paced environment.
Ability to create DR strategies and execute DR drills.
Ability to interact with external customers and staff members.
Compensation:
The pay range is $150,000.00/yr - $170,000.00/yr. Actual pay will be based on skills and experience.
Benefits:
We offer comprehensive benefits including medical/dental/vision insurance, HSA, FSA, 401(k), and life, disability & AD&D insurance to eligible employees.
Kforce is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, protected veteran status, or disability status.



  • Site Reliability Lead

    22 hours ago


    San Diego, California, United States Kforce Full time

    About the Role: The Lead Site Reliability Engineer will be responsible for leading a team of engineers in designing, implementing, and maintaining infrastructure and tools. The role involves strategic planning, technical leadership, and collaboration with various stakeholders. Key Responsibilities: Developing the organizational reliability strategy. Conducting...


  • San Diego, California, United States Talent Software Services Full time

    Site Reliability Engineer Job Summary: Talent Software Services is in search of a Site Reliability Engineer for a contract position in San Diego, CA. The opportunity will be six months with a strong chance for a long-term extension. Position Su...


  • San Diego, California, United States Ursus Inc Full time

    JOB TITLE: Site Reliability Engineer LOCATION: San Diego, CA DURATION: 6 months PAY RANGE: $70.00 - $80.69/hr TOP 3 SKILLS: 3+ years of professional Site Reliability experience operating at scale in a high-paced environment. 4+ years of hands-on experience with AWS, Kubernetes, Infrastructure as Code, monitoring, and alerting. Experience with building Kubernetes...


  • San Diego, California, United States Ursus Inc Full time

    JOB TITLE: Site Reliability Engineer LOCATION: San Diego, CA DURATION: 6 months PAY RANGE: $70.00 - $80.69/hr TOP 3 SKILLS: 3+ years of professional Site Reliability experience operating at scale in a high-paced environment. 4+ years of hands-on experience with AWS, Kubernetes, Infrastructure as Code, monitoring, and alerting. Experience with building...


  • San Diego, California, United States Apple Full time

    Summary: The Video Computer Vision organization is working on exciting technologies for future Apple products. Our focus is on ML-based solutions around real-time image and video. We have contributed to the FaceID and FaceKit projects in the past and, more recently, the new LIDAR iPad sensor. We are looking for the right Site Reliability Engineer to help us take...


  • San Diego, California, United States Talent Software Services Full time

    Talent Software Services is seeking a Site Reliability Engineer for a six-month contract position in San Diego, CA. The opportunity has a strong chance for long-term extension. As a Site Reliability Engineer, you will play a crucial role in ensuring the stability and performance of our software systems. Your primary focus will be on designing, implementing, and...


  • San Diego, California, United States Motion Recruitment Full time

    Our client, a global media/entertainment company, is looking for a Site Reliability Engineer to join their team in San Diego, CA. Pay: $83/hour. Hybrid. This is a 6-month contract open to conversion or extension. As the Site Reliability Engineer you will be part of the CICD and Cloud SRE team supporting...


  • San Diego, California, United States Talent Software Services Full time

    Site Reliability Engineer Job Summary: Talent Software Services is in search of a Site Reliability Engineer for a contract position in San Diego, CA. The opportunity will be six months with a strong chance for a long-term...


  • San Diego, California, United States Innova software Services Inc Full time

    Job Title: Site Reliability Engineer (SRE) Location: San Diego, CA Rate: $60-65/hr Job Description: We are seeking an experienced and proactive Site Reliability Engineer (SRE) to join our team in San Diego, CA. This role will be responsible for ensuring high availability, reliability, and recoverability of platforms, leveraging...


  • San Diego, California, United States Innova software Services Inc Full time

    Job Title: Site Reliability Engineer (SRE) Location: San Diego, CA Rate: $60-65/hr Job Description: We are seeking an experienced and proactive Site Reliability Engineer (SRE) to join our team in San Diego, CA. This role will be responsible for ensuring high availability, reliability, and recoverability of platforms, leveraging best practices and...


  • San Ramon, California, United States Litmus7 Full time

    Job Description: At Litmus7, we are seeking a highly skilled Site Reliability Engineering Lead to join our team. This role will be responsible for ensuring the reliability, performance, and availability of our applications. Key Responsibilities: - Monitor, automate, and improve the reliability of our applications - Collaborate with cross-functional teams to...


  • San Diego, California, United States REQ Solutions Full time

    Job Title: Site Reliability Engineer Location: San Diego, CA 92127 Contract: 6 months Shift: 8:00 AM - 5:00 PM (Monday - Friday) Note: Candidates who can work on a W2 basis without any sponsorship are encouraged to apply. Responsibilities include: Contributes to a team of Engineers to deliver and support highly available, self-service, CI/CD...


  • San Diego, California, United States REQ Solutions Full time

    Job Title: Site Reliability Engineer Contract: 6 months Shift: 8:00 AM - 5:00 PM (Monday - Friday) Note: Candidates who can work on a W2 basis without any sponsorship are encouraged to apply. Base pay range: $80.00/hr - $85.00/hr Responsibilities include: Contributes to a team of Engineers to deliver and support highly available, self-service, CI/CD...


  • San Diego, California, United States ACL Digital Full time

    W2 contract; only looking for candidates based in California. Job Title: Site Reliability Engineer Location: San Diego, CA (open to other locations in CA) Duration: 6+ months (possible extension) Job Description: Overview: As a member of the CICD and Cloud Reliability team you'll work at the heart of the client Network to ensure a high-performing platform that...


  • San Diego, California, United States Leidos Holding Full time

    Leidos is seeking a talented Engineering Site Lead to join our diverse team and create unique solutions to complex problems for our Space Sensor Department in San Diego. With offices across the United States engaging in the defense, space, cyber and commercial fields, Leidos provides responsive, cost-effective engineering, scientific and IT solutions. The...


  • San Diego, California, United States Apple Full time

    The Atlassian Services Site Reliability Engineer (SRE) role resides within the Software Delivery organization, which is at the core of the Apple software release process. This role is responsible for applying SRE practices in maintaining Atlassian se...


  • San Diego, California, United States CoStar Group Full time

    Job Summary: We are seeking a Senior Site Reliability Engineer to join our team. The successful candidate will have an opportunity to make fundamental contributions on improving the training and inference efficiency of deep learning models. About Us: Our research organization is committed to advancing the field of artificial intelligence. We are passionate about...