Senior Site Reliability Engineer

5 days ago


Agoura Hills, CA, United States Lakeview Loan Servicing Full time
Overview

Lakeview IT is passionate about delivering high-quality products and services to our customers. Our technology operations team is committed to ensuring reliable, scalable, and high-performing services for our clients. We are looking for a talented and motivated Site Reliability Engineer to join our dynamic team and help us continue to build and maintain a world-class infrastructure.

The Sr. Site Reliability Engineer at Lakeview is responsible for ensuring the availability, performance, and scalability of the company's critical systems. They will lead the design and implementation of infrastructure solutions, focusing on automation, monitoring, and high reliability. This role involves optimizing system performance, managing incident responses, and conducting post-mortems to drive continuous improvement. The engineer will also work closely with engineering, development, and operations teams to create and enforce best practices, establish service-level objectives, and ensure seamless deployment processes. Additionally, mentoring junior team members and driving key architectural decisions are essential aspects of the role to build a culture of reliability and operational excellence.

The salary range for the role is $130,000 to $150,000, with an annual bonus. The position can be 100% remote, but candidates located in the Agoura Hills, CA area are expected to work a hybrid schedule.

Responsibilities

  • Proactively identify and resolve incidents before they impact operations.
  • Monitor all systems and infrastructure for the highest level of availability.
  • Perform routine maintenance tasks, including monitoring, patching, and backups.
  • Respond to incidents and outages in a timely and effective manner.
  • Collaborate with other teams to diagnose and resolve complex issues.
  • Document incident details and implement corrective actions to prevent recurrence.
  • Document processes, configurations, and troubleshooting procedures.
  • Diagnose and resolve application performance problems or system outages.
  • Play the role of Incident Manager during outages.
  • Resolve complex hardware and software issues, and work with vendors when necessary.
  • Optimize system performance and resource utilization on-prem and in the cloud.
  • Develop and maintain automation scripts to streamline repetitive tasks.
  • Utilize scripting languages (e.g., PowerShell, Python) to automate system administration.
  • Implement configuration management tools to ensure consistency and repeatability.
  • Create and maintain comprehensive documentation of IT processes and procedures.
  • Lead the design, development, and implementation of reliable, scalable infrastructure systems.
  • Mentor junior SREs, providing guidance on best practices and technical issues.
  • Architect and execute disaster recovery and high-availability plans.
  • Drive incident management processes, ensuring swift and effective resolution of critical issues.
  • Optimize system performance through proactive monitoring, tuning, and capacity planning.
  • Lead root cause analysis and post-mortem discussions to identify long-term fixes.
  • Develop and maintain complex automation scripts to enhance system reliability.
  • Influence reliability improvements within the engineering organization, promoting a culture of observability and resilience.
  • Champion the adoption of new tools and technologies that enhance system stability and deployment efficiency.
  • Communicate effectively with stakeholders and executive leadership regarding system status, incidents, and upcoming reliability initiatives.

Qualifications

  • Strong understanding of IT infrastructure components, including servers, networks, and storage.
  • Knowledge of scripting languages (e.g., PowerShell, Python).
  • Knowledge of networking concepts and protocols (e.g., TCP/IP, DNS, DHCP).
  • Experience with IT service management frameworks.
  • Experience with cloud platforms such as AWS and Azure.
  • Experience with virtualization technologies such as Azure VDI and AWS Workspaces.
  • Experience with monitoring and alerting tools (e.g., New Relic, Datadog).
  • Excellent problem-solving and analytical skills.
  • Strong communication and interpersonal skills.
  • Extensive expertise in the Windows operating system.


Physical Demands and Work Environment

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

While performing the duties of this job, the employee is regularly required to sit and use hands to handle, touch, or feel objects, tools, or controls. The employee is frequently required to talk and hear. The noise level in the work environment is usually moderate. The employee is occasionally required to stand, walk, and reach with hands and arms. The employee is rarely required to stoop, kneel, crouch, or crawl. The employee must regularly lift and/or move up to 50 pounds. Specific vision abilities required by this job include close vision, color vision, and the ability to adjust focus.

EEOC

Lakeview is an Equal Employment Opportunity employer. All aspects of consideration for employment and employment with the Company are governed on the basis of merit, competence, and qualifications without regard to race, color, religion, sex, national origin, age, disability, veteran status, sexual orientation, or any other category protected by federal, state, or local law.
