Lead Site Reliability Engineer - San Diego, California - Platform Science
2 months ago
About Us
At Platform Science, we are dedicated to revolutionizing the transportation industry through innovative IoT solutions. Established in 2015, our open platform collaborates with forward-thinking fleets, application developers, vehicle manufacturers, and equipment providers to enhance supply chain efficiency worldwide.
Our workforce is a vibrant and diverse collective that values creativity and the exchange of ideas. We prioritize hiring individuals with varied experiences and viewpoints to cultivate a culture that fosters growth and innovation. Empathy and thoughtful actions are at the core of our operations, and we tackle challenges with resilience and creativity, promoting transparency and teamwork.
Position Overview
We are seeking a qualified Lead Site Reliability Engineer to strengthen our operations. This role is pivotal in addressing operational challenges and supporting the development teams that run essential business applications in production. The primary objective is to ensure reliability across all production services and to enable development teams to measure the reliability of their own services.
The SRE team plays a crucial role in managing our cloud-based platform, which runs on cloud providers such as AWS, Azure, and GCP. Our applications are built on containers and serverless architectures. If you are eager to explore new technologies and work on a diverse range of products, including mobile applications, hardware, websites, messaging queues, and serverless pipelines, this role is tailored for you.
Key Responsibilities
- Enhance and develop Continuous Integration/Continuous Deployment (CI/CD) pipelines while refining release management processes and tools.
- Maintain Helm charts to facilitate application deployment and management.
- Implement standardized observability solutions to assist development teams in managing their applications effectively.
- Champion reliability initiatives, drive uptime goals, and mentor peers in SRE best practices.
- Conduct thorough Production Readiness Reviews, collaborating with teams to establish Service Level Indicators and Service Level Objectives (SLIs/SLOs) to ensure dependable services.
- Design and create software solutions to resolve operational challenges, enhancing system stability and reliability.
- Provide on-call support, offering expert assistance to development teams for mission-critical applications in production.
- Enhance application and system resilience through chaos engineering techniques.
Qualifications
- Minimum of 5 years of hands-on experience in SRE or Platform Engineering roles.
- Proven expertise (2+ years) with automation tools such as Jenkins, ArgoCD, or similar technologies.
- Experience with Kubernetes (2+ years), Helm, and Docker in production settings.
- Strong understanding of software development lifecycle (SDLC) principles, CI/CD pipelines, and test-driven development.
- Proficient in AWS, including EKS, IAM, autoscaling, networking, and load balancing in production environments.
- Skilled in programming languages such as Python, Bash, Node.js, and/or Go.
- Familiarity with distributed tracing methodologies and observability tools like Prometheus, ELK, or Datadog.
- Commitment to documentation and knowledge-sharing within the team and organization.
- Experience in training and mentoring engineers effectively.
- Proven ability to optimize performance and manage costs in cloud environments.
- Solid understanding of SLI/SLO concepts and adherence to SRE best practices.
- Bachelor's degree in Computer Science or a related field.
Benefits Overview
Platform Science offers a comprehensive benefits package for regular, full-time employees, including:
- Medical, dental, and vision insurance.
- Short-term and long-term disability insurance.
- Life and AD&D insurance.
- 401(k) retirement plan.
- Paid vacation, sick leave, and holidays.
- Six weeks of paid parental leave.