Site Reliability Engineer
4 days ago
Role Number: 200635067-3337
Summary
The Apple Services Engineering (ASE) SRE team is looking for Site Reliability Engineers with experience developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software engineering, systems engineering, and system administration practices to build and run large-scale, massively distributed, fault-tolerant systems. Our software ensures that Apple’s services are reliable, scalable, and secure, and we leverage both open source and home-grown technologies to provide managed data infrastructure services. You will help build next-generation search infrastructure and platform services, collaborating cross-functionally with various ASE teams, from store and commerce to search and recommendations. You’ll create platforms that can rapidly scale to serve personalized and non-personalized data with very low latencies. You should be someone who is not afraid to question assumptions, is a standout colleague under tight deadlines, and can take on problems with elegant technical solutions.
Description
The ASE SRE team develops applications and tooling that are safe, reliable, scalable, and fast. This work requires an innovative spirit and an extraordinary degree of care in solving difficult engineering problems. Team members contribute to all major components of the Redis deployment infrastructure, including maintenance automation, the backup service application, monitoring and alerting tooling and dashboards, and deployment architecture, with a focus on stability, performance, and scaling.
Success in this role requires expertise in several of the following:
- Understanding of core SRE concepts: monitoring, alerting, and incident management.
- Performance engineering (design concepts, profile-guided optimization).
- Service management across Kubernetes, bare metal, and virtualized (EC2) platforms.
This role also requires excellent communication and a high degree of customer focus when engaging with internal platform customers. Because the team is distributed, the ability to work effectively with colleagues based in other locations is also essential; experience in this area is a plus. Prior experience with development or maintenance of distributed databases or storage systems is recommended.
Apple values craftsmanship, and performance is a key ingredient. Come join us at Apple Services Engineering and help us deliver services and applications that are fluid and responsive. You will collaborate with engineers from across Apple to define metrics, set targets, uncover optimization opportunities, define quality guardrails, and ship a product or service that will delight our customers. This role is for engineers who enjoy deep technical engineering that spans large cross-organizational projects. Your willingness to learn and implement new technologies will contribute to the continuous evolution of our organization.
Minimum Qualifications
- Bachelor's Degree in Computer Science, an engineering-related field, or equivalent related experience.
- 3-5 years in a Site Reliability Engineering focused role.
- Proficient in one or more of the following programming languages: Java, Go (Golang), Python.
- Understanding of core SRE concepts: monitoring, alerting, and incident management.
- Understanding of database concepts (consistency models, isolation levels, crash and recovery semantics).
- Performance engineering (design concepts, profile-guided optimization).
- Service management across Kubernetes, bare metal, and virtualized (EC2) platforms.
- Datacenter architecture (networking topologies, host placement strategies, and failure modes); design of multi-datacenter systems; failure domains; and wide-area networking.
Preferred Qualifications
- Demonstrated expertise in developing distributed systems or storage engines, or in performance engineering.
- Experience developing critical internet services and/or platform infrastructure.
- Experience managing services on Kubernetes.
- Experience with EC2, EBS, and Terraform.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
-
Site Reliability Engineer
2 weeks ago
Seattle, WA, United States Kaav Inc. Full time
Who we are: We are a yoga-inspired technical apparel company up to big things. The practice and philosophy of yoga informs our overall purpose to elevate the world through the power of practice. We are proud to be a growing global company with locations all around the world, from Vancouver to Shanghai, and places in between. We owe our success to our...
-
Site Reliability Engineer, Python
2 days ago
Seattle, WA, United States Next Step Systems LTD Full time
Site Reliability Engineer, Python, Seattle, WA. There are 5 openings available for the Site Reliability Engineer position. These will be onsite opportunities in either Los Angeles, CA; New York City, NY; or Seattle, WA. Responsibilities: - Manage cloud infrastructure, provide resource allocation, system upgrades, user access control, etc. - Perform deep...
-
Site Reliability Engineer, Python
1 week ago
Seattle, WA, United States Next Step Systems LTD Full time
Site Reliability Engineer, Python, Seattle, WA. There are 5 openings available for the Site Reliability Engineer position. These will be onsite opportunities in either Los Angeles, CA; New York City, NY; or Seattle, WA. Responsibilities: - Manage cloud infrastructure, provide resource allocation, system upgrades, user access control, etc. - Perform deep...
-
Senior Site Reliability Engineer
1 week ago
Seattle, WA, United States Dat Services Inc Full time
About DAT: DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year by deploying a suite of software solutions to millions of customers every day - customers who depend on...
-
Senior Site Reliability Engineer
2 days ago
Seattle, WA, United States Dat Services Inc Full time
About DAT: DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year by deploying a suite of software solutions to millions of customers every day - customers who depend on...
-
Senior Site Reliability Engineer
2 weeks ago
Seattle, WA, United States Zillow Group Full time
About the team: The SRE team at Zillow Group empowers product teams to efficiently run "Zillow 2.0" services by reducing human error, focusing on automation, and providing deep insight into application behavior and health. By applying software engineering principles to infrastructure and operations, the team creates and manages scalable, reliable distributed...
-
Site Reliability Engineer-Remote
2 weeks ago
Seattle, WA, United States Georgia IT Inc Full time
Site Reliability Engineer. Location: Remote; must be willing to work PST; high preference for someone local to Seattle. Duration: 12 months. Rate: DOE. US Citizens and Green cards & GC-EAD Only. No Third-party C2C available for this job. 8-10+ years of Site Reliability / DevOps Engineering. Experienced with PowerShell Scripting. Should have extensive...
-
Site Reliability Engineer-Remote
1 week ago
Seattle, WA, United States Georgia IT Inc Full time
Site Reliability Engineer. Location: Remote; must be willing to work PST; high preference for someone local to Seattle. Duration: 12 months. Rate: DOE. US Citizens and Green cards & GC-EAD Only. No Third-party C2C available for this job. 8-10+ years of Site Reliability / DevOps Engineering. Experienced with PowerShell Scripting. Should have extensive...