Site Reliability Engineer

1 day ago


Seattle, Washington, United States Prodigy Resources Full time
About Prodigy Resources

We're a leading fintech innovator, driving financial success and empowering our clients with state-of-the-art solutions. Our Greenfield project is a transformative opportunity to shape the future of fintech, and we're seeking a talented Site Reliability Engineer to join our team.

Key Responsibilities
  • System Reliability: Ensure the availability, performance, and scalability of our backend services and APIs by developing and implementing reliability engineering practices and tools.
  • Incident Management: Respond to and resolve incidents, perform root cause analysis, and implement preventive measures to avoid future issues.
  • Monitoring & Metrics: Set up and manage monitoring, logging, and alerting systems using tools such as AWS CloudWatch, ensuring comprehensive visibility into system performance.
  • Automation: Automate operational tasks and processes to improve efficiency and reduce manual intervention, and develop and maintain CI/CD pipelines.
  • Capacity Planning: Plan capacity and tune performance so that systems handle increasing loads and remain resilient.
  • Collaboration: Collaborate with development teams to design, deploy, and manage infrastructure and applications, providing guidance on reliability best practices and performance optimizations.
  • Documentation: Create and maintain documentation for systems, processes, and incident response procedures.
  • Continuous Improvement: Stay current with industry trends and emerging technologies to continuously improve our reliability and performance practices.
Requirements
  • Minimum of 5 years of experience in Site Reliability Engineering or a related field.
  • Proficiency in Python, experience with the Django and Flask frameworks, and hands-on experience with AWS services (EC2, S3, RDS, Lambda, etc.).
  • Strong understanding of SRE principles, including SLAs, SLOs, and error budgets, and experience with incident management and disaster recovery.
  • Experience with monitoring and observability tools (e.g., AWS CloudWatch, Prometheus, Grafana).
  • Proven experience in automating tasks and managing CI/CD pipelines.
  • Excellent analytical and troubleshooting skills, with the ability to resolve complex technical issues.
  • Strong verbal and written communication skills, with the ability to convey technical concepts to both technical and non-technical audiences.
  • Experience in fintech or a similarly regulated environment is highly desirable.
Why Prodigy Resources?
  • Innovative Projects: Contribute to a transformative Greenfield project that will shape the future of fintech.
  • Dynamic Environment: Engage in a fast-paced, collaborative environment focused on continuous improvement and innovation.
  • Growth Opportunities: Access ongoing learning and career development opportunities.
  • Competitive Compensation: Enjoy a competitive salary and comprehensive benefits package.

