Site Reliability Engineer

4 weeks ago


Los Angeles, United States Saxon Global Full time

Looking for a highly motivated Site Reliability Engineer who is capable of building and running large-scale, massively distributed, fault-tolerant systems. The individual will work with teams across the organization to ensure the reliability of core services and keep an eye on capacity and performance.

This role supports a migration from AWS to GCP. Knowledge of and experience with GCP are mandatory; knowledge of AWS is nice to have.

• Responsible for blameless postmortems and proactive identification of potential outages, both of which factor into iterative improvement.
• Experience in designing and deploying multi-data-center, large-scale web applications.
• Work closely with dev and ops teams to build highly available, cost-effective systems.
• Create new tools and scripts designed for auto-remediation of incidents.
• Design/Implementation of Big Data technologies, including Hadoop, MongoDB, Kafka, RabbitMQ, Zookeeper, Spark, ELK, etc.
• Responsible for establishing end-to-end monitoring and alerting on all critical aspects to ensure SLAs and get proactive notifications of possible issues for all systems.
• Design platforms for extremely high uptime metrics.
• Works well independently and requires little or no supervision.
• Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot.
• Fully understand the application and its microservice interactions.
• Design/Implementation of containers/applications in scalable HA/DR multi-tier cloud environments, including new system design, documentation, implementation, and deployment.
• Participate in a 24x7 on-call rotation.

Job Requirements (7+ years of experience in the following areas):

• Experience in providing L4 technical support for production 24x7.
• Strong experience in production support and operations.
• Design/Implementation of network and presentation tier technologies, including F5, Apache, Nginx, etc.
• Experience in Performance Testing/Tuning/Monitoring, maximizing system uptime and availability, ensuring functional and performance SLAs.
• Experience with monitoring Application/Infrastructure Performance, and availability.
• Automation experience with build/deployment, software configuration, continuous integration, continuous delivery, and release engineering tasks in JavaEE/C++ environments.
• Experience in automating manual processes using Python, Ruby, Unix Shell (bash, ksh), Perl, Ant, etc.
• Installing, Configuring, Administering, and Tuning of JavaEE Application Servers/Containers like Tomcat, WebSphere, etc.
• Installing, maintaining, and administering software on Unix, Linux, and Windows servers.
• Experience with Web service technologies, including REST, SOAP, JSON, XML.
• Experience with Cloud Platforms and virtualization Technologies.
• Deploying and automating infrastructure/applications in cloud environment using Chef, RPM, etc.
• Working closely with Development, QA, Product Management, and Production Ops teams to make sure product releases ship on time and with quality.
• Hands-on experience configuring and administering SCM (Git, SVN), build (CMake, Makefiles, Maven), CI (Jenkins), and CD automation tools.
• Experience with database (RDBMS, NoSQL) technologies is a plus.
• Experience with Performance Testing is a plus.
• Configuring and maintaining SDLC Environments.
• Experience in Agile Methodologies and processes.
• Strong automation and problem-solving skills, and the ability to follow through to completion.
• Demonstrated leadership skills through a variety of activities, including leading or mentoring technical staff.
• Strong verbal/written communication skills.
• Participate in a 24x7 on-call rotation.

Required Skills: Looking for an SRE who can assist with moving from AWS into GCP. GCP is a required skill; AWS is just a nice-to-have. Design/Implementation of Big Data technologies, including Hadoop, MongoDB, Kafka, RabbitMQ, Zookeeper, Spark, ELK, etc.
Background Check: Yes
Drug Screen: Yes
Candidate must be your W2 Employee: Yes
Exclusive to Apex: No
Face to face interview required: No
Candidate must be local: Yes
Candidate must be authorized to work without sponsorship: No
Interview times set: No



  • Los Angeles, United States OPEN MIND Technologies Full time

    Our client is looking for a Site Reliability Engineer who can serve their community of users and customers by working tirelessly to preserve free expression and choice, create limitless interactivity, and create a marketplace that enables the economic success of all its participants. If interested, kindly reply back with your updated resume and contact info....


  • Los Angeles, United States X (formerly Twitter) Full time

    Are you prepared to join the X team and help build the ultimate real-time information-sharing app, revolutionizing how people connect? At X, we're on a mission to become a trusted global digital public square, committed to minimal censorship within legal boundaries. Our goal is to empower every user to freely create and share ideas, fostering open public...


  • Los Angeles, United States developrec Full time

    SRE Lead/Manager | San Diego, CA | Full-time Role Overview: As the Engineering Manager for Site Reliability, you'll lead the charge in transitioning to cloud-based solutions while ensuring the stability of our existing systems for our rapidly growing user base, currently standing at around one million. You'll spearhead our cloud infrastructure strategy...


  • Los Angeles, United States Adastra replica Full time

    Our client is looking for an experienced Site Reliability Engineer to design, operate, maintain, and scale mission-critical infrastructure and products. Products include (but are not limited to) automated Hardware-In-The-Loop (HITL) data analysis systems, vehicle configuration sign-off tools, continuous integration systems for...


  • Los Angeles, United States eTek IT Services, Inc. Full time

    Overview: The Site Reliability Engineer will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications, ultimately contributing to the seamless operations of our systems. This role is vital in maintaining a high level of uptime and system efficiency, enhancing the overall...


  • Los Angeles, California, United States Beacon Hill Full time

    NOTE: This is a hybrid position in Los Angeles. Linux Site Reliability Engineer (SRE): If you're passionate about Linux, cloud infrastructure, and contributing to open-source projects, you've come to the right place. One of our clients in the Los Angeles area is looking for a Linux Site Reliability Engineer (SRE) with the following skills and experience: Job Overview: As a Linux...

  • Reliability Engineer

    4 weeks ago


    Los Angeles, United States Kindeva Drug Delivery Company Full time

    The Reliability Engineer will lead the site's Asset Reliability agenda, effectively promoting analytical problem-solving techniques and structured reliability improvement processes. We have an immediate opening for a Reliability Engineer at Kindeva’s Northridge, CA manufacturing facility. The Reliability Engineer will lead the site's Asset Reliability...


  • Los Angeles, California, United States Fox Corporation Full time

    OVERVIEW OF THE COMPANY: Fox Corporation. Under the FOX banner, we produce and distribute content through some of the world's leading and most valued brands, including: FOX News Media, FOX Sports, FOX Entertainment, FOX Television Stations and Tubi Media Group. We empower a diverse range of creators to imagine and develop culturally significant content,...


  • Los Angeles, United States forhyre.com Full time

    Forhyre is looking for engineers who can bring unique perspectives and innovative ideas to all areas of development and are interested in continuing to improve our platform through the ever-changing technology landscape. To be successful in this role: You'll have the opportunity to design and implement major infrastructure...

  • Uncapped Games

    2 weeks ago


    Los Angeles, United States LightSpeed Studios Full time

    Uncapped Games - Site Reliability Engineer. Remote type: Hybrid. Locations: US-Los Angeles, US-California-Remote. Time type: Full time. Posted: 30+ days ago. Job requisition ID: R100321. Work Mode: Onsite Responsibilities: Description: Seeking the opportunity to build a game from scratch and...


  • Los Angeles, California, United States Motion Recruitment Partners LLC Full time

    Senior Site Reliability Engineer / Dynatrace. Los Angeles, California. Open to Remote. Contract. $60/hr - $85/hr. Job Description: A Fortune 500 consulting company is looking for SREs with subject matter expertise in Dynatrace. You'll design, install, and configure Dynatrace onto Kubernetes clusters for various enterprise clients. This role is remote with travel to...



  • Reliability Engineer

    4 weeks ago


    Los Angeles, United States GRN Hudson (Global Recruiters Network) Full time

    Reliability Engineer: Supporting manufacturing operations to achieve superior equipment efficiency with minimal downtime. The ideal candidate will play a crucial role in ensuring the reliability and performance of our industrial equipment through continuous improvement initiatives and strategic maintenance practices. Responsibilities: - Develop and implement...

  • Uncapped Games

    4 weeks ago


    Los Angeles, United States Tencent Full time

    Work Mode: Onsite Responsibilities: Description: Seeking the opportunity to build a game from scratch and create a global impact? Uncapped Games is seeking a talented and enthusiastic Site Reliability Engineer to join our new AAA team. The ideal candidate is well-versed in operating and improving distributed online systems, eager to automate & optimize...


  • Los Angeles, United States Avesta Computer Services Full time

    Title - Live Video Streaming Senior Principal/Senior Site Reliability Engineer. Location - Los Angeles, California, United States OR Tempe, Arizona, United States - Hybrid. Type - Full-time Permanent. JOB DESCRIPTION: Our client stands as a beacon of innovation, crafting world-class, large-scale digital products that redefine the entertainment experience. We're on...

