Site Reliability Engineer

3 weeks ago


Miami, United States Magaya Corporation Full time
Job Description

JOB TITLE: Site Reliability Engineer

FLSA CLASSIFICATION: Exempt

DEPARTMENT: Engineering/IT

REPORTS TO: Director of IT

EFFECTIVE DATE: 5/29/2024

SUMMARY: The SRE is responsible for monitoring applications and responding to events, incidents, and changes originating from internal or vendor applications. The SRE investigates incidents and problems to determine their root causes, and analyzes existing IT processes and automates them using Infrastructure as Code (IaC). The role reports to the Director of IT, works to establish operational metrics for our AWS and Azure environments, and participates in our on-call rotation.


DUTIES AND RESPONSIBILITIES

  • Work on Magaya’s platform services to design, develop, and improve services, platforms and processes that result in improved end-to-end reliability and maintainability for all our services.
  • Create and drive adoption of tools that help deliver insights and automation to simplify the complex world of large-scale services.
  • Create the infrastructure to support the deployment of Magaya Supply Chain in AWS.
  • Leverage new technology paradigms (e.g., serverless, containers, microservices).
  • Influence infrastructure architecture by sharing your application development expertise.
  • Be a mentor for design reviews, code, and test cases.
  • Quickly adapt, apply, and train on new technologies, tools, methods, and processes from both internal and external sources.
  • Actively look for opportunities to improve the system's availability and performance by applying the learnings from monitoring and observation.
  • Handle incidents and outage emergencies promptly to reduce Mean-Time-To-Recovery (MTTR). Investigate root causes and collaborate with the team to implement preventive measures.
  • Be familiar with ITSM methodology for incident response.
  • Participate in our on-call rotation to support our systems.

SUPERVISORY RESPONSIBILITIES:

  • This job has no supervisory responsibilities.

QUALIFICATIONS:

  • Bachelor’s degree in Computer Science or a related field (MS, BE, etc.), or 5+ years of professional experience in software development or site reliability.
  • 5+ years of large-scale software development or application engineering with recent coding experience in one or more of the following languages: Java, JavaScript, C/C++, C#, Node.js, Python, or Rust.
  • Experience in designing and building infrastructure to support applications using container and serverless technologies.
  • Familiarity with cloud computing platforms (AWS, Azure) and containerization technologies.
  • Experience in designing and building infrastructure to support traditional 3-tier applications.
  • Proficiency in scripting languages such as Bash, Python, and PowerShell.
  • Experience working with operating systems (Linux, Windows).
  • Experience supporting infrastructure for large multi-service applications.
  • Experience running Infrastructure as Code (IaC) with Chef, Ansible, Terraform, and Kubernetes.
  • Experience working with CI/CD in microservices architectures.
  • Experience with observability/monitoring tools: Datadog, New Relic, Istio.
  • Experience working with configuration management tools: Kubernetes.
  • Experience developing environment documentation and support procedures.

PREFERRED QUALIFICATIONS

  • Understanding of enterprise IT operational capabilities – examples include Change, Release, Incident Management, infrastructure management or applications management.
  • Experience architecting highly available systems that use load balancing and horizontal scaling.
  • Experience with Agile software development and DevOps practices such as Infrastructure as Code (IaC), Continuous Integration and automated deployment.
  • Experience in adopting chaos engineering techniques to validate system resiliency.
  • Experience with distributed services, asynchronous messaging architecture, eventual consistency, and telemetry, plus high-scale experience managing services on top of cloud environments such as AWS and Azure.

COMPETENCIES:

  • Problem Solving--Identifies and resolves problems in a timely manner; Gathers and analyzes information skillfully; Develops alternative solutions; Works well in group problem solving situations; Uses reason even when dealing with emotional topics.
  • Technical Skills--Assesses own strengths and weaknesses; Pursues training and development opportunities; Strives to continuously build knowledge and skills; Shares expertise with others.
  • Teamwork--Balances team and individual responsibilities; Exhibits objectivity and openness to others' views; Gives and welcomes feedback; Contributes to building a positive team spirit; Puts success of team above own interests; Able to build morale and group commitments to goals and objectives; Supports everyone's efforts to succeed.
  • Quality Management--Demonstrates accuracy and thoroughness; Looks for ways to improve and promote quality; Applies feedback to improve performance; Monitors own work to ensure quality.
  • Innovation--Displays original thinking and creativity; Meets challenges with resourcefulness; Generates suggestions for improving work; Develops innovative approaches and ideas; Presents ideas and information in a manner that gets others' attention.

PHYSICAL DEMANDS AND WORK ENVIRONMENT:

  • Occasionally/Frequently/Continually required to stand
  • Occasionally/Frequently/Continually required to walk
  • Occasionally/Frequently/Continually required to sit
  • Occasionally/Frequently/Continually required to utilize hand and finger dexterity
  • Occasionally/Frequently/Continually required to climb, balance, bend, stoop, kneel or crawl
  • Occasionally/Frequently/Continually required to talk or hear
  • Occasionally/Frequently/Continually utilize visual acuity to operate equipment, read technical information, and/or use a keyboard
  • Occasionally/Frequently/Continually required to lift/push/carry items less than 25 pounds/ up to 25 pounds/ up to 50 pounds/ more than 50 pounds.
  • Occasionally/Frequently/Continually exposure to wet and/or humid conditions (non-weather)
  • Occasionally/Frequently/Continually loud noise (examples: datacenter, server room)

What You’ll Love About Magaya

Great Company Culture: Our team is comprised of people who are talented at what they do and have a passion to help our clients.

Work that Stays at Work: Genuine work/life balance at Magaya.

Rest and Relaxation: Flexible vacation and up to 40 hours of sick leave, depending on the state in which you work.

Health Benefits: Full comprehensive benefits program with medical, dental, vision, and supplemental coverage offered to all new employees and dependents, plus short-term disability, long-term disability, and company-paid life insurance.

Prepare for the Future: Competitive 401k plan with a company match.

Learning and Development: From our very own Magaya Software to coding, internal subject matter experts run classes designed to get you up to speed.



