Cloud Reliability and Monitoring Specialist

5 days ago


Plano, Texas, United States Futran Tech Solutions Pvt. Ltd. Full time
Job Title: CTP Reliability and Monitoring Engineer

Futran Tech Solutions Pvt. Ltd. is seeking a skilled CTP Reliability and Monitoring Engineer to join our team. As a key member of our Platform Engineering team, you will be responsible for ensuring the availability, performance, and reliability of our cloud-based infrastructure and services.

Key Responsibilities:
  • Design, implement, and manage robust monitoring and alerting systems to proactively identify issues and enable timely incident response.
  • Work closely with the CTP Platform Engineering and Development teams to optimize services and maintain service uptime.
  • Develop and maintain comprehensive monitoring solutions for cloud-based services and applications.
  • Configure monitoring tools and systems to collect relevant metrics, logs, and traces.
  • Create custom monitoring dashboards and reports, using Datadog or other tools, to provide real-time insights into system performance and health.
  • Continuously monitor the cloud infrastructure's performance and capacity, anticipating and addressing potential scalability issues.
  • Proactively suggest and implement improvements to enhance the system's reliability, resilience, and fault tolerance.
  • Work on automating tasks to streamline operational processes and reduce manual intervention.
  • Collaborate with cross-functional teams to investigate and resolve critical incidents, ensuring minimal impact on end-users.
  • Work with the Problem Management team to complete post-mortem analyses of incidents, identify root causes, and implement preventive measures.
Ideal Qualifications:
  • 3+ years' experience working with cloud platforms and services (AWS, Azure, GCP, etc.) in a production environment.
  • Solid understanding of monitoring and logging tools, such as Prometheus, Grafana, ELK stack, Splunk, etc.
  • Experience with infrastructure as code (IaC) tools, like Terraform, CloudFormation, or Ansible.
  • Strong scripting and automation skills (e.g., Python, Bash) to facilitate operational tasks.
  • Knowledge of containerization technologies (Docker, Kubernetes) and microservices architecture.
  • Familiarity with DevOps practices and Agile methodologies.


  • Plano, Texas, United States Futran Tech Solutions Pvt. Ltd. Full time

    Job Title: Cloud Reliability Engineer. Futran Tech Solutions Pvt. Ltd. is seeking a skilled Cloud Reliability Engineer to join our team. As a Cloud Reliability Engineer, you will be responsible for ensuring the reliability, resilience, and fault tolerance of our cloud-based services and applications. Key Responsibilities: Design and implement comprehensive...


  • Plano, Texas, United States VDart Full time

    Job Title: CTP Reliability and Monitoring Engineer. At VDart, we are seeking a highly skilled CTP Reliability and Monitoring Engineer to join our team. As a key member of our Platform Engineering team, you will be responsible for ensuring the availability, performance, and reliability of our cloud-based infrastructure and services. Key Responsibilities: Design,...


  • Plano, Texas, United States Dexian - DISYS Full time

    We are seeking a Senior Site Reliability Engineer to join our team at Dexian - DISYS. As a key member of our Incident Management team, you will be responsible for establishing frameworks, best practices, and scope management as we transition Incident Management into a Site Reliability Engineering team. You will partner with Platform Engineering, Development,...


  • Plano, Texas, United States Dexian - DISYS Full time

    Senior Site Reliability Engineer. Dexian is a leading provider of staffing, IT, and workforce solutions with over 12,000 employees and 70 locations worldwide. We are seeking a Senior Site Reliability Engineer to join our team. Key Responsibilities: Establish frameworks, best practices, and scope management for Incident Management as we transition into a Site...


  • Plano, Texas, United States Dexian - DISYS Full time

    Senior Site Reliability Engineer. Dexian - DISYS is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our Incident Management team, you will be responsible for establishing frameworks, best practices, and scope management as we transition Incident Management into a Site Reliability Engineering team. Key...


  • Plano, Texas, United States Dexian Full time

    Senior Site Reliability Engineer. We are seeking a highly skilled Senior Site Reliability Engineer to join our Incident Management team. As a key member of our team, you will be responsible for establishing frameworks, best practices, and scope management as we transition Incident Management into a Site Reliability Engineering team. Key Responsibilities: Partner...


  • Plano, Texas, United States Cognizant North America Full time

    About Cognizant's Digital Engineering Practice: Cognizant Digital Engineering is a small, cross-functional team that builds higher quality software faster. Our team consists of a Product Manager, an Architect, Full-Stack Developers, UI/UX designers, and Big Data analysts. We work together to ideate and develop innovative cloud-based solutions following a...


  • Plano, Texas, United States Amaze Systems Inc. Full time

    Job Overview: We are seeking a highly skilled Cloud Reliability Engineer with expertise in Azure to join our team at Amaze Systems Inc. The ideal candidate will have a strong background in cloud infrastructure, Azure administration, and automation. They will be responsible for designing, implementing, and maintaining high-availability, scalable, and secure...


  • Plano, Texas, United States Bank of America Full time

    Job Description: At Bank of America, we are committed to delivering exceptional customer experiences through the power of technology. As a Senior Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud-based systems. We are seeking a highly skilled and experienced engineer to join our team. The ideal...


  • Plano, Texas, United States Bowman Williams Full time

    We are a Cloud Services Provider that prides itself on bringing cutting-edge technology products and services with top-notch advisors to a wide range of markets, including financial, legal, educational, industrial, and medical. We are seeking a Cloud Infrastructure Specialist with a background in the MSP industry who can hit the ground running. Your key...


  • Plano, Texas, United States Tyler Technologies Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our Technical and Cloud Services team. As a key member of our team, you will be responsible for ensuring the reliability, scalability, and performance of our infrastructure while driving automation and efficiency in our development process. Key Responsibilities: Design, build, and...


  • Plano, Texas, United States Tyler Technologies Full time

    Job Title: Site Reliability Engineer, Technical and Cloud Services. We are seeking a highly skilled Site Reliability Engineer to join our Technical and Cloud Services team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our infrastructure while driving automation and efficiency in our...


  • Plano, Texas, United States Aloden, Inc. Full time

    Job Description: Aloden, Inc. is seeking a skilled Cloud Infrastructure Specialist to join our team. The ideal candidate will have experience in designing, implementing, and maintaining cloud-based solutions using AWS, Azure, and/or GCP. Key Responsibilities: Design and implement cloud-based solutions using AWS, Azure, and/or GCP. Monitor and optimize cloud...


  • Plano, Texas, United States Next Ventures Full time

    Cloud Engineer Job Description: A leading financial institution is seeking an experienced AWS Cloud Engineer to design and implement scalable cloud architectures, ensuring optimal performance and reliability of their platforms. Key Responsibilities: Own the end-to-end lifecycle of AWS-based platforms, ensuring optimal performance, reliability, and...


  • Plano, Texas, United States Apex Systems Full time

    Job Title: Site Reliability Engineer. Apex Systems is seeking a highly skilled Site Reliability Engineer to join our team in Plano, TX. This is a 40% remote opportunity with 2 days of remote work per week. We are looking for a talented individual with a strong background in cloud infrastructure, DevOps, and SRE to support one of our largest commercial clients...


  • Plano, Texas, United States Hispanic Technology Executive Council Full time

    About Us: At Hispanic Technology Executive Council, we are driven by a shared purpose to harness the power of technology to drive innovation and growth. Our team is dedicated to creating a workplace that is inclusive, diverse, and supportive of our employees' well-being. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team. As a...


  • Plano, Texas, United States TEK NINJAS Full time

    Job Title: Python Cloud Security Specialist. Role Overview: TEK NINJAS is seeking a highly skilled Python Cloud Security Specialist to join our team. As a key member of our cloud security team, you will be responsible for ensuring the security and compliance of our cloud infrastructure. Key Responsibilities: Implement and manage Cloud Custodian, a...

  • AWS Cloud Engineer

    1 month ago


    Plano, Texas, United States Diverse Lynx Full time

    Cloud Infrastructure Specialist. We are seeking a highly skilled Cloud Infrastructure Specialist to join our team at Diverse Lynx LLC. As a key member of our IT team, you will be responsible for designing, implementing, and managing cloud and on-premises environments using AWS. Key Responsibilities: Design and implement scalable and secure cloud infrastructure...