Reliability and Performance Expert

6 days ago


Austin, Texas, United States Emerald Cloud Lab Full time
Job Description

Overview

The Emerald Cloud Laboratory (ECL) is seeking a highly skilled Site Reliability Engineer - Cloud Infrastructure Specialist to join our team. As a key member of our Infrastructure and Tools team, you will be responsible for ensuring the security, reliability, and capacity of our cloud infrastructure and software applications.

Key Responsibilities

  • Design and Develop: Design and develop processes and tools to automate and audit all aspects of development and production environments and databases for the ECL cloud application back-end.
  • Infrastructure Management: Continuously improve our set of in-house Go and Python facilities for automating container builds and deployments, and our bespoke Wolfram Language-based automated unit testing environment.
  • Cloud Administration: Develop applications related to laboratory systems, and automate the provisioning and deployment of Wolfram Enterprise Private Cloud instances for integration with our customer-facing Command Center application.
  • Domain-Specific Language Infrastructure: Develop domain-specific language infrastructure in support of ECL's Symbolic Lab Language.
  • Release Planning and Coordination: Coordinate with and advise other teams to plan and execute releases of application upgrades, new services, and migrations to new architectures or infrastructures, without degradation or interruption of service.

Requirements

  • Coding Skills: Proficient in developing and automating solutions to enhance infrastructure reliability and performance.
  • Kubernetes Expertise: Expertise in container orchestration, ensuring seamless deployment, scaling, and management of microservices.
  • Cloud Administration: Skilled in cloud infrastructure management, with hands-on experience in AWS, including EC2, S3, IAM, and more. Familiar with other cloud platforms as well.
  • Observability Setup: Adept at implementing comprehensive observability solutions, including distributed tracing with OpenTelemetry (OTel), creating actionable dashboards, and setting up effective monitoring and alerting systems.
  • Cloud Networking & Security: Deep understanding of cloud networking and security concepts, including VPCs, VPNs, subnets, and security best practices.
  • DevOps Practices: Proficient in CI/CD tools, with experience in automating deployment pipelines and seamlessly deploying applications to Kubernetes from source control management (SCM).
  • SLI/SLO Metrics: Proven track record of setting up Service Level Indicators (SLIs), Service Level Objectives (SLOs), and other key performance metrics to ensure service reliability and performance.

About ECL

The Emerald Cloud Laboratory (ECL) enables life scientists to move out of the lab, and to conduct research entirely from a computer. Stepping away from manual completion of experiments at the bench, scientists on the ECL leverage the remote, automated execution of all standard biology and chemistry experiments in Emerald's industrial lab facilities, working within a software platform for all stages of research workflows, from experimental design to data analysis.

What We Offer

At Emerald Cloud Lab, we are committed to pioneering the future of scientific research by providing an innovative, cloud-based laboratory environment. We believe in the power of collaboration, diversity, and the continuous pursuit of knowledge to drive groundbreaking discoveries. If you are passionate about reshaping the landscape of scientific experimentation and eager to contribute to a culture of excellence and innovation, we invite you to join us.



  • Austin, Texas, United States Visa Full time

    Job Description. About the Role: We are seeking a highly skilled Hadoop System Engineer to join our team at Visa. As a key member of our reliability engineering team, you will be responsible for ensuring the stability and performance of our big data platforms. Key Responsibilities: Single Window Support: Provide expert-level support for Hadoop-related issues,...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled and experienced Site Reliability Engineer to join our dynamic team at Apple. As a key member of our B2B team, you will play a critical role in ensuring the reliability and performance of our systems and services. Key Responsibilities: Implement and maintain best-in-class DevOps practices to ensure the scalability,...


  • Austin, Texas, United States Visa Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer - Cloud Infrastructure Expert to join our team at Visa. As a key member of our cloud infrastructure team, you will be responsible for ensuring the security, availability, and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain scalable and...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled and experienced Site Reliability Engineer to join our team at Apple. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our enterprise technology systems. Key Responsibilities: Implement and maintain best-in-class DevOps practices to ensure the...


  • Austin, Texas, United States Thales Full time

    About the Role: Thales is seeking an experienced Site Reliability Engineer to join our team. As a key member of our cloud infrastructure team, you will be responsible for designing, developing, and maintaining our cloud-based solutions. Key Responsibilities: Collaborate with project managers and service delivery managers to analyze traffic trends and assess the...


  • Austin, Texas, United States Visa Full time

    About the Role: We are seeking a highly skilled Database Reliability Engineer to join our team at Visa. As a key member of our Staff Site Reliability Engineering team, you will be responsible for delivering operationally excellent open system database infrastructure. Key Responsibilities: Design, engineer, and build reliable, scalable, secure, available, and...


  • Austin, Texas, United States Visa Full time

    About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to join our team at Visa. As a key member of our Staff Site Reliability Engineering team, you will be responsible for delivering operationally excellent open system database infrastructure. Key Responsibilities: Database Reliability Engineering: Primarily responsible for database...


  • Austin, Texas, United States The University of Texas at Austin Full time

    Position Overview: The Texas Institute for Electronics (TIE) is in search of a seasoned and forward-thinking Director of Reliability Engineering to spearhead our reliability initiatives in cutting-edge semiconductor technologies, particularly focusing on 3DHI and advanced packaging. Key Responsibilities: Oversee and direct the reliability engineering team,...


  • Austin, Texas, United States The University of Texas at Austin Full time

    About the Role: We are seeking a highly experienced and innovative Director of Reliability Engineering to lead our reliability efforts in advanced semiconductor technologies, with a focus on 3DHI and advanced packaging. Key Responsibilities: Lead and manage the reliability engineering team, overseeing all aspects of reliability testing, analysis, and...


  • Austin, Texas, United States The University of Texas at Austin Full time

    About the Role: The University of Texas at Austin is seeking a highly experienced and innovative Director of Reliability Engineering to lead our reliability efforts in advanced semiconductor technologies, with a focus on 3DHI and advanced packaging. Key Responsibilities: Lead and manage the reliability engineering team, overseeing all aspects of reliability...


  • Austin, Texas, United States The University of Texas at Austin Full time

    Position Overview: The Texas Institute for Electronics (TIE) is in search of a seasoned and forward-thinking Director of Reliability Engineering to spearhead our reliability initiatives in cutting-edge semiconductor technologies, particularly focusing on 3DHI and advanced packaging. Key Responsibilities: Oversee and direct the reliability engineering team,...


  • Austin, Texas, United States The University of Texas at Austin Full time

    Position Overview: The Texas Institute for Electronics (TIE) is on the lookout for a seasoned and innovative Director of Reliability Engineering to spearhead our reliability initiatives in advanced semiconductor technologies, particularly focusing on 3DHI and advanced packaging. Key Responsibilities: Oversee and direct the reliability engineering team, managing...


  • Austin, Texas, United States Visa Full time

    Job Description. Company Overview: Visa is a leading global payments technology company, facilitating over 259 billion transactions annually across 200 countries and territories. Our mission is to connect the world through innovative, convenient, and secure payments solutions, empowering individuals, businesses, and economies to thrive. Job Summary: We are seeking...


  • Austin, Texas, United States Electric Reliability Council of Texas Full time

    Job Overview: At the Electric Reliability Council of Texas (ERCOT), we pride ourselves on fostering a diverse and innovative work environment that empowers our employees to collaborate in shaping the future of the Texas power grid and wholesale market. Our commitment to diversity and inclusion is integral to our core values of accountability, leadership,...


  • Austin, Texas, United States Amazon Full time

    As a Senior Reliability Engineer, you will play a pivotal role in ensuring the operational excellence of Amazon's data centers globally. Your expertise will be essential in conducting thorough evaluations and providing insightful feedback on the design aspects across various engineering disciplines. In addition to your design responsibilities, you will...


  • Austin, Texas, United States Electric Reliability Council of Texas Full time

    Company Overview and Job Role: The Electric Reliability Council of Texas (ERCOT) is at the forefront of managing the Texas power grid and wholesale market. We leverage advanced technologies and resources to ensure a reliable electricity supply. Our workplace is characterized by diversity and inclusivity, fostering an environment where innovation and...


  • Austin, Texas, United States Electric Reliability Council of Texas Full time

    Job Overview: The Electric Reliability Council of Texas (ERCOT) offers a vibrant and collaborative work environment where employees can contribute to the advancement of the Texas power grid and wholesale market through innovative technologies and resources. We are dedicated to cultivating a diverse and inclusive workforce that embodies our core values of...


  • Austin, Texas, United States Liquibase Full time

    Job Overview. Company Introduction: Liquibase stands at the forefront of Database DevOps, empowering development teams worldwide to streamline their software delivery processes through automated database management. With over 100 million downloads, our innovative solutions are designed to enhance efficiency and governance in database operations. Position...


  • Austin, Texas, United States Electric Reliability Council of Texas Full time

    Job Overview: At the Electric Reliability Council of Texas (ERCOT), we foster a diverse and dynamic work environment that empowers our employees to collaborate in shaping the future of the Texas power grid and wholesale market. Our commitment to diversity and inclusion is fundamental to our corporate values, which include accountability, leadership,...