Site Reliability Engineer

3 weeks ago


Austin, United States Terminal Industries Full time

About Us

Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers, and personnel. These are the fundamental operating assets of commerce, and they represent the last great frontier of untapped data. In the process, Terminal will address many industry-wide pain points, including compliance, manual processes, equipment location, phantom costs, and labor inefficiencies. Ultimately, Terminal will become the central nervous system for the yard, seamlessly connecting all data sources to support an extensive range of essential functions.

Overview

Our world-class vision engineering team has built an engine that can process the movement of trucks and containers in real time. It’s now time to unlock the potential of that engine by building SaaS applications that leverage the vision engine to transform the logistics industry. As part of Terminal’s Site Reliability Engineering team, you will help build out the network and IoT infrastructure required to deploy and operate our camera technology at scale.

We are seeking an experienced Site Reliability Engineer with a minimum of 5 years of relevant experience to join our team. As a founding member of our Engineering team, you will play a pivotal role in architecting and developing cutting-edge solutions. The ideal candidate has expertise in AWS and is proficient in operations and in running software at scale. They will have a deep understanding of event-driven technologies, hands-on experience with modern data stores, a commitment to implementing observability, and a passion for operational excellence, and they will take ownership of production quality, reliability, and security.

Responsibilities

  • Oversee the deployment, management, and maintenance of IoT devices, including camera systems and sensors. Ensure devices are properly integrated, configured, and secured within the network.
  • Manage firmware updates and patches for IoT devices, ensuring that all devices are up to date and secure. Develop and implement strategies for efficient deployment of updates.
  • Implement mechanisms for collecting and processing data from IoT devices. Ensure data integrity, availability, and confidentiality.
  • Troubleshoot and resolve connectivity issues related to IoT devices.
  • Manage integration between IoT devices and cloud infrastructure, ensuring seamless data flow and system interoperability.
  • Design and implement solutions to scale IoT deployments effectively. Monitor device performance and system health to ensure high reliability and availability.
  • Design, build, and operate infrastructure using Infrastructure as Code (IaC) tools like Terraform and Ansible. Develop and maintain infrastructure automation to ensure scalability and reliability.
  • Define and implement best practices for continuous deployment of software and services using CI/CD tools such as GitHub Actions. Automate deployment processes to streamline operations.
  • Lead incident response efforts, including diagnosis, resolution, and post-mortem analysis. Implement robust monitoring and alerting systems to ensure quick detection and resolution of issues.
  • Ensure that systems adhere to security best practices and regulatory compliance requirements. Implement security measures and conduct regular audits to safeguard production environments.

Requirements

  • Minimum of 5 years of experience in Site Reliability Engineering or a related role, with a proven track record of managing complex production environments.
  • Strong background in operating systems, networking, distributed systems, and database management.
  • Expertise in AWS cloud services and infrastructure management.
  • Hands-on experience with deploying, managing, and maintaining IoT devices and sensor systems.
  • Knowledge of IoT protocols (e.g., MQTT, CoAP) and device integration practices.
  • Experience in managing firmware updates and ensuring the security and functionality of IoT devices.
  • Proficiency in managing and troubleshooting connectivity issues in IoT environments, including wireless and wired communication protocols.
  • Experience with data collection and processing from IoT devices, including ensuring data quality and managing large volumes of data.
  • Demonstrated experience in incident response, production monitoring, and capacity planning.
  • Ability to handle high-pressure situations and ensure system reliability.

What We Offer

Joining the Terminal team means being part of a dynamic, innovative environment where your work directly impacts the future of logistics and the global supply chain. You will work closely with a team of experts passionate about operational excellence and technological innovation. We offer competitive salaries, a comprehensive benefits package, and opportunities for professional growth.



  • Austin, Texas, United States Apex Systems Full time

    Job Description Position: Site Reliability Engineer. Location: Remote. Duration: 1 year. Rate: $67/hr W-2. We are seeking a highly skilled Site Reliability Engineer to join our team at Apex Systems. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key...


  • Austin, Texas, United States Cape Henry Associates, Acquired by JANUS Research Group Full time

    Janus is looking for a seasoned Site Reliability Engineer / DevSecOps Developer to help grow our capability with our DoD clients. Develop Infrastructure as Code (IaC): designing, implementing, and maintaining infrastructure using IaC technologies (e.g., Terraform or similar), ensuring scalable, reliable, and efficient platforms. Collaborate with data and other...


  • Austin, United States JobRialto Full time

    Skills: 6+ years of experience in systems and platform operations and technology Experience with On Prem and Public Cloud - AWS, EKS Scripting languages like Python Linux Administration and Cloud, DevOps experience would be a plus Team As a member of the Site Reliability Engineering & Production Services team, you will work with other technology...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Apple. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and services. Key Responsibilities: Design, build, and maintain robust infrastructure and automation solutions. Work closely with...


  • Austin, Texas, United States Thales Full time

    About the Role: Thales is seeking an experienced Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and security of our cloud-based services. Key Responsibilities: Collaborate with project managers and service delivery managers to analyze traffic trends and capacity...


  • Austin, Texas, United States Expedia Group Full time

    Principal Site Reliability Engineer: We are looking for a highly qualified and seasoned Principal Site Reliability Engineer (SRE) to enhance our operations. The successful candidate will play a crucial role in guaranteeing the stability, scalability, and efficiency of our systems and services. You will collaborate closely with both development and operational...

  • Software Engineer

    4 days ago


    Austin, United States Apple Full time

    Carrier Services offer seamless integration of Apple Retail Stores and Apple Online store with major US Carriers for iPhone activations. We are looking for a talented Site Reliability Engineer to join our growing team. As an SRE, you will be responsi...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineering Manager to join our team at Apple. As a Site Reliability Engineering Manager, you will be responsible for leading a team that provides the platform for mission-critical cloud systems to maintain constant uptime, scale seamlessly, and allow for new applications and services to...


  • Austin, United States Visa Full time

    Company Description Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure...


  • Austin, Texas, United States Expedia Group Full time

    Principal Software Development Engineer - Site Reliability: We are looking for a highly proficient and seasoned Principal Software Development Engineer (SRE) to enhance our team. The successful candidate will be accountable for maintaining the reliability, scalability, and performance of our systems and services. You will collaborate closely with both...


  • Austin, United States Thales USA, Inc. Full time

    Location: Austin, United States of America. Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they hav...


  • Austin, United States Computer Futures Full time

    Position Summary: We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our client in Austin. The ideal candidate will have a strong background in infrastructure as code (IaC), automation, container orchestration, and monitoring solutions. As an SRE, you will play a critical role in ensuring the reliability, scalability, and...


  • Austin, Texas, United States Apple Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineering Manager to join our Apple Service Engineering team. As a key member of our team, you will be responsible for establishing and maintaining the reliability and scalability of our cloud services. Key Responsibilities: Lead a team of engineers in providing a platform for mission-critical...


  • Austin, Texas, United States NinjaOne Full time

    About the Role: At NinjaOne we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Site Reliability Engineering Manager to join our Platform Engineering team and help us scale our products to millions of end-users. You will have the opportunity to build the SRE team from the ground up...


  • Austin, United States Cape Henry Associates, Acquired by JANUS Research Group Full time

    Janus is looking for a seasoned Site Reliability Engineer / DevSecOps Developer to help grow our capability with our DoD clients. Develop Infrastructure as Code (IaC): designing, implementing, and maintaining infrastructure using IaC technologies (e.g., Terraform or similar), ensuring scalable, reliable, and efficient platforms. Collaborate with data and other...


  • Austin, Texas, United States Expedia Group Full time

    Principal Software Development Engineer - Site Reliability: We are in search of a highly qualified and seasoned Principal Software Development Engineer (SRE) to enhance our operations. The ideal candidate will be tasked with ensuring the dependability, scalability, and efficiency of our services and systems. You will collaborate closely with both development...


  • Austin, United States Terminal Industries Full time

    About Us Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers and personnel. These are the fundamental operating assets of commerce - and represent the last...


  • Austin, TX, United States Visa Full time

    Company Description Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure...


  • Austin, United States Expedia Group Full time

    Senior Software Development Engineer - Site Reliability: We are seeking a highly skilled and experienced Senior Software Development Engineer (SRE) to join our team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our services and systems. You will work closely with development and operations teams to...