Principal Site Reliability Engineer

3 days ago


Austin, United States Terminal Industries Full time
About Us

Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers, and personnel. These are the fundamental operating assets of commerce, and they represent the last great frontier of untapped data. In the process, Terminal will address many industry-wide pain points, including compliance, manual processes, equipment location, phantom costs, and labor inefficiencies. Ultimately, Terminal will become the central nervous system for the yard, seamlessly connecting all data sources to support an extensive range of essential functions.

Overview

Our world-class vision engineering team has built an engine that can process the movement of trucks and containers in real time. It's now time to unlock the potential of that engine by building SaaS applications that leverage it to transform the logistics industry. As part of Terminal's Site Reliability Engineering team, you will help build out the network and IoT infrastructure required to deploy and operate our camera technology at scale.

We are seeking an experienced Principal Site Reliability Engineer with a minimum of 12 years of relevant experience to join our team. As a founding member of our Engineering team, you will play a pivotal role in architecting and developing cutting-edge solutions. The ideal candidate has expertise in AWS, proficiency in operations, and experience running software at scale. They will have a deep understanding of event-driven technologies, hands-on experience with modern data stores, a commitment to implementing observability, and a passion for operational excellence. They will take ownership of production quality, reliability, and security.

Responsibilities
  • Oversee the deployment, management, and maintenance of IoT devices, including camera systems and sensors. Ensure devices are properly integrated, configured, and secured within the network.
  • Manage firmware updates and patches for IoT devices, ensuring that all devices are up to date and secure. Develop and implement strategies for efficient deployment of updates.
  • Implement mechanisms for collecting and processing data from IoT devices. Ensure data integrity, availability, and confidentiality.
  • Troubleshoot and resolve connectivity issues related to IoT devices. Manage integration between IoT devices and cloud infrastructure, ensuring seamless data flow and system interoperability.
  • Design and implement solutions to scale IoT deployments effectively. Monitor device performance and system health to ensure high reliability and availability.
  • Design, build, and operate infrastructure using Infrastructure as Code (IaC) tools like Terraform and Ansible. Develop and maintain infrastructure automation to ensure scalability and reliability.
  • Define and implement best practices for continuous deployment of software and services using CI/CD tools such as GitHub Actions. Automate deployment processes to streamline operations.
  • Lead incident response efforts, including diagnosis, resolution, and post-mortem analysis. Implement robust monitoring and alerting systems to ensure quick detection and resolution of issues.
  • Ensure that systems adhere to security best practices and regulatory compliance requirements. Implement security measures and conduct regular audits to safeguard production environments.
Requirements
  • Minimum of 12 years of experience in Site Reliability Engineering or a related role, with a proven track record of managing complex production environments.
  • Strong background in operating systems, networking, distributed systems, and database management. Expertise in AWS cloud services and infrastructure management.
  • Hands-on experience with deploying, managing, and maintaining IoT devices and sensor systems. Knowledge of IoT protocols (e.g., MQTT, CoAP) and device integration practices.
  • Experience in managing firmware updates and ensuring the security and functionality of IoT devices.
  • Proficiency in managing and troubleshooting connectivity issues in IoT environments, including wireless and wired communication protocols.
  • Experience with data collection and processing from IoT devices, including ensuring data quality and managing large volumes of data.
  • Demonstrated experience in incident response, production monitoring, and capacity planning. Ability to handle high-pressure situations and ensure system reliability.


What We Offer

Joining the Terminal team means being part of a dynamic, innovative environment where your work directly impacts the future of logistics and the global supply chain. You will work closely with a team of experts passionate about operational excellence and technological innovation. We offer competitive salaries, a comprehensive benefits package, and opportunities for professional growth.
