Principal Site Reliability Engineer
3 days ago
Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers, and personnel. These are the fundamental operating assets of commerce, and they represent the last great frontier of untapped data. In the process, Terminal will address many industry-wide pain points, including compliance, manual processes, equipment location, phantom costs, and labor inefficiencies. Ultimately, Terminal will become the central nervous system for the yard, seamlessly connecting all data sources to support an extensive range of essential functions.
Overview
Our world-class vision engineering team has built an engine that can process the movement of trucks and containers in real time. It's now time to unlock the potential of that engine by building SaaS applications that use it to transform the logistics industry. As part of Terminal's Site Reliability Engineering team, you will help build out the network and IoT infrastructure required to deploy and operate our camera technology at scale.
We are seeking an experienced Principal Site Reliability Engineer with a minimum of 12 years of relevant experience to join our team. As a founding member of our Engineering team, you will play a pivotal role in architecting and developing cutting-edge solutions. The ideal candidate has expertise in AWS and is proficient in operations and in running software at scale. They will have a deep understanding of event-driven technologies, hands-on experience with modern data stores, a commitment to implementing observability, and a passion for operational excellence. They will take ownership of production quality, reliability, and security.
Responsibilities
- Oversee the deployment, management, and maintenance of IoT devices, including camera systems and sensors. Ensure devices are properly integrated, configured, and secured within the network.
- Manage firmware updates and patches for IoT devices, ensuring that all devices are up-to-date and secure. Develop and implement strategies for efficient deployment of updates.
- Implement mechanisms for collecting and processing data from IoT devices. Ensure data integrity, availability, and confidentiality.
- Troubleshoot and resolve connectivity issues related to IoT devices. Manage integration between IoT devices and cloud infrastructure, ensuring seamless data flow and system interoperability.
- Design and implement solutions to scale IoT deployments effectively. Monitor device performance and system health to ensure high reliability and availability.
- Design, build, and operate infrastructure using Infrastructure as Code (IaC) tools like Terraform and Ansible. Develop and maintain infrastructure automation to ensure scalability and reliability.
- Define and implement best practices for continuous deployment of software and services using CI/CD tools such as GitHub Actions. Automate deployment processes to streamline operations.
- Lead incident response efforts, including diagnosis, resolution, and post-mortem analysis. Implement robust monitoring and alerting systems to ensure quick detection and resolution of issues.
- Ensure that systems adhere to security best practices and regulatory compliance requirements. Implement security measures and conduct regular audits to safeguard production environments.
Qualifications
- Minimum of 12 years of experience in Site Reliability Engineering or a related role, with a proven track record of managing complex production environments.
- Strong background in operating systems, networking, distributed systems, and database management. Expertise in AWS cloud services and infrastructure management.
- Hands-on experience with deploying, managing, and maintaining IoT devices and sensor systems. Knowledge of IoT protocols (e.g., MQTT, CoAP) and device integration practices.
- Experience in managing firmware updates and ensuring the security and functionality of IoT devices.
- Proficiency in managing and troubleshooting connectivity issues in IoT environments, including wireless and wired communication protocols.
- Experience with data collection and processing from IoT devices, including ensuring data quality and managing large volumes of data.
- Demonstrated experience in incident response, production monitoring, and capacity planning. Ability to handle high-pressure situations and ensure system reliability.
What We Offer
Joining the Terminal team means being part of a dynamic, innovative environment where your work directly impacts the future of logistics and the global supply chain. You will work closely with a team of experts passionate about operational excellence and technological innovation. We offer competitive salaries, a comprehensive benefits package, and opportunities for professional growth.