Site Reliability Engineer
3 days ago
Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers and personnel. These are the fundamental operating assets of commerce - and represent the last great frontier of untapped data. In the process, Terminal will address many industry-wide pain points, including compliance, manual processes, equipment location, phantom costs, and labor inefficiencies. Ultimately, Terminal will become the central nervous system for the yard, seamlessly connecting all data sources to support an extensive range of essential functions.
Overview
Our world-class vision engineering team has built an engine that can process the movement of trucks and containers in real time. It's now time to unlock the potential of that engine by building SaaS applications that leverage the vision engine to transform the logistics industry. We're hiring the team of engineers that will architect and build these applications from the ground up.
We are seeking an experienced Site Reliability Engineer with a minimum of 5 years of relevant experience to join our team. As a member of our Engineering team, you will play a pivotal role in architecting and developing cutting-edge solutions. The ideal candidate has expertise in AWS and is proficient in operations and in running software at scale. They will have a deep understanding of event-driven technologies, hands-on experience with modern data stores, a commitment to implementing observability, and a passion for operational excellence. They will take ownership of production quality, reliability, and security.
Responsibilities
- Design, build, and operate infrastructure using Infrastructure as Code (IaC) tools like Terraform and Ansible. Develop and maintain infrastructure automation to ensure scalability and reliability.
- Define and implement best practices for continuous deployment of software and services using CI/CD tools such as GitHub Actions. Automate deployment processes to streamline operations.
- Collaborate with cross-functional teams to establish and enforce best practices for system reliability. Utilize service-level objectives (SLOs), error budgets, and other reliability metrics to measure, monitor, and enhance system performance.
- Develop automation to eliminate operational toil and reduce overhead for managing and deploying production systems. Enhance observability and monitoring to proactively identify and address issues.
- Lead incident response efforts, including diagnosis, resolution, and post-mortem analysis. Implement robust monitoring and alerting systems to ensure quick detection and resolution of issues.
- Monitor system performance and capacity, identifying and implementing improvements to ensure high availability and reliability of services.
- Ensure that systems adhere to security best practices and regulatory compliance requirements. Implement security measures and conduct regular audits to safeguard production environments.
- Stay current with emerging technologies and industry trends. Contribute to the continuous evolution of our technology stack, adapting to new challenges and opportunities.
Qualifications
- Minimum of 5 years of experience in Site Reliability Engineering or a related role, with a proven track record of managing complex production environments.
- Strong background in operating systems, networking, distributed systems, and database management. Expertise in AWS cloud services and infrastructure management.
- Demonstrated experience in incident response, production monitoring, and capacity planning. Ability to handle high-pressure situations and ensure system reliability.
- Proficiency in automating infrastructure and deployment processes using tools like Terraform, Ansible, and CI/CD pipelines.
- Excellent problem-solving and analytical abilities, with the capability to diagnose and resolve complex issues in production environments.
- Strong communication skills, with the ability to convey technical concepts clearly to both technical and non-technical stakeholders.
- Proven ability to work collaboratively with cross-functional teams, including engineering, product, and operations teams.
- Experience implementing security best practices and conducting security audits to ensure compliance and protect production systems.
- Comfort with a fast-paced, dynamic startup environment. Ability to quickly learn and adapt to new technologies and methodologies.
Joining the Terminal team means being part of a dynamic, innovative environment where your work directly impacts the future of logistics and the global supply chain. You will work closely with a team of experts passionate about operational excellence and technological innovation. We offer competitive salaries, a comprehensive benefits package, and opportunities for professional growth.
-
Site Reliability Engineer
4 weeks ago
Austin, United States TEACHER RETIREMENT SYSTEM Full time
The Site Reliability Engineer (Microsoft Exchange) Associate assists in maintaining the reliability, scalability, and performance of TRS's IT infrastructure. The incumbent will assist in supporting the management of a hybrid Exchange environment, integrating Proofpoint as the Email Gateway, and using PowerShell scripts for automation. This position will work...
-
Site Reliability Engineer
3 days ago
Austin, United States Farm Credit Bank of Texas Full time
Job Description
Who we are: Farm Credit Bank of Texas is a $38.2 billion wholesale bank that has been financing agriculture and rural America for over 100 years. Headquartered in Austin, Texas, we provide funding and services to rural lending associations in five states, and we are active in the nation's capital markets. While you may not be familiar with...
-
Reliability & Compliance Solutions Engineer
3 weeks ago
Austin, Texas, United States Electric Reliability Council of Texas Full time
Job Summary
We are seeking a highly skilled Reliability & Compliance Solutions Engineer to join our team at the Electric Reliability Council of Texas. This role will play a critical part in ensuring the reliability and compliance of our operations, working closely with subject matter experts to meet or exceed performance requirements. Main...
-
Site Reliability Engineer
3 days ago
Austin, United States CV Library Full time
Job Description
As a part of the Product Reliability Engineering (PRE) Organization of VISA, you will be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. In this role, your time will be split between operations/on-call duties and developing systems and software that help...
-
Site Reliability Engineering Lead
16 hours ago
Austin, Texas, United States Jabil Full time
About the Role
Jabil is seeking an experienced Site Reliability Engineering Lead to contribute to the transformative growth within our Intelligent Infrastructure division. The Site Reliability Lead Engineer plays a vital role in ensuring the quality and reliability of the test network infrastructure of the Intelligent Infrastructures factories on a global...
-
Grid Reliability and Compliance Engineer
2 days ago
Austin, Texas, United States The Electric Reliability Council of Texas (ERCOT) Full time
We are seeking a talented Grid Reliability and Compliance Engineer to join our team at The Electric Reliability Council of Texas (ERCOT). As a key member of our team, you will be responsible for ensuring that ERCOT ISO meets or exceeds its reliability performance requirements. Your primary responsibilities will include monitoring and reporting ERCOT ISO and...
-
Staff Site Reliability Engineer
3 days ago
Austin, United States CV Library Full time
Job Description
We’re looking for a Staff Site Reliability Engineer to join Procore’s Project Execution Group. In this role, you’ll lead, collaborate, partner and develop solutions to maintain the health of the core platform. The goal is to ensure the chosen design and architecture is highly available, performant and reliable as this team is directly...
-
DevOps and Site Reliability Engineering Lead
2 days ago
Austin, Texas, United States Unreal Gigs Full time
Job Description
We are seeking a skilled Senior Manager of DevOps and Site Reliability to join our team at Unreal Gigs. This role is responsible for leading the development, maintenance, and enhancement of our user-facing application and internal tools.
About Us
We are a fully remote engineering team that values collaboration, innovation, and continuous...
-
Site Reliability Engineering
3 weeks ago
Austin, United States Visa Full time
Company Description
Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure...
-
Site Reliability Engineer
4 days ago
Austin, United States Nomi Health Full time
We are seeking a Site Reliability Engineer (SRE) to join our team in Austin, TX. You will play a pivotal role in ensuring the reliability, performance, and scalability of our services. You will collaborate with cross-functional teams to design, implement, and manage infrastructure that is robust and resilient. Your focus will be on developing and refining...
-
Senior Site Reliability/DevOps Engineer
17 hours ago
Austin, Texas, United States AutoRABIT Holding Inc. Full time
About AutoRABIT
AutoRABIT is a hyper-growth SaaS software company and the leading provider of Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare.
About the Role
As a Senior Site Reliability/DevOps Engineer at AutoRABIT, you will play a critical role in developing, scaling, and operating our cloud...
-
Senior Site Reliability/DevOps Engineer
3 weeks ago
Austin, United States AutoRABIT Holding Inc. Full time
About AutoRABIT: AutoRABIT is a hyper-growth SaaS software company and the leading provider of Salesforce DevSecOps platform for regulated industries such as financial institutions, insurance, and healthcare. AutoRABIT solutions enable developers to automate their daily tasks to be more productive and increase the release velocity for their development team,...
-
Principal Site Reliability Engineer
4 days ago
Austin, United States Terminal Industries Full time
About Us
Terminal builds software that digitizes, indexes, and automates the yard, leveraging best-in-class machine learning. Our platform provides warehouse operators with the intelligence needed to optimize their usage of trucks, trailers, chassis, containers and personnel. These are the fundamental operating assets of commerce - and represent the last...
-
Senior Site Reliability Engineer
1 month ago
Austin, United States Visa Full time
Company Description
Visa is a world leader in payments and technology, with over 259 billion payments transactions flowing safely between consumers, merchants, financial institutions, and government entities in more than 200 countries and territories each year. Our mission is to connect the world through the most innovative, convenient, reliable, and secure...
-
Site Reliability Engineer
14 hours ago
Austin, Texas, United States Appko, Inc. Full time
Job Overview:
We are looking for an experienced Site Reliability Engineer to join our team at Appko, Inc. As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure and applications. The ideal candidate will have a strong background in DevOps, cloud computing, and software engineering,...
-
Austin, Texas, United States Electric Reliability Council of Texas Full time
Job Overview
At the Electric Reliability Council of Texas, we are seeking a highly skilled Power System Engineer to join our team. As a key member of our organization, you will play a crucial role in ensuring the reliable operation of the electric power grid.
Key Responsibilities
Perform complex engineering studies, including power flow, voltage security, and...
-
Principal Site Reliability Engineer
1 month ago
Austin, United States Charles Schwab Full time
Position Type: Regular
Your opportunity
At Schwab, you are empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us “challenge the status quo” and transform the finance industry together. As a Principal Site Reliability Engineer for Schwab's Technology Solutions organization, you will be responsible...
-
Site Reliability Engineer
2 days ago
Austin, Texas, United States Apple Full time
Job Title
Staff Site Reliability Engineer, Kubernetes ASE
About the Role
This is a pivotal position in our Service Engineering team at Apple, where you will play a key role in shaping the future of our products and services. As an SRE, you will be responsible for supporting and scaling cloud services for thousands of development and operations engineers.
Key...
-
TRS Site Reliability Engineer
13 hours ago
Austin, Texas, United States Teacher Retirement System of Texas Full time
The TRS is seeking a highly skilled Microsoft Exchange Engineer to join our team in a hybrid position. As a key member of our IT staff, you will be responsible for designing, implementing, and maintaining the reliability, scalability, and performance of our IT infrastructure. The ideal candidate will have a strong background in Microsoft Exchange...