Cloud SRE and Tools Engineer
3 weeks ago
At Schwab, you're empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us "challenge the status quo" and transform the finance industry together.
Charles Schwab's Technology Services Technical Managers thrive in a leading-edge work culture while focusing on the products and solutions that help Schwab customers learn, explore, and make life-impacting moves on their paths to achieving their goals. This position requires a highly motivated individual with strong problem-solving skills who can contribute to a highly collaborative culture and a team environment to deliver innovative, value-based, reliable solutions. A proven track record of delivering high-quality technology products and services in a hyper-growth environment where priorities shift quickly is key to success in the role.
This Cloud Site Reliability Engineering role is a hands-on technical role and will be responsible for executing the teams' activities on tools engineering, high availability, maintainability, support, and automation of complex application platforms and microservices.
- Lead the execution of various SRE tasks for the Cross Enterprise organizations.
- Communicate technology plans to associated business partners ensuring collaboration across technology and the business.
- Regularly interact with technology leaders, product owners and business partners.
- Partner with the team on complex issues where analysis of situations or data requires in-depth knowledge of the applications and environment.
- Partner with highly experienced technologists, contributing to and creating successful plans that deliver solutions to the business and ensure day-to-day support, high availability, and process compliance.
- Support operational delivery and production for Platform SRE, focusing on proactive monitoring and rapid response.
- Perform proactive daily system monitoring including reviewing system and application logs as well as responding to, triaging, troubleshooting and remediating incidents.
- Repair and recover from failures. Coordinate and communicate with impacted stakeholders and clients, escalating where appropriate.
- Monitor and troubleshoot issues across the entire stack - software, application, and network.
- Develop automation and processes to enable teams to deploy, manage, configure, scale, and monitor their applications.
- Help identify application reliability and availability improvements, and establish and build solutions that continue to drive an improved experience.
- Develop and manage continuous deployment and integration solutions.
- Create and review documentation and processes regarding recurring issues, new standard operating procedures, knowledge transfer material, etc.
- Collaborate with Engineering, Scrum and Ops resources to provide technical expertise and support on key initiatives for system availability and reliability.
- 4+ years of experience in Linux/Java Software Development & Architecture, Operations, DevOps, etc.
- Demonstrated ability to resolve business and service impacting problems, evaluating all alternatives, and consulting with other technical members of the organization is required.
- Familiarity with database management systems (Oracle, SQL)
- Knowledge of Platform as a Service (PaaS) and Infrastructure as a Service (IaaS)
- Experience with Continuous Integration/Continuous Delivery (Bamboo, Go or other related tools)
- Experience with Git, JIRA and related Atlassian stack
- Experience with environment provisioning and deployment automation (Salt/Chef/Puppet)
- Ability to work with global teams.
- Flexibility to operate in an environment with changing demands and priorities.
- Experience with Terraform is preferred.
- Experience working in Cloud Environments (AWS, GCP, or Azure) highly desired.