Site Reliability Engineer

4 weeks ago


Atlanta, United States Blackwomenintech Full time

Join a team recognized for leadership, innovation and diversity.

As a Site Reliability Engineer here at Honeywell, you will play a critical role in ensuring the reliability, availability, and performance of our systems and applications. You will work closely with cross-functional teams to identify and resolve issues, implement automation solutions, and drive continuous improvement in our infrastructure. Your expertise will contribute to the success of our operations and the satisfaction of our customers.

You will report directly to our Site Reliability Engineering Manager and you'll work out of our Atlanta, GA location on a hybrid work schedule.

In this role, you will impact the reliability and performance of our systems, ensuring seamless operation and minimal downtime. You will collaborate with development teams to implement best practices for system monitoring, alerting, and incident response. You will also participate in capacity planning and performance optimization efforts to ensure optimal system performance.

Key Responsibilities

* Hands-on design, analysis, development and troubleshooting of highly distributed large-scale production systems and event-driven, cloud-based services
* Primarily Linux administration, managing a fleet of Linux and Windows VMs as part of the application solutions
* Involved in Pull Requests for site reliability goals
* Advocate IaC (Infrastructure as Code) and CaC (Configuration as Code) practices within Honeywell HCE
* Ownership of reliability, uptime, system security, cost, operations, capacity and performance analysis
* Monitor and report on service level objectives for a given application's services. Work with the business, technology teams and product owners to establish key service level indicators.
* Ensuring the repeatability, traceability, and transparency of our infrastructure automation
* Support on-call rotations for operational duties that have not been addressed with automation
* Support healthy software development practices, including complying with the chosen software development methodology (Agile, or alternatives), building standards for code reviews, work packaging, etc.
* Create and maintain monitoring technologies and processes that improve visibility into our applications' performance and business metrics and keep operational workload in check.
* Partnering with security engineers and developing plans and automation to aggressively and safely respond to new risks and vulnerabilities.
* Develop, communicate, collaborate, and monitor standard processes to promote the long-term health and sustainability of operational development tasks.
* Participate in technical training events, game day scenarios, and professional conferences

BENEFITS OF WORKING FOR HONEYWELL

* Benefits - Medical, Vision, Dental, Mental Health
* Paid Vacation
* 401k Plan/Retirement Benefits (as per regional policy)
* Career Growth
* Professional Development

YOU MUST HAVE

* 5+ years of experience in system administration, application development, infrastructure development or related areas
* 3+ years of experience in reading, understanding and writing code in the same
* 3+ years of mastery of infrastructure automation technologies (like Terraform, CodeDeploy, Puppet, Ansible, Chef)
* 3+ years of expertise in container/container-fleet-orchestration technologies (like Kubernetes, OpenShift, AKS, EKS, Docker, Vagrant, etcd, ZooKeeper)
* 5+ years of cloud and container-native Linux administration/build/management skills

WE VALUE

* Versatility with troubleshooting diverse sets of hosting technologies strongly desired. These include web server platforms, application platforms, operating systems, network components, virtualization technologies, storage, and database platforms.
* Expertise with cloud-based, continuous-deployment software development lifecycles (e.g. CI/CD)
* Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora), caching operations and deployment experience (Memcached, Redis)
* Expertise with Lean/Agile deployment processes (Blue/Green, ZDT, Canary, load balancer/DNS strategies, A/B testing, feature flagging methodologies)
* Familiarity with site and infrastructure monitoring systems (like ELK, Datadog, AppDynamics, New Relic, Splunk, Sumo Logic, Grafana)
* Strong problem solving, root cause analysis and systems engineering skills
* Excellent presentation and communication skills
* Ability to design and manage escalation response plans from monitoring: react, respond, remediate and retrospect in culturally aligned (proactive, customer focused, collaborative, data-driven) ways.
* Demonstrated expertise building and managing highly scaled production infrastructure in the cloud (Azure required; AWS)
* Expertise with SDLC branching, SCM, and code deployment systems (Bitbucket, git/gitflow, Jenkins, CircleCI, TravisCI, etc.)

Additional Information

JOB ID: HRD231657

Category: Engineering

Location: 715 Peachtree Street, N.E., Atlanta, Georgia, 30308, United States

Exempt Engineering (GLOBAL)

Honeywell is an equal opportunity employer. Qualified applicants will be considered without regard to age, race, creed, color, national origin, ancestry, marital status, affectional or sexual orientation, gender identity or expression, disability, nationality, sex, religion, or veteran status.



  • Atlanta, United States McKesson Full time

    Are you interested in solving operations problems using modern software engineering practices? Do you get excited about running mission-critical infrastructure? Do you believe the only way to scale reliably is through automation? The Site Reliability...


  • Atlanta, United States MethodHub Full time

    Job Title: Site Reliability Engineer (Performance Monitoring) Location: Remote Duration: Long Term (W2 Only) Client: Direct Job Description: 6-8 years of professional experience as a Site Reliability Engineer (SRE). Software development “hands on” engineer with excellent understanding of SDLC application delivery. Ability to translate functional and...


  • Atlanta, United States Thoucentric Full time

    Job Description: We are seeking a skilled and dedicated Site Reliability Engineer (SRE) to join our team. The SRE will be responsible for ensuring the reliability, performance, and scalability of our systems and applications. This role combines software development and systems engineering to build and run large-scale, distributed,...


  • Atlanta, United States Hermeus Full time

    Hermeus is an aerospace and defense technology company founded to radically accelerate air travel by delivering hypersonic aircraft. The company aims to develop hypersonic aircraft quickly and cost-effectively by integrating hardware-rich, iterative development with modern computing and autonomy. This approach has been validated through design, build, and...


  • Atlanta, Georgia, United States Ford Motor Company Full time

    At Ford Motor Company, we believe freedom of movement drives human progress. We also believe in providing you with the freedom to define and realize your dreams. With our incredible plans for the future of mobility, we have a wide variety of opportunities for you to accelerate your career potential as you help us define tomorrow's transportation. As a key...


  • Atlanta, United States Now100 Full time

    Company Description: Now100 is committed to understanding our clients' needs and providing solutions that not only meet but exceed their expectations. We match thoroughly vetted resources to contract, contract-to-hire, and permanent positions in all industries. Job Description: Role: Site Reliability Engineer Location: Atlanta, GA OR Dallas OR Austin,...


  • Atlanta, United States LTIMindtree Full time

    About Us: LTIMindtree is a global technology consulting and digital solutions company that enables enterprises across industries to reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. As a digital transformation partner to more than 700 clients, LTIMindtree brings extensive domain and technology...