Reliability Engineer

2 weeks ago


Herndon, Virginia, United States Amazon Full time
Job Title: Infrastructure Reliability Engineer

Amazon is seeking a highly skilled Infrastructure Reliability Engineer to join our team. As an Infrastructure Reliability Engineer, you will drive reliability risk identification, assessment, and mitigation for data center infrastructure and security equipment.

Key Responsibilities:
  • Proactively identify and assess reliability risks for data center infrastructure and security equipment.
  • Develop and implement reliability strategies to mitigate risks and improve data center availability.
  • Collaborate with internal and external partners to drive key aspects of product specification, risk identification, and execution.
  • Conduct root cause analysis of critical equipment failures and drive continuous improvements in data center availability and security.
  • Develop and maintain system-level reliability models for the data center, along with the related reliability quantification and risk analysis, to optimize data center configuration.
Requirements:
  • 5+ years of experience in reliability engineering, with a focus on data center infrastructure and security equipment.
  • Strong understanding of reliability engineering principles, including physics-of-failure, statistical modeling, and data analytics.
  • Experience with accelerated life testing, stress analysis, and finite element analysis.
  • Ability to influence development teams, procurement, and external partners.
  • Excellent verbal and written communication skills.
Preferred Qualifications:
  • Ph.D. in Reliability Engineering, Physics, Electrical, Mechanical, or Materials Engineering, or a related field.
  • Experience applying proactive, cost-effective reliability approaches throughout the product design, manufacture, and deployment stages.
  • Familiarity with the reliability performance of major data center infrastructure equipment.
  • Ability to travel within the US and internationally.

Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.

For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/disability-accommodation.


  • Reliability Engineer

    4 weeks ago


    Herndon, Virginia, United States Amazon Full time

    About the Role: We are seeking a highly skilled Reliability Engineer to join our team at Amazon Web Services (AWS). As a Reliability Engineer, you will be responsible for driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure and security equipment. Key Responsibilities: Proactively identify and assess reliability...


  • Herndon, Virginia, United States Amazon Full time

    Job Description. Job Title: Hardware Reliability Engineer. Job Summary: We are seeking a highly skilled Hardware Reliability Engineer to join our team at Amazon. As a Hardware Reliability Engineer, you will be responsible for proactively driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure and security...


  • Herndon, Virginia, United States Amazon Full time

    Job Summary: We are seeking a highly skilled Reliability Engineer to join our team at Amazon. As a Reliability Engineer, you will be responsible for driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure and security equipment. You will work closely with internal and external partners to drive key aspects of...

  • Reliability Engineer

    2 weeks ago


    Herndon, Virginia, United States Amazon Full time

    Job Description: Amazon is seeking a highly skilled Reliability Engineer to join our team. As a key member of our Infrastructure Reliability team, you will be responsible for driving reliability risk identification, assessment, and mitigation for datacenter infrastructure equipment, with a specific focus on High Voltage (HV) substations. Key...


  • Herndon, Virginia, United States The Swift Group Full time

    Job Title: Site Reliability Engineer. The Swift Group is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our mission-critical systems. Key Responsibilities: Design, implement, and maintain scalable and highly available...


  • Herndon, Virginia, United States The Swift Group Full time

    Job Title: Site Reliability Engineer. The Swift Group is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, performance, and availability of our mission-critical systems. Key Responsibilities: Collaborate with DevOps engineers and developers to design,...


  • Herndon, Virginia, United States Amazon Full time

    Job Description: Amazon is seeking a highly motivated and experienced Site Reliability Engineer to join our AWS Operations Management (AWSOM) team. As a key member of our team, you will be responsible for designing, implementing, and operating large-scale cloud infrastructure to ensure the reliability, performance, and efficiency of our services. Key...


  • Herndon, Virginia, United States Amazon Full time

    Job Description: We are seeking a highly motivated and experienced Site Reliability Engineer to join our team at Amazon. As a key member of our Operations Management team, you will be responsible for designing, implementing, and maintaining the reliability, performance, and efficiency of our cloud infrastructure. Key Responsibilities: Design and implement...


  • Herndon, Virginia, United States Amazon Full time

    Job Description. Job Title: Reliability Engineer - Datacenter Infrastructure. Job Summary: We are seeking a highly skilled Reliability Engineer to join our team at Amazon Web Services (AWS). As a Reliability Engineer, you will be responsible for driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure equipment, with...

  • Reliability Engineer

    3 weeks ago


    Herndon, Virginia, United States Amazon Full time

    About the Role: We are seeking a highly skilled Reliability Engineer to join our team at Amazon. As a Reliability Engineer, you will be responsible for driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure equipment. This includes proactive identification of potential risks, root cause analysis of critical...

  • Reliability Engineer

    3 weeks ago


    Herndon, Virginia, United States Amazon Full time

    Job Summary: We are seeking a highly skilled Reliability Engineer to join our team at Amazon Web Services (AWS). As a Reliability Engineer, you will be responsible for driving the reliability risk identification, assessment, and mitigation for datacenter infrastructure equipment, with a specific focus on High Voltage (HV) substations. Key Responsibilities: Lead...


  • Herndon, Virginia, United States Peraton Full time

    Job Title: Senior Site Reliability Engineer. Peraton is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our Infrastructure Managed Services (IMS) Program, you will play a crucial role in mapping and optimizing all aspects of our underlying I.T. infrastructure. Responsibilities: Utilize monitoring and triage tools to...


  • Herndon, Virginia, United States LanceSoft Full time

    Job Title: Senior Site Reliability Engineer. We are seeking a highly skilled Senior Site Reliability Engineer to join our team at LanceSoft. As a key member of our Engineering/Infrastructure team, you will be responsible for designing, developing, and implementing automated solutions to reduce risk and promote efficiencies in support of our organization's...


  • Herndon, Virginia, United States LanceSoft Full time

    Job Title: Senior Site Reliability Engineer. At LanceSoft, we're seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our Engineering/Infrastructure team, you will play a critical role in designing, developing, and implementing solutions to ensure high-quality process automation within our Information Technology...


  • Herndon, Virginia, United States Chameleon Consulting Group Full time

    Job Title: Principal Site Reliability Engineer. Chameleon Consulting Group is seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for designing, instantiating, and configuring physical computer systems in classified government environments using Infrastructure as...


  • Herndon, Virginia, United States Chameleon Consulting Group Full time

    Job Title: Senior Principal Site Reliability Engineer. Chameleon Consulting Group is seeking a highly skilled Senior Principal Site Reliability Engineer to join our team. As a key member of our engineering team, you will play a crucial role in ensuring the reliability and performance of our systems and infrastructure. Key Responsibilities: Lead a team of...


  • Herndon, Virginia, United States Amazon Full time

    Job Summary: We are seeking a highly skilled Quality Engineer for Infrastructure Reliability to join our team at Amazon. As a Quality Engineer, you will be responsible for ensuring the highest level of quality in our data center infrastructure. This includes working closely with suppliers to ensure that their products meet our quality standards. Key...


  • Herndon, Virginia, United States Chameleon Consulting Group Full time

    Job Title: Principal Site Reliability Engineer. Chameleon Consulting Group is seeking a highly skilled Principal Site Reliability Engineer to lead our team in building a Kubernetes-based capability to support cyber operations. Key Responsibilities: Build and manage a Security Operations Center. Deploy and manage Security Information and Event Management...


  • Herndon, Virginia, United States Cyber Crime Full time

    Job Title: Principal Site Reliability Engineer. Chameleon Consulting Group is seeking a highly skilled Principal Site Reliability Engineer to lead our team in building a Kubernetes-based capability to support cyber operations. As a key member of our team, you will be responsible for architectural design, systems design, and developing unique solutions to...


  • Herndon, Virginia, United States Amazon Full time

    Job Summary: We are seeking a highly skilled Hardware Reliability Specialist to join our team at Amazon. As a key member of our infrastructure reliability team, you will be responsible for proactively identifying and mitigating reliability risks for our datacenter infrastructure and security equipment. Key Responsibilities: Drive reliability risk identification,...