HPC Engineer

1 month ago


Dallas, United States Alcority Full time
About the Role:

As an HPC (High-Performance Computing) Datacenter Engineer, you will be responsible for implementing and supporting state-of-the-art datacenter infrastructure solutions that support high-performance computing and scientific research. You will collaborate with cross-functional teams, including researchers, system administrators, network engineers, and data scientists, to understand their requirements and create efficient and scalable datacenter solutions. Your expertise in HPC technologies and emerging trends will be instrumental in driving innovation and optimizing performance within the datacenter environment. Join our team and contribute to the advancement of scientific research and innovation by designing and optimizing cutting-edge datacenter infrastructure.

Responsibilities:
  • Datacenter Architecture Design: Develop and refine datacenter architecture blueprints and guidelines, considering performance, scalability, security, and efficiency aspects. Design and implement solutions for compute, storage, networking, and cooling infrastructure that align with HPC requirements.
  • HPC Infrastructure Optimization: Continuously evaluate and enhance the datacenter infrastructure to maximize HPC performance and resource utilization. Identify and address potential bottlenecks and performance gaps, employing industry best practices and cutting-edge technologies.
  • System Integration and Deployment: Collaborate with system administrators and engineers to ensure seamless integration and deployment of HPC systems. Oversee hardware and software installation, configuration, and testing activities.
  • Research and Evaluation: Stay up to date with emerging HPC technologies, tools, and methodologies. Conduct research and feasibility studies on new hardware and software solutions to enhance datacenter capabilities. Evaluate vendor offerings and provide recommendations for procurement.
  • Performance Monitoring and Troubleshooting: Monitor and analyze datacenter performance metrics to identify issues and implement necessary optimizations. Troubleshoot complex system problems, working closely with technical teams to ensure efficient resolution and minimal impact on operations.
  • Security and Compliance: Collaborate with security teams to design and implement robust security measures within the datacenter infrastructure. Ensure compliance with relevant industry standards and regulations, such as HIPAA or GDPR, in data handling and storage.
  • Documentation and Reporting: Create comprehensive technical documentation, including architectural diagrams, standard operating procedures, and configuration guidelines. Prepare regular reports on datacenter performance, capacity planning, and future infrastructure requirements.
  • Team Collaboration and Leadership: Collaborate effectively with cross-functional teams, fostering a culture of knowledge sharing and innovation. Provide technical leadership and mentorship to junior team members, guiding them in adopting best practices and enhancing their skill sets.
Requirements:
  • Bachelor's or master's degree in computer science, engineering, or a related field, or equivalent experience.
  • Minimum 5 years of experience as an HPC engineer or similar role, with a strong focus on engineering and optimization.
  • In-depth knowledge of HPC technologies, including parallel computing, distributed storage systems, InfiniBand and Ethernet networking, GPU acceleration, and job scheduling frameworks.
  • Experience with ZFS and NiFi is a plus.
  • Experience with automation tools: Python, Ansible, Puppet, or Chef.
  • Experience with monitoring tools: Prometheus, Ganglia, Nagios, SNMP, and Telegraf.
  • Experience with CFD (Computational Fluid Dynamics) workloads and associated HPC optimization is a plus.
  • Must have familiarity with industry-standard tools and software used in HPC environments, such as Slurm, PBS Pro, Lustre, GPFS, OpenStack, and containerization technologies (e.g., Docker, Kubernetes).
  • Strong problem-solving and analytical skills, with the ability to identify and resolve complex technical issues.
  • Excellent communication and interpersonal skills, with the ability to collaborate effectively with diverse teams and stakeholders.
  • Detail-oriented mindset with a strong focus on documentation and adherence to standards.
  • Familiarity with security protocols and compliance requirements in the context of datacenter operations.
  • Ability to adapt to a fast-paced and rapidly evolving technological landscape.
It is impossible to list every requirement for, or responsibility of, any position. Similarly, we cannot identify all the skills a position may require since job responsibilities and the Company's needs may change over time. Therefore, the above job description is not comprehensive or exhaustive. The Company reserves the right to adjust, add to or eliminate any aspect of the above description. The Company also retains the right to require all employees to undertake additional or different job responsibilities when necessary to meet business needs.

Must be legally authorized to work in the United States without the need for employer sponsorship, now or at any time in the future.

Benefits & Perks:
  • Time Off: 25 days of PTO for full-time employees and 12 company holidays.
  • Company Paid Benefits: Life insurance, Short-term disability, Long-term disability, Paid parental leave, Employee Assistance Program, and medical insurance in our high deductible health plan.
  • Optional Employee Paid Benefits: Medical insurance in our EPO plan, Dental benefits, and Vision benefits. We also offer Health Savings Accounts, Flexible Spending Accounts, Supplemental Life insurance, and more.
  • 401(k): Eligible after 60 days. Discretionary company match of 50% up to the first 6% of contributions.


EQUAL OPPORTUNITY EMPLOYER

ALCORITY IS AN EQUAL EMPLOYMENT OPPORTUNITY EMPLOYER. THE COMPANY'S POLICY IS NOT TO DISCRIMINATE AGAINST ANY APPLICANT OR EMPLOYEE BASED ON RACE, COLOR, RELIGION, NATIONAL ORIGIN, GENDER, AGE, SEXUAL ORIENTATION, GENDER IDENTITY OR EXPRESSION, MARITAL STATUS, MENTAL OR PHYSICAL DISABILITY, GENETIC INFORMATION, OR ANY OTHER BASIS PROTECTED BY APPLICABLE LAW. THE FIRM ALSO PROHIBITS HARASSMENT OF APPLICANTS OR EMPLOYEES BASED ON ANY OF THESE PROTECTED CATEGORIES.
