Senior HPC Engineer
2 weeks ago
Millennium's Infrastructure organization is dedicated to designing, engineering, supporting, and managing a robust server estate, systems virtualization, and core enterprise services. We are seeking a Senior HPC Engineer for a hands-on technical leadership position to support WorldQuant's initiative to maintain its leadership in financial research. This role is pivotal in designing, building, and maintaining our cutting-edge High-Performance Computing (HPC) and GPU clusters, which are essential for our AI and Machine Learning initiatives. The ideal candidate will have a strong background in HPC environments, with specific expertise in GPU-accelerated computing and advanced storage solutions. You will be responsible for ensuring the reliability, scalability, and performance of our computational infrastructure.
You will join a highly specialized team of exceptionally talented yet refreshingly humble individuals from diverse disciplines. We believe that delivering exceptional services requires the ability to make meaningful changes across the entire stack. Our mission is to solve real business challenges, reduce operational complexities, and foster a collaborative, team-driven environment that promotes mutual growth and success.
Key Responsibilities:
- Design and Implementation: Lead the architectural design, implementation, and maintenance of large-scale HPC and GPU clusters.
- Storage Management: In collaboration with the storage team, architect and manage high-performance storage solutions tailored for GPU-intensive workloads, ensuring low-latency data access and high throughput.
- System Optimization: Monitor, analyze, and tune the performance of the HPC environment, including compute nodes, networking fabrics, and parallel file systems.
- Automation: Develop and maintain automation scripts and tools for provisioning, configuration management, and monitoring of the HPC infrastructure.
- Collaboration: Work closely with researchers, data scientists, and software engineers to understand their computational needs and provide a robust and efficient platform to accelerate their work.
- Troubleshooting: Provide expert-level support for complex issues related to hardware, software, and networking within the HPC ecosystem.
- Technology Evaluation: Stay current with emerging technologies and industry trends in HPC, GPU computing, and storage, and conduct evaluations to recommend new solutions.
- Contribute to organizational knowledge through documentation, education, and writing maintainable code. Provide guidance to the team in your subject matter expertise.
Qualifications:
- A Bachelor's degree in Computer Science, Engineering, or a related field.
- A minimum of 7 years of progressive experience in designing, building, and managing complex HPC environments.
- Proven experience with GPU-accelerated computing, including NVIDIA GPUs and associated software (e.g., CUDA).
- Deep expertise in high-performance storage systems and parallel file systems (e.g., Lustre, GPFS/Spectrum Scale).
- Strong proficiency in Linux/Unix operating systems, scripting languages, and configuration management platforms.
- Experience with cluster management and scheduling software (e.g., Kubernetes, Run.io), with a strong preference for Slurm.
- Familiarity with high-speed interconnects like InfiniBand or RoCE.
- An understanding of AI technologies and their applications in infrastructure automation and management. Experience with, or a strong interest in, implementing AI/ML solutions for infrastructure optimization, anomaly detection, or predictive analytics.
- A passion for technology and automation, with a deep sense of curiosity and ownership.
- A hands-on approach to problem-solving and a demonstrable enthusiasm for technology.
- Excellent verbal and written communication skills.
Preferred Qualifications:
- Master's or Ph.D. in a relevant technical field.
- Experience in a buy-side financial organization.
- Experience with cloud-based HPC, preferably with GCP.
- Knowledge of containerization technologies such as Docker and Singularity.
The estimated base salary range for this position is $175,000 to $250,000, which is specific to New York and may change in the future. Millennium pays a total compensation package which includes a base salary, discretionary performance bonus, and a comprehensive benefits package. When finalizing an offer, we take into consideration an individual's experience level and the qualifications they bring to the role to formulate a competitive total compensation package.
-
Principal HPC Engineer
2 weeks ago
New York, NY, United States
iO Associates
Full time
Principal HPC Engineer Role
We're looking for a Principal HPC Engineer to work within a small but exceptionally skilled engineering group, partnering closely with researchers, technologists, and industry-leading vendors to build a world-class HPC ecosystem from the ground up. This is the role for engineers who blend big-picture technical vision with...
-
Principal HPC Engineer
2 days ago
New York, NY, United States
Atto Trading Technologies LLC
Full time
Atto Trading, a dynamic quantitative trading firm founded in 2010 and leading in global high-frequency strategies, is looking for an HPC Engineer to join our team. We are expanding an international, diverse team with experts in trading, statistics, engineering, and technology. Our disciplined approach, combined with rapid market feedback, allows us to...
-
HPC Storage Engineer
2 weeks ago
New York, NY, United States
Hudson River Trading
Full time
The R&D team at Hudson River Trading (HRT) builds and maintains the computers, networks, data storage, operating systems, and software that allow our trading strategies and research environment to operate worldwide 24/7. The team manages over 100 petabytes of storage to facilitate industry-leading AI/ML-based trading. We are seeking an experienced Storage...
-
New York, NY, United States
Career Developers
Full time
Refer a friend: Referral fee program
Career Developers Inc., a distinguished staffing and consulting firm, is proud to celebrate 30 years of service excellence. As a GSA Contract holder, we offer comprehensive staffing solutions for both commercial and government sectors nationwide. By selectively partnering with clients who share our values, we ensure...
-
Vice President of HPC Data Centers
4 days ago
New York, NY, United States
RCM Life Sciences and IT
Full time
Direct Placement
Title or Role: VP of HPC Data Centers
Compensation: $275,000 + cash bonus + equity
Location: NYC Hybrid
Company Description: As demand for AI and high-performance computing (HPC) accelerates, our client is building the infrastructure for this moment. They are laying the foundation for scalable, energy-optimized computing, designed to support...
-
Vice President of HPC Data Centers
6 days ago
New York, NY, United States
RCM Technologies
Full time
Direct Placement
Title or Role: VP of HPC Data Centers
Compensation: $275,000 + cash bonus + equity
Location: NYC Hybrid
Company Description: As demand for AI and high-performance computing (HPC) accelerates, our client is building the infrastructure for this moment. They are laying the foundation for scalable, energy-optimized computing, designed to support...
-
Senior Linux Engineer
4 days ago
New York, NY, United States
Elliot Partnership
Full time
Senior Linux Systems Engineer - (Kernel & Performance)
New York, NY (Hybrid, 3 days in office)
Highly competitive compensation package
Join an elite technology and research group at the forefront of global finance, where world-class engineering and quantitative research converge to solve some of the most complex problems in any industry. Their teams are...
-
New York, NY, United States
Pfizer
Full time
ROLE SUMMARY
Pfizer's committed to the application of computational science in the areas of drug discovery and development. As part of this mission, we have recently embarked on a large-scale migration of our computational infrastructure to cloud. This role leverages extensive experience in cloud engineering and DevOps and requires a hands-on approach to...
-
Principal Systems Engineer
2 weeks ago
New York, NY, United States
Elliot Partnership
Full time
Principal Systems Engineer (HPC, Python/Go)
New York, NY (Hybrid, 3 days in office)
Highly competitive compensation package
Join an elite technology and research group at the forefront of global finance, where world-class engineering and quantitative research converge to solve some of the most complex problems in any industry. Their teams are composed of...