Senior High Performance Computing System Administrator

5 months ago


New York, United States Icahn School of Medicine at Mount Sinai Full time

Strength Through Diversity

Groundbreaking science. Advancing medicine. Healing made personal.

Roles & Responsibilities:

The Scientific Computing and Data group at the Icahn School of Medicine at Mount Sinai partners with scientists to accelerate scientific discovery. To achieve these aims, we support a cutting-edge high-performance computing and data ecosystem along with MD/PhD-level support for researchers. The group is composed of a high-performance computing team, the research clinical data warehouse team and a research data services team.


The Senior HPC Administrator, High Performance Computational and Data Ecosystem, is responsible for a computational and data science ecosystem for researchers at Mount Sinai. This ecosystem includes high-performance computing (HPC) systems, clinical research databases, and a software development infrastructure for local and national projects. To meet Sinai’s scientific and clinical goals, the Senior Administrator has a good technical understanding of computational, data and software development systems along with a strong focus on customer service for researchers. The HPC Senior Administrator is an expert troubleshooter and productive team member who leads projects to effective and efficient completion independently, with little to no supervision. This position reports to the Director for Computational & Data Ecosystem in Scientific Computing. Specific responsibilities are listed below.


Responsibilities


  • Designs, deploys and maintains Scientific Computing’s computational and data science ecosystem, including ~30,000 cores with high-bandwidth, low-latency interconnects, GPUs, large shared-memory nodes, databases, scientific workflows and 30+ petabytes of storage across production, clinical data warehouse and software development environments.
  • Leads the troubleshooting, isolation and resolution of all technical issues (including application, system, hardware, software and network). Actively monitors the systems.
  • Maintains, tunes and manages computational, data, cloud technologies and workflow systems for ISMMS researchers, scientists and their external collaborators. Defines and deploys a comprehensive computational and data vision. Identifies and communicates system advantages/disadvantages and tradeoffs.
  • Designs, develops and implements system administration tasks, including hardware and software configuration, configuration management, system monitoring (including the development and maintenance of regression tests), usage reporting, system performance (file systems, scheduler, interconnect, high availability, etc.), security, networking and metrics.
  • Collaborates effectively with research and hospital system IT, compliance, HIPAA, security and other departments to ensure compliance with all regulations and Sinai policies.
  • Participates in the integration of HPC resources with laboratory equipment such as sequencers, clinical and research data resources and systems, etc. Incorporates and links data and compute resources.
  • Researches, deploys and optimizes resource management and scheduling software and policies, and actively monitors them. Designs, tunes, manages and upgrades parallel file systems, storage and data-oriented resources.
  • Researches, deploys and manages security infrastructure, including development of policies and procedures.
  • Maintains all necessary aspects of HPC in accordance with best practices. Develops and implements backup policies.
  • Prepares and manages budgets for hardware, software and maintenance. Participates in chargeback/fee recovery analysis and provides suggestions to make operations sustainable.
  • Assists in developing and writing system design for research proposals. Creates and provides clear documentation.
  • Works effectively and productively with other team members within the group and across Mount Sinai.
  • Performs related duties as assigned or requested.
  • Provides after-hours support for critical system and production issues.
  • Answers and resolves user tickets.


Qualifications:


  • Bachelor's degree in computer science, engineering or another scientific field. Master's or PhD preferred
  • 8+ years (more preferred) of progressive HPC system administration and operations experience (preferably in a Red Hat/CentOS Linux batch HPC cluster environment)
  • Must be an expert troubleshooter, a team player and customer focused
  • Experience with a job scheduler such as LSF or Slurm, and with parallel file systems and storage
  • Experience with networking and security
  • Experience with configuration management systems such as xCAT, Puppet and/or Ansible
  • Experience with databases and web services
  • Experience with InfiniBand and Gigabit Ethernet
  • Experience in an academic or research community environment
  • Scripting and programming experience
  • Experience with Cloud Computing
  • Ability to multitask effectively in a dynamic environment
  • Excellent communication skills, analytical ability, strong judgment and management skills, and the ability to work effectively as a liaison between both research and technology teams.
  • Strong written, oral, and interpersonal communication skills

Preferred Experience

  • Advanced degree
  • Experience with GPFS, LSF, TSM, and InfiniBand (IB) and Ethernet networking
  • Experience with databases and web services is highly preferred


Strength Through Diversity


The Mount Sinai Health System believes that diversity, equity, and inclusion are key drivers for excellence. We share a common devotion to delivering exceptional patient care. When you join us, you become a part of Mount Sinai’s unrivaled record of achievement, education, and advancement as we revolutionize medicine together. We invite you to participate actively as a part of the Mount Sinai Health System team by:


  • Using a lens of equity in all aspects of patient care delivery, education, and research to promote policies and practices to allow opportunities for all to thrive and reach their potential.
  • Serving as a role model confronting racist, sexist, or other inappropriate actions by speaking up, challenging exclusionary organizational practices, and standing side-by-side in support of colleagues who experience discrimination.
  • Inspiring and fostering an environment of anti-racist behaviors among and between departments and co-workers.


At Mount Sinai, our leaders strive to learn, empower others, and embrace change to further advance equity and improve the well-being of staff, patients, and the organization. We expect our leaders to embrace anti-racism, create a collaborative and respectful environment, and constructively disrupt the status quo to improve the system and enhance care for our patients. We work hard to create an inclusive, welcoming and nurturing work environment where all feel they are valued, belong and are able to advance professionally.


Explore more about this opportunity and how you can help us write a new chapter in our history.



EOE Minorities/Women/Disabled/Veterans


