Senior Linux HPC Systems Engineer
1 day ago
Founded in 1999 in the beautiful Smoky Mountains of East Tennessee, Cadre5 provides innovative technical solutions to our customers locally and nationally. Our Cadre5 Lab Partners division has partnered with the Information Technology Services Directorate (ITSD) at Oak Ridge National Laboratory (ORNL) to recruit a Senior Linux HPC Systems Engineer to design, operate, and maintain clusters, servers, and workstations supporting services where science happens at ORNL.
ORNL delivers scientific discoveries and technical breakthroughs needed to realize solutions in energy and national security and provides economic benefit to the nation. This premier research institution located near Knoxville in Oak Ridge, TN, addresses national needs through impactful research and world-leading research centers.
This is a full-time, permanent position that follows a hybrid model, with a minimum of 3 days on-site per week.
Why Cadre5?
- Working with highly talented team members
- 3 weeks' vacation
- Excellent medical insurance, including employer-paid benefits
Job Responsibilities:
· Advocate and promote HPC and clustered computing services to researchers who process large data sets and/or develop code as a part of their project.
· Ensure the availability, performance, scalability, and security of production systems.
· Leverage automation and monitoring solutions that minimize our day-to-day maintenance, and scout opportunities to optimize system management practices or system performance.
· Collaborate with technical POCs for the programs that we support to install and help tune the performance of various scientific toolsets.
· Optimize both workflows and monitoring solutions to take advantage of our 24/7 operations staff, which significantly reduces the need for off-hours support. We use email, Jira, Confluence, Teams, Slack, and other collaboration solutions to stay in contact.
· Deliver ORNL's mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service. Promote equal opportunity by fostering a respectful workplace – in how we treat one another, work together, and measure success.
Basic Qualifications:
· A BS degree in computer science, computer engineering, information technology, information systems, science, engineering, business, or a related discipline and a minimum of eight (8) to twelve (12) years of aligned professional experience is required for consideration. An overall combination of equivalent education and experience may be considered.
- Master's degree holders should have a minimum of seven (7) to ten (10) years of relevant and aligned experience.
- PhD holders should have a minimum of four (4) to six (6) years of relevant and aligned experience.
· Three (3) or more years of proven experience with configuration management and automation tools such as Git, Jenkins, Ansible, or Puppet.
· Moderate proficiency in at least one scripting language such as Bash, Python, or others.
· Experience performing advanced troubleshooting and system administration with Linux Servers.
· Experience supporting large data systems.
· A strong desire to innovate and identify new technologies and opportunities and be able to communicate the potential benefits of those choices to others within the team and our research partners.
· A collaborative and upbeat approach, thriving on the opportunity to build trust and credibility and ultimately become a trusted advisor to our research teams.
· The ability to obtain and maintain a Department of Energy "Q" clearance is required. This requires US Citizenship.
Preferred Qualifications:
· Active DOE Q, active DOD Top Secret, or active DOD TS/SCI clearance is heavily preferred for consideration.
· Solid understanding of multiple operating systems and cluster technologies.
· Experience with Rocky/CentOS/RHEL, Ubuntu, and VMware.
· Understanding of HPC platforms to support users with SLURM job submissions and troubleshooting.
· Experience building and running containerized applications in an HPC environment.
· Experience with multiple deployment mechanisms such as diskless, Warewulf, and traditional deployment (Cobbler, PXE boot, and/or Bright).
· Experience managing systems utilizing GPU (NVIDIA and AMD) clusters for AI/ML and/or image processing.
· Knowledge of networking fundamentals including TCP/IP, traffic analysis, common protocols, and network diagnostics.
· Experience with InfiniBand networks and diagnostics.
· Extensive experience with high-performance parallel file systems (Lustre, WEKA, GPFS, etc.).
· Experience with performance and diagnostic tools for benchmarking, analysis and tuning of systems, networking, and storage.
· Experience with Grafana, CheckMK, Nagios, Zabbix, SolarWinds, Ganglia, or other network and device monitoring systems.
· Previous experience working in a government, scientific or other highly technical environment.
· Good documentation skills, including ability to prepare simple documentation web pages.
Benefits
Cadre5 offers excellent pay and benefits, including full medical, dental, and vision coverage, a 401(k) match, 15 days of PTO, and 10 holidays.
Cadre5 is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. Cadre5 is an E-Verify Employer.