HPC Kubernetes Solutions Architect
2 days ago
Title:
HPC Kubernetes Solutions Architect
Location:
Dallas, TX
Duration:
Permanent Position
Compensation:
$200,000 - $350,000/year
Work Requirements:
U.S. citizens, green card holders, or others authorized to work in the U.S.
HPC Kubernetes Solutions Architect
- As an HPC Kubernetes Solutions Architect, you will act as a trusted advisor to customers, guiding them through the design, integration, and adoption of GPU-accelerated Kubernetes platforms purpose-built for high-performance computing (HPC), AI/ML training, simulation, and scientific workloads.
- This is a customer-facing architecture role with accountability across the entire solution lifecycle — from early discovery and requirements analysis, through reference architecture design, proof-of-concept delivery, and deployment, to long-term optimization and platform evolution.
- You will be responsible for creating architectural blueprints and integration strategies that enable customers to achieve measurable performance and scalability outcomes, while preparing them for future growth and technology shifts.
- In addition, you will collaborate closely with product, engineering, and operations teams, ensuring customer feedback informs roadmap priorities and helping define the next generation of Kubernetes-based HPC orchestration.
- This role is ideal for someone who combines deep technical expertise in Kubernetes and GPU orchestration with the ability to engage customers as a solution strategist, aligning today's workloads with tomorrow's innovation.
Responsibilities:
- Act as the primary architectural point of contact for customers adopting GPU-accelerated Kubernetes platforms for HPC and AI/ML workloads.
- Partner with customers to capture workload requirements, performance objectives, scaling needs, and integration constraints, translating them into reference architectures and actionable solution designs.
- Architect and operate Kubernetes clusters optimized for GPU workloads, leveraging NVIDIA
- Integrate and tune Multi-Instance GPU (MIG), GPU sharing, and scheduler extensions (e.g., Volcano, Slurm integration, kube-scheduler plugins) to maximize efficiency in multi-tenant environments.
- Develop or extend custom Kubernetes operators and controllers in Go/Python to automate HPC infrastructure services.
- Design and recommend secure multi-tenant Kubernetes environments, implementing RBAC, OPA/Gatekeeper policies, namespace isolation, and workload quotas.
- Lead proof-of-concept and benchmarking engagements, using profiling tools, workload characterization, and telemetry to validate solution performance and scalability.
- Define and document integration strategies across compute, storage, networking, and orchestration layers, including CNI plugins (NVIDIA CNI, Multus, Cilium), storage systems (Lustre, GPFS, Ceph, VAST), and container runtimes (containerd, NVIDIA Container Toolkit).
- Drive observability and monitoring solutions with Prometheus, Grafana, DCGM Exporter, and OpenTelemetry, ensuring visibility into GPU health, cluster utilization, and workload performance.
- Support GitOps-driven CI/CD pipelines for Kubernetes infrastructure using ArgoCD, FluxCD, Helm, and Kustomize.
- Collaborate with HPC, ML, and DevOps teams to validate performance and scalability in hybrid or on-premises environments.
- Provide architectural leadership during onboarding and deployment, ensuring successful integration of Kubernetes clusters with HPC schedulers and enterprise IT systems.
- Build and maintain strategic relationships with ecosystem vendors (e.g., NVIDIA, Cisco, storage partners), incorporating emerging technologies into customer environments.
- Share forward-looking insights with customers on GPU roadmaps, interconnect advancements (e.g., InfiniBand, RoCE, NVLink), and container orchestration trends.
- Represent the organization in customer design sessions, technical workshops, and industry conferences, positioning yourself as a thought leader in Kubernetes for HPC.
Required Skills:
- Extensive experience in Kubernetes architecture and operations for HPC or GPU-intensive environments.
- Strong technical expertise in:
  - NVIDIA GPU stack (GPU Operator, device plugins, MIG, NVML, DCGM).
  - Kubernetes internals (CRDs, RBAC, scheduler extensions, custom operators/controllers).
  - Distributed and parallel storage integration with Kubernetes for HPC workloads.
  - High-performance networking (InfiniBand, RDMA, RoCE) in containerized environments.
- Proven ability to design scalable, secure, and resilient Kubernetes-based architectures for HPC and AI/ML use cases.
- Proficiency in Go or Python for Kubernetes operator or controller development.
- Experience with workload profiling, benchmarking, and performance tuning.
- Strong customer engagement skills, capable of translating requirements into actionable architectures and presenting solutions effectively.
- Collaborative mindset with experience working across engineering, product, and operations teams.
Preferred Experience:
- Demonstrated success in end-to-end customer solution delivery, from requirements discovery to deployment and adoption.
- Familiarity with containerized HPC environments (e.g., Singularity/Apptainer).
- Exposure to automation and GitOps practices for Kubernetes platform management (e.g., ArgoCD, FluxCD).
- Contributions to open-source projects in the Kubernetes or NVIDIA ecosystem.
- Experience advising on future adoption strategies, helping customers prepare for emerging GPU, interconnect, and orchestration technologies.
- Bachelor's or Master's degree in Computer Science, Engineering, Physics, or related technical field.
- Relevant Kubernetes and container certifications such as CKA, CKAD, or CKS, alongside cloud certifications like AWS Solutions Architect or Azure Solutions Architect Expert.
About INSPYR Solutions
Technology is our focus and quality is our commitment. As a national expert in delivering flexible technology and talent solutions, we strategically align industry and technical expertise with our clients' business objectives and cultural needs. Our solutions are tailored to each client and include a wide variety of professional services, project, and talent solutions. By always striving for excellence and focusing on the human aspect of our business, we work seamlessly with our talent and clients to match the right solutions to the right opportunities. Learn more about us at
INSPYR Solutions provides Equal Employment Opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, or genetics. In addition to federal law requirements, INSPYR Solutions complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.
Information collected and processed through your application with INSPYR Solutions (including any job applications you choose to submit) is subject to INSPYR Solutions' Privacy Policy and INSPYR Solutions' AI and Automated Employment Decision Tool Policy. By submitting an application, you consent to being contacted by INSPYR Solutions by phone, email, or text.