Current jobs related to Engineering Leader, AI Workload Management - Santa Clara, California - Oracle


  • Santa Clara, California, United States Oracle Full time

    Lead the Future of AI Workload Orchestration. Oracle is seeking a highly experienced Senior Director of Engineering to lead the development and operation of our AI workload orchestration platforms. As a key member of our AI Infrastructure organization, you will be responsible for building and managing a team of software engineers to design, develop, and deploy...


  • Santa Clara, California, United States Oracle Full time

    Job Summary: We are seeking a highly experienced Senior Director of Engineering to lead our AI Workload Orchestration team. As a key member of our organization, you will be responsible for providing strategic direction and technical leadership to our software development organization. Key Responsibilities: Lead the development and operation of AI workload...

  • Product Manager

    2 weeks ago


    Santa Clara, California, United States Global AI Platform Corporation Full time

    About Global AI Platform Corporation: Global AI Platform Corporation is a pioneering company in the AI industry, founded in June 2023. Headquartered in Santa Clara, California, with additional operations in Pangyo, South Korea, we are dedicated to developing cutting-edge AI technologies. Our flagship project is the Personal AI Assistant (PAA), designed to...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior High-Performance AI Training Engineer. We are seeking a highly skilled Senior High-Performance AI Training Engineer to join our team at NVIDIA. As a key member of our engineering team, you will be responsible for optimizing AI training workloads on innovative hardware and software platforms. Key Responsibilities: Understand and analyze AI...


  • Santa Monica, California, United States Flawless AI Full time

    Unlock the Future of Entertainment with Flawless AI. Flawless AI is revolutionizing the film industry with its cutting-edge Gen AI film editing tools. As a Machine Learning Engineering Manager, you will play a crucial role in shaping the future of entertainment. Key Responsibilities: Manage a team of machine learning engineers to execute large-scale data...


  • Santa Clara, California, United States Celestial AI Full time

    About Celestial AI: Celestial AI is a pioneering company in the field of artificial intelligence, striving to push the boundaries of innovation and performance. As the industry grapples with the challenges of AI workloads, we are committed to delivering cutting-edge solutions that address the 'Memory Wall' problem and enable unprecedented scalability and...


  • Santa Clara, California, United States NVIDIA Full time

    We are seeking a highly skilled AI Developer Relations Manager to lead our partnerships with developers within a large CSP, working with engineering, research, applications, and new initiatives. This role will be a combination of developer advocacy, product management, and business development. You will work closely with many groups within NVIDIA, including...


  • Santa Clara, California, United States NVIDIA Full time

    NVIDIA is a leader in the AI revolution, driving innovation in industries with our cutting-edge GPU technology. Our GPUs power groundbreaking advancements in AI, big data, and deep learning. We're seeking visionary leaders to join us as a Senior SRE Engineering Leader. As a key member of our team, you'll lead our globally distributed clusters, ensuring seamless...


  • Santa Clara, California, United States NVIDIA Full time

    Senior SRE Engineering Leader. NVIDIA is a pioneer in the AI revolution, driving innovation in industries with our cutting-edge GPU technology. Our GPUs power groundbreaking advancements in AI, from self-driving cars to innovative research in computer vision, speech recognition, and more. We're seeking visionary leaders to join us on an exciting journey as...

  • Senior AI Engineer

    2 weeks ago


    Santa Clara, California, United States NVIDIA Full time

    Unlock the Power of AI with NVIDIA. We're seeking a talented Senior Developer Technology Engineer to join our AI Compute DevTech team. As a key member of our team, you'll play a crucial role in researching and developing techniques to accelerate AI workloads on advanced computer architectures. Key Responsibilities: Research and develop GPU acceleration...


  • Santa Clara, California, United States Boson AI Full time

    About Boson AI: Boson AI is an innovative startup dedicated to developing cutting-edge language tools for global use. Our team of experts in Deep Learning, Optimization, NLP, AutoML, and Statistics is working tirelessly to create high-quality generative AI models for language and beyond. Job Description: We are seeking a highly skilled Machine Learning Engineer...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior AI-HPC Storage Engineer. NVIDIA is seeking a highly skilled Senior AI-HPC Storage Engineer to join our GPU AI/HPC Infrastructure team. As a member of this team, you will provide leadership in the design and implementation of groundbreaking fast storage solutions to enable runs of demanding deep learning, high performance computing, and...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior High-Performance AI Training Engineer. NVIDIA is seeking a highly skilled Senior High-Performance AI Training Engineer to join our team. As a key member of our engineering team, you will be responsible for optimizing AI training workloads on innovative hardware and software platforms. Key Responsibilities: Understand, analyze, profile, and...


  • Santa Clara, California, United States NVIDIA Full time

    About NVIDIA: NVIDIA is a leader in AI, machine learning, and datacenter acceleration. Our company has continuously reinvented itself over two decades, with a strong focus on innovation and growth. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our GPU AI/HPC Infrastructure team. As a key member of our team, you will be responsible...


  • Santa Clara, California, United States Rivos Full time

    Job Title: AI Software Development Engineer. About the Role: We are looking for a highly skilled AI Software Development Engineer to join our team at Rivos. As a key member of our silicon, software, and platform design team, you will be responsible for building and maintaining our AI software stack. Key Responsibilities: Build up components of an AI Software...


  • Santa Clara, California, United States NVIDIA Full time

    Job Title: Senior AI-HPC Storage Solutions Architect. NVIDIA is a leader in the field of artificial intelligence and high-performance computing, and we are seeking a highly skilled Senior AI-HPC Storage Solutions Architect to join our team. About the Role: We are looking for an expert in designing and implementing high-performance storage solutions for our AI...


  • Santa Clara, California, United States NVIDIA Full time

    We are seeking a Senior High-Performance AI Training Engineer to join our team at NVIDIA. As a key member of our team, you will be responsible for optimizing AI training workloads on innovative hardware and software platforms. This role offers the opportunity to directly impact the hardware and software roadmap in a fast-growing technology company that leads...


  • Santa Clara, California, United States XPENG Motors Full time

    We are seeking a highly skilled AI Performance Engineer to join our team at XPeng Motors, a leading smart electric vehicle company. As a key member of our software engineering team, you will be responsible for optimizing the training and inference performance of state-of-the-art ML infrastructure and foundation models for autonomous driving. With a strong...


  • Santa Clara, California, United States Oracle Full time

    Job Description: At Oracle Cloud Infrastructure (OCI), we are building the world's largest AI clusters at the highest tier of performance and value across AI workloads. The AI Infrastructure organization (AI2) at OCI is leading this effort. We are seeking a highly motivated and experienced product manager to join our team. Key Responsibilities: Define and deliver...

Engineering Leader, AI Workload Management

2 months ago


Santa Clara, California, United States Oracle Full time

About the Role

We are seeking a highly experienced Engineering Leader to join our team at Oracle. As Senior Director of Engineering, AI Workload Orchestration, you will lead the software development organization that builds and operates AI platforms running at unprecedented speed, scale, and reliability.

Key Responsibilities

  • Lead the development and operation of AI platforms that enable AI researchers to manage GPU clusters across the full model lifecycle.
  • Collaborate with geographically distributed teams to deliver large-scale projects on time and with high quality.
  • Provide leadership and direction to build and develop the organization so that it can execute on its strategy.
  • Work with the largest players in the AI space to build systems that operate at unprecedented speed, scale, and reliability.

Requirements

  • MS or BS in Computer Science, or equivalent experience.
  • 5+ years of experience managing Software Engineering teams.
  • 12+ years of software engineering experience.
  • Strong communication, analytical, and project management skills.

Preferred Qualifications

  • 7-10+ years of experience delivering and operating large-scale, highly available distributed systems.
  • Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals.
  • Working familiarity with networking protocols (TCP/IP, HTTP) and standard network architectures.
  • Strong experience and detailed technical knowledge in distributed systems, high-performance computing, and GPU systems.
  • Experience in AI model training infrastructure.

What We Offer

  • Comprehensive benefits package, including medical, dental, and vision insurance, short-term disability, and long-term disability.
  • 401(k) Savings and Investment Plan with company match.
  • Flexible Vacation and Paid Time Off.
  • 11 paid holidays and paid sick leave.
  • Employee Stock Purchase Plan, financial planning, and group legal services.