Lead Generative AI Engineer

Found in: Appcast Linkedin GBL C2 - 2 weeks ago


Palo Alto, United States Krutrim Full time

Location: Palo Alto, CA, US


Type of Job: Full-time


About Krutrim: Building AI computing for the future


Krutrim, a part of the Ola group, is working on creating the AI computing stack of the future. We endeavor to deliver a state-of-the-art AI computing stack that encompasses the AI computing infrastructure, AI Cloud, foundational models, and AI-powered end applications for the Indian market.


Our envisioned AI computing stack can empower consumers, startups, enterprises and scientists across the world to build their end AI applications or AI models. While we are building foundational models across text, voice, and vision relevant to our focus markets, we are also developing AI training and inference platforms that enable AI research and development across industry domains.


The platforms being built by Krutrim have the potential to impact millions of lives in India, across income and education strata, and across languages.


The team at Krutrim represents a convergence of talent across AI research, Applied AI, Cloud Engineering, and semiconductor design. Our teams operate from three locations: Bangalore, Singapore & San Francisco.


Job Description:

We are looking for an experienced Lead Generative AI Engineer to train, optimize, scale, and deploy a variety of generative AI models, such as large language models, voice/speech foundation models, and vision and multi-modal foundation models, using cutting-edge techniques and frameworks. In this hands-on role, you will architect and implement state-of-the-art neural architectures and robust training and inference infrastructure to efficiently take complex models with billions of parameters to production while optimizing for low latency, high throughput, and cost efficiency.


Key Responsibilities:

  1. Architect and refine foundation model infrastructure to support the deployment of optimized AI models with a focus on C/C++, CUDA, and kernel-level programming enhancements.
  2. Implement state-of-the-art optimization techniques, including quantization, distillation, sparsity, streaming, and caching, for model performance enhancements.
  3. Spearhead the development of vision pipelines, ensuring scalable training and inference workflows for foundation models with tens to hundreds of billions of parameters.
  4. Innovate on state-of-the-art architectures for panoptic segmentation, image classification, and image generation. The candidate is expected to experiment with the internals of Vision Transformers and convolutional models such as ConvNeXt, as well as CLIP, Visual Question Answering (VQA), and diffusion models. Experience with AI art, image prompts, and conditional image generation is an additional advantage.
  5. Execute training and inference processes with a key emphasis on minimizing latency and maximizing throughput, utilizing GPU clusters and custom hardware.
  6. Innovate on current model deployment platforms, employing AWS, GCP, and GPU clusters, to enable high scalability and responsiveness.
  7. Integrate and tailor frameworks such as PyTorch, TensorFlow, DeepSpeed, and FSDP for the advancement of super-fast model training and inference.
  8. Advance the deployment infrastructure with MLOps frameworks such as Kubeflow, MosaicML, Anyscale, and Terraform, ensuring robust development and deployment cycles.
  9. Enhance post-deployment mechanisms with exhaustive testing, real-time monitoring, and sophisticated explainability and robustness checks.
  10. Drive continuous improvement initiatives for deployed models with automated pipelines that detect drift and performance degradation.
  11. Lead the charge in model management, encompassing version control, reproducibility, and lineage tracking.
  12. Cultivate a culture of high-performance computing and optimization within the AI/ML domain, propagating best practices and knowledge sharing.


Qualifications:

  1. Ph.D. with 5+ years or MS with 8+ years of experience in ML Engineering, Data Science, or related fields.
  2. Demonstrated expertise in high-performance computing with proficiency in Python, C/C++, CUDA, and kernel-level programming for AI applications.
  3. Extensive experience in the optimization of training and inference for large-scale AI models, including practical knowledge of quantization, distillation, and vision pipelines.
  4. An understanding of diffusion models (DDPM), variational autoencoders, Bayesian modelling, stochastic variational inference (SVI), and reinforcement learning is an additional benefit.
  5. Experience building generative AI foundation models with tens to hundreds of billions of parameters.
  6. Experience with AI training job scheduling, orchestration, and management via SLURM and Kubeflow.
  7. Proven success in deploying optimized ML systems on a large scale, utilizing cloud infrastructures and GPU resources.
  8. In-depth understanding and hands-on experience with advanced model optimization frameworks such as DeepSpeed, FSDP, PyTorch, TensorFlow, and corresponding MLOps tools.
  9. Familiarity with contemporary MLOps frameworks like MosaicML, Anyscale, Terraform, and their application in production environments.
  10. Strong grasp of state-of-the-art ML infrastructures, deployment strategies, and optimization methodologies.
  11. An innovative problem-solver with strategic acumen and a collaborative mindset.
  12. Exceptional communication and team collaboration skills, with an ability to lead and inspire.



  • Palo Alto, United States WinWire Full time

    **Location**: Remote (USA). Prompt Engineering and AI Chatbot Acumen: Developing sophisticated AI chatbot solutions, including QA/chatbots, translation, and search/summarization functionalities. Understanding of GenAI Foundation Models and Vector DB: Leveraging foundational AI models and vector database technologies for advanced AI capabilities. RAG...


  • Palo Alto, United States Knitit.ai Full time

    We are looking for an AI/ML Engineer to join a small team of ambitious people who are building an AI-powered assistant product in Palo Alto, CA. We build innovative solutions that aim to amplify the power of our users through intelligent interactions. This position is ideal for a talented expert eager to apply their skills in the production of highly...

  • Generative AI Engineer

    Found in: Appcast US C2 - 2 weeks ago


    Palo Alto, United States Knitit.ai Full time

    We are looking for an AI/ML Engineer to join a small team of ambitious people who are building an AI-powered assistant product in Palo Alto, CA. We build innovative solutions that aim to amplify the power of our users through intelligent interactions. This position is ideal for a talented expert eager to apply their skills in the production of highly...

  • Customer Solutions Engineer

    Found in: Appcast US C2 - 2 weeks ago


    Palo Alto, United States Hippocratic AI Full time

    About Us: Hippocratic AI is harnessing generative AI to augment the workforce of healthcare professionals with an infinitely scalable set of generative AI nursing, care coordination, clinical administration, and customer service agents that unlock an unprecedented level of care and service at low cost without sacrificing quality or safety. Its initial product...

  • Technical Lead

    1 week ago


    Palo Alto, United States Skale Talent Full time

    We’re hiring for an exciting seed-stage ML company based in Palo Alto. The founding team has successfully exited AI companies in the past, and they have top-tier VC investors including ones that backed DeepMind and OpenAI. The startup is working in LLMs for virtual voice-based conversational agents. They have developed a novel training framework for deep...


  • Palo Alto, United States Lutra AI Full time

    Lutra is a cutting-edge technology company that makes it easy to deeply leverage AI in our work and lives, so that we can reclaim time to spend on the things that truly matter to us. We are a small team based in the San Francisco Bay Area, with deep AI expertise. If you enjoy learning and using the latest AI technologies, and want to turn them into useful...

  • Software Engineer, Full Stack

    Found in: Talent US C2 - 2 weeks ago


    Palo Alto, United States Lutra AI Full time

    Lutra is a cutting-edge technology company that makes it easy to deeply leverage AI in our work and lives, so that we can reclaim time to spend on the things that truly matter to us. We are a small team based in the San Francisco Bay Area, with deep AI expertise. If you enjoy learning and using the latest AI technologies, and want to turn them into useful...


  • Palo Alto, United States Lutra AI Full time

    Lutra is a cutting-edge technology company that makes it easy to deeply leverage AI in our work and lives, so that we can reclaim time to spend on the things that truly matter to us. We are a small team based in the San Francisco Bay Area, with deep AI expertise. If you enjoy learning and using the latest AI technologies, and...

  • LLM Research Engineer

    Found in: Appcast US C2 - 2 weeks ago


    Palo Alto, United States CHAI: AI Platform Full time

    AI Research Engineer (LLM Optimization) | $250-350K | PALO ALTO, CA. Chai is one of the fastest-growing generative AI startups in Silicon Valley. YouTube but for LLMs: we have over 1 million active users. Who we are looking for: We need a relentless engineer with 3+ years of experience overseeing and being responsible for optimizing our LLMs. Ensuring they are...

  • Customer Success Manager

    Found in: Appcast US C2 - 4 days ago


    Palo Alto, United States Hippocratic AI Full time

    About Us: Hippocratic AI is harnessing generative AI to augment the workforce of healthcare professionals with an infinitely scalable set of generative AI nursing, care coordination, clinical administration, and customer service agents that unlock an unprecedented level of care and service at low cost without sacrificing quality or safety. Its initial product...


  • Palo Alto, United States Hippocratic AI Full time

    About Us: Hippocratic AI is harnessing generative AI to augment the workforce of healthcare professionals with an infinitely scalable set of generative AI nursing, care coordination, clinical administration, and customer service agents that unlock an unprecedented level of care and service at low cost without sacrificing quality or safety. Its initial product...

  • Software Engineer, Full Stack

    Found in: Resume Library US A2 - 14 hours ago


    Palo Alto, California, United States Lutra AI Full time

    Lutra is a cutting-edge technology company that makes it easy to deeply leverage AI in our work and lives, so that we can reclaim time to spend on the things that truly matter to us. We are a small team based in the San Francisco Bay Area, with deep AI expertise. If you enjoy learning and using the latest AI technologies, and want to turn them into useful and...

  • Global Learning

    1 week ago


    Palo Alto, United States Symphony Industrial AI Full time

    Introduction: Founded in 2017, SymphonyAI employs a global workforce of 3,000+ team members delivering packaged enterprise AI solutions for a range of critical industry use cases. With deep expertise in every industry we serve, SymphonyAI builds packaged AI solutions targeted to solve specific challenges. Best-in-class predictive and generative AI provide...


  • Palo Alto, United States Unreal Gigs Full time

    About Our Firm: We are at the forefront of artificial intelligence research and development, akin to the most renowned AI labs in the world. Our mission is to develop AI technologies that benefit humanity, tackling some of the most challenging problems across various domains. Join us to be a part of a team that’s shaping the...


  • Palo Alto, United States CareerBuilder Full time

    At Hippocratic AI, we are at the forefront of technological innovation, leveraging advanced computing resources to solve complex problems. Our dedicated GPU clusters, including high-end NVIDIA A100 and H100 models, are crucial for our data processing, machine learning, and computational tasks, including the development and optimization of Large Language...


  • Palo Alto, United States AIMon Full time

    We're seeking a talented and experienced ML Engineer who thrives on researching and implementing high-performance ML solutions that can handle massive scale. You'll play a key role in designing, developing, and deploying next-generation applications that unlock the reliable adoption of Generative AI technologies. Traits: The ideal candidate will possess the...