Senior Software Engineer

4 weeks ago


New York, United States GEICO Full time

Senior Software Engineer AI/ML Infra

Apply to join GEICO as a Senior Software Engineer AI/ML Infra.

Base pay range: $105,000-$300,000 per year

Company Overview

GEICO is a leading insurance company known for its mission to protect people when they need it most. We thrive through relentless innovation, exceed customer expectations, and foster a supportive and inclusive culture.

Role Overview

The GEICO AI/ML Infrastructure team seeks an exceptional Senior ML Platform Engineer to build and scale machine learning infrastructure focusing on Large Language Models (LLMs) and AI applications. The role blends deep technical expertise in cloud platforms, container orchestration, and MLOps with strong leadership and mentoring. You will design, implement, and maintain scalable, reliable systems enabling data science and engineering teams to deploy and operate LLMs efficiently at scale.

Key Responsibilities

ML Platform & Infrastructure

  • Design and implement scalable infrastructure for training, fine-tuning, and serving open source LLMs (Llama, Mistral, Gemma, etc.)
  • Architect and manage Kubernetes clusters for ML workloads, including GPU scheduling, autoscaling, and resource optimization
  • Design, implement, and maintain feature stores for ML model training and inference pipelines
  • Build and optimize LLM inference systems using frameworks like vLLM, TensorRT-LLM, and custom serving solutions
  • Ensure 99.9%+ uptime for ML platforms through monitoring, alerting, and incident response procedures
  • Design and implement ML platforms using DataRobot, Azure Machine Learning, Azure Kubernetes Service (AKS), and Azure Container Instances
  • Develop and maintain infrastructure using Terraform, ARM templates, and Azure DevOps
  • Implement cost-effective solutions for GPU compute, storage, and networking across Azure regions
  • Ensure ML platforms meet enterprise security standards and regulatory compliance requirements
  • Evaluate and potentially implement hybrid cloud solutions with AWS/GCP as backup or for specialized use cases

DevOps & Platform Engineering

  • Design and maintain robust CI/CD pipelines for ML model deployment using Azure DevOps, GitHub Actions, and MLOps tools
  • Implement automated model training, validation, deployment, and monitoring workflows
  • Set up comprehensive observability using Prometheus, Grafana, Azure Monitor, and custom dashboards
  • Continuously optimize platform performance, reducing latency and improving throughput for ML workloads
  • Design and implement backup, recovery, and business continuity plans for ML platforms

Technical Leadership & Mentoring

  • Mentor junior engineers and data scientists on platform best practices, infrastructure design, and MLOps
  • Lead comprehensive code reviews focusing on scalability, reliability, security, and maintainability
  • Design and deliver technical onboarding programs for new team members
  • Establish and champion engineering standards for ML infrastructure, deployment practices, and operational procedures
  • Create technical documentation and runbooks, and deliver internal training sessions on platform capabilities

Cross-Functional Collaboration

  • Work closely with data scientists to understand requirements and optimize workflows for model development and deployment
  • Collaborate with product engineering teams to integrate ML capabilities into customer-facing applications
  • Support research teams with infrastructure for experimenting with cutting-edge LLM techniques and architectures
  • Present technical solutions and platform roadmaps to leadership and cross-functional stakeholders

Required Qualifications

Experience & Education

  • Bachelor's degree in computer science, engineering, or related technical field (or equivalent experience)
  • 5+ years of software engineering experience with focus on infrastructure, platform engineering, or MLOps
  • 2+ years of hands-on experience with large-scale machine learning infrastructure and deployment
  • 1+ years of experience working with Large Language Models and transformer architectures

Technical Skills: Core Requirements

  • Proficient in Python; strong skills in Go, Rust, or Java preferred
  • Proven experience working with open source LLMs (Llama 2/3, Qwen, Mistral, Gemma, Code Llama, etc.)
  • Proficient in Kubernetes, including custom operators, Helm charts, and GPU scheduling
  • Deep expertise in Azure services (AKS, Azure ML, Container Registry, Storage, Networking)
  • Experience implementing and operating feature stores (Chronon, Feast, Tecton, Azure ML Feature Store, or custom solutions)
  • Hands-on experience with inference optimization using vLLM, TensorRT-LLM, Triton Inference Server, or similar

DevOps & Platform Skills

  • Advanced experience with Azure DevOps, GitHub Actions, Jenkins, or similar CI/CD platforms
  • Proficiency with Terraform, ARM templates, Pulumi, or CloudFormation
  • Deep understanding of Docker, container optimization, and multi-stage builds
  • Experience with Prometheus, Grafana, ELK stack, Azure Monitor, and distributed tracing
  • Knowledge of SQL and NoSQL databases, data warehousing, and vector databases

Leadership & Soft Skills

  • Demonstrated track record of mentoring engineers and leading technical initiatives
  • Experience leading design reviews focusing on compliance, performance, and reliability
  • Excellent ability to explain complex technical concepts to diverse audiences
  • Strong analytical and troubleshooting skills for complex distributed systems
  • Experience managing cross-functional technical projects and coordinating with multiple stakeholders

Preferred Qualifications

Advanced Experience

  • Master's degree in computer science, machine learning, or related field
  • 6+ years of platform engineering or infrastructure experience
  • Experience with Staff Engineer or Tech Lead roles in ML/AI organizations
  • Background in distributed systems and high-performance computing
  • Open-source contributions to ML infrastructure projects or LLM frameworks

Specialized Skills

  • Hands-on experience with Azure, AWS (SageMaker, EKS), and/or GCP (Vertex AI, GKE)
  • Experience with specialized hardware (A100s, H100s, TPUs, TEEs) and optimization
  • Experience with RLHF and LLM fine-tuning workflows
  • Experience with Milvus, Pinecone, Weaviate, Qdrant, or similar vector storage solutions
  • Deep experience with MLflow, Kubeflow, DataRobot, or similar platforms

Industry Knowledge

  • Understanding of AI safety principles, model governance, and regulatory compliance
  • Background in regulated industries with understanding of data privacy requirements
  • Experience supporting ML research teams and academic partnerships
  • Deep understanding of GPU optimization, memory management, and high-throughput systems

Annual Salary

$105,000-$300,000 per year

GEICO Pledge

Great Company: GEICO protects people when they need it most and constantly evolves to meet their needs.
Great Careers: We offer personalized development programs, training, certifications, and coaching.
Great Culture: Inclusive, collaborative culture rooted in integrity and a bias for action.
Great Rewards: Competitive compensation, 401K, performance incentives, and benefits like mental health and family assistance.

Equal Employment Opportunity Statement

The equal employment opportunity policy of the GEICO Companies provides for a fair and equal employment opportunity for all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability or genetic information, in compliance with applicable federal, state and local law. GEICO hires and promotes individuals solely on the basis of their qualifications for the job to be filled.

GEICO reasonably accommodates qualified individuals with disabilities to enable them to receive equal employment opportunity and/or perform the essential functions of the job, unless the accommodation would impose an undue hardship to the Company. This applies to all applicants and associates. GEICO also provides a work environment in which each associate is able to be productive and work to the best of their ability. We do not condone or tolerate an atmosphere of intimidation or harassment. We expect and require the cooperation of all associates in maintaining an atmosphere free from discrimination and harassment with mutual respect by and for all associates and applicants.



  • New York, NY, United States CData Software Full time

    Senior Software Engineer (C++, Linux) C++ development skills at the "Application Level" in either Linux or Windows Skaneateles NY 13153 (Remote 100% Possible for solid consultant anywhere in New York Only) Max - $60/hr to 62/hr C2C. What you'll be doing: As a Senior Software Engineer, you will participate in the research and development of advanced medical...


  • New York, United States Nifty Software e.U. Full time

    rePurpose is the Leading Packaging Sustainability and Compliance Platform for CPG brands to streamline EPR compliance and make credible sustainability claims. At rePurpose, we believe a world free of plastic waste is achievable within our lifetime. To accelerate our impact, we are hiring a Senior Software Engineer with full-stack experience to support our...


  • New York, United States Truelogic Software Full time

    Join to apply for the Senior Software Engineer - Hospitality role at Truelogic Software. About Truelogic At Truelogic we are a leading provider of nearshore staff augmentation services headquartered in New York. For over two decades, we’ve been delivering top-tier technology solutions to companies of all sizes, from innovative startups to industry leaders,...


  • New York, United States Python Software Foundation Full time

    As a Senior Software Engineer at Known, you can expect to work across various projects. You’ll join small, highly focused teams where you’ll have the opportunity to significantly influence the direction of our products, team practices, and the company’s broader technical culture! On the backend, you'll work on distributed systems, API endpoints, event-driven...

