Platform Ops

6 days ago


Frisco, United States ZipRecruiter Full time

Job Description

Candidates must work on-site at ExoSoft's client location in Frisco, TX or Bellevue, WA 3-5 days per week, depending on client needs. Valid prior job references are a must.

Employment Type

  • Corp-to-Corp
  • 1099 individuals
  • W2 full-time is negotiable.

Candidate Employment Status

  • US Citizen
  • US Green-Card / GC EAD
  • Valid H1B with at least 1 year of validity
  • No OPT-EAD

Job Description:

We are seeking a highly skilled and motivated Platform Ops Engineer to manage and optimize the operational performance of OpenAI GPT and related AI platforms across our enterprise. The role focuses on ensuring efficient, scalable, and secure operations for AI-based products and services, including large language models (LLMs) and custom GPT solutions.

The ideal candidate will have a blend of expertise in AI/ML model operations, DevOps, infrastructure management, and strong problem-solving skills to support mission-critical AI applications.

Key Responsibilities:

AI/LLM Operations & Monitoring:

  • Manage the day-to-day operations of OpenAI GPT models and other AI/ML platforms.
  • Implement automated monitoring and alerting for model performance, drift, and infrastructure health.
  • Ensure high availability, reliability, and scalability of deployed GPT models across the enterprise.
  • Optimize resource allocation and scaling for large model deployments, ensuring cost-effectiveness.

Automation & CI/CD Pipelines:

  • Design and maintain automated CI/CD pipelines for rapid deployment of AI/ML models.
  • Collaborate with data science and engineering teams to streamline model retraining and updates.
  • Integrate MLOps tools and platforms (e.g., Kubeflow, MLflow, or other AI orchestration tools).

Security & Compliance:

  • Implement and manage security policies around data privacy, model access, and infrastructure security.
  • Ensure AI platforms adhere to enterprise-level compliance and governance standards.
  • Identify and mitigate risks related to AI model vulnerabilities and data usage.

Infrastructure Management:

  • Administer cloud-based infrastructure (e.g., Azure) used for AI/ML model deployment.
  • Handle model orchestration, scaling, and optimization in containerized environments (Kubernetes, Docker).
  • Support hybrid cloud/on-prem infrastructure setups where required.

Collaboration & Stakeholder Management:

  • Work closely with data scientists, AI engineers, and product teams to align AI Ops activities with business goals.
  • Serve as the central point of contact for troubleshooting AI-related issues, providing root-cause analysis, and addressing performance bottlenecks.
  • Document operational workflows, best practices, and post-mortem analyses for continuous improvement.

Proactive Issue Resolution:

  • Use predictive analytics and anomaly detection techniques to prevent AI platform issues before they impact the business.
  • Lead incident management for AI platform disruptions and resolve operational issues in a timely manner.

Experience:

  • 5+ years of experience in AI Ops, MLOps, DevOps, or platform operations.
  • Proven expertise with AI/ML platforms, especially OpenAI GPT, other LLMs, or enterprise-grade AI services.

Technical Expertise:

  • Hands-on experience with cloud platforms, preferably Azure, for AI/ML deployments.
  • Proficiency with AI frameworks and libraries (TensorFlow, PyTorch, etc.).
  • Experience with CI/CD tools (Jenkins, GitLab, CircleCI) and infrastructure-as-code (Terraform, Ansible).
  • Familiarity with containerization (Docker, Kubernetes) and orchestration tools.
  • Understanding of AI model lifecycle management, versioning, and governance.

Skills:

  • Strong scripting/programming skills (Python, Bash, etc.).
  • Analytical and problem-solving mindset with the ability to address complex operational issues.
  • Excellent communication skills to engage with cross-functional teams and present solutions to stakeholders.
  • Experience in managing high-performance, distributed systems.

  • Platform Ops

    2 months ago


    Frisco, United States EXOSOFT TECH Inc. Full time

    Job Description: Candidates must work on-site at ExoSoft's client location in Frisco, TX or Bellevue, WA 3-5 days per week, depending on client needs. Valid prior job references are a must. Employment Type: Corp-to-Corp; 1099 individuals; W2 full-time is negotiable. Candidate Employment Status: US Citizen; US Green-Card / GC EAD; Valid H1B with minimum 1+Yr validity; No...


  • Frisco, Texas, United States Omni Inclusive Full time

    About the Role: We are seeking a Platform Operations Leader to join our team at Omni Inclusive. As a key member of our Platform Operations department, you will play a critical role in establishing roadmaps, designs, and implementing Platform Ops (Artificial Intelligence for IT Operations); for private cloud, and machine learning capabilities to automate and...

  • Technology Manager

    2 months ago


    Frisco, United States Comerica Full time

    Technology Manager, Enterprise Platform Engineering, Document Management & Robotics. The Technology Manager provides thought leadership to deliver creative and efficient technical solutions. This role leads the development of strategic plans for products and/or initiatives. This role is also responsible for leading their resources to develop high level delivery...


  • Frisco, Texas, United States Omni Inclusive Full time

    Omni Inclusive. Job Description: We are looking for a highly skilled Vault Deployment Expert to join our team. The successful candidate will be responsible for designing, deploying, and sustaining a vault platform that prioritizes reliability and scalability. Requirements: Deep understanding of cloud computing principles, including virtualization,...

  • Technology Manager

    2 weeks ago


    Frisco, United States Comerica Full time

    Technology Manager, Enterprise Platform Engineering, Document Management & Robotics. Make sure to read the full description below, and please apply immediately if you are confident you meet all the requirements. The Technology Manager provides thought leadership to deliver creative and efficient technical solutions. This role leads the development of strategic...

  • Technology Manager

    4 weeks ago


    Frisco, United States Comerica Full time

    Technology Manager, Enterprise Platform Engineering, Document Management & Robotics. The Technology Manager provides thought leadership to deliver creative and efficient technical solutions. This role leads the development of strategic plans for products and/or initiatives. This role is also responsible for leading their resources to develop high level delivery...

  • Security Engineer

    5 months ago


    Frisco, United States Omni Inclusive Full time

    Design, deploy and sustain a vault platform that prioritizes reliability and scalability Required Skills Deep understanding of cloud computing principles, including virtualization, containerization, microservices, and serverless computing; Risk Management, RHCOS security, container security, Kubernetes security, IAM security, network security, auditing,...

  • Security Engineer(API

    2 months ago


    Frisco, United States Omni Inclusive Full time

    Responsible for analysis, design and implementation coordination for tool and service designs within the cloud security & identity domain. Understanding of cloud computing principles, including virtualization, containerization, microservices, and serverless computing; Risk Management, RHCOS security, container security, Kubernetes security, IAM security,...


  • Frisco, Texas, United States Omni Inclusive Full time

    Omni Inclusive is seeking a highly skilled Enterprise AI and Security Architect to lead the implementation of Platform Ops (Artificial Intelligence for IT Operations) in private cloud and machine learning capabilities. The successful candidate will work with collaborators to understand the context, develop strategies, and ensure delivery is scalable,...

  • Security Engineer(IAM)

    2 months ago


    Frisco, United States Omni Inclusive Full time

    The role is to establish the roadmaps, designs, and supports the implementation of Platform Ops (Artificial Intelligence for IT Operations); for private cloud, and machine learning capabilities to automate and streamline operational workflows. The successful candidate will be comfortable working with collaborators to understand the whole context, develop...


  • Frisco, United States Perfict Global, Inc. Full time

    About Us: Perfict Global is a leading IT consulting services provider focused on providing innovative and successful business workforce solutions to Fortune 500 companies. Our trained and experienced professionals constantly strive to bring together the best technologies available to manage client's complex business and technology, participate in...


  • Frisco, United States Futran Tech Solutions Pvt. Ltd. Full time

    Job Title: Senior Machine Learning Engineer Mandatory Experience: Model Deployment Model Monitoring ML Ops Vertex AI Work Location Frisco, TX (Remote Considered) Job Description: Senior Machine Learning Engineer This position will collaborate closely with stakeholders, including data scientists and analysts, to understand the organization's data and...


  • Frisco, United States HealthTexas Provider Network Full time

    Baylor Scott & White Health is seeking a Board Certified/Board Eligible General Surgeon to help expand our Acute Care and Trauma Surgery service line at our brand-new hospital campus located in Frisco, Texas. This is an employed career opportunity with a generous benefits package that offers work-life balance, a competitive salary, productivity bonus, moving...




  • Frisco, United States Redwood Software Full time

    It's fun to work in a company where people truly BELIEVE in what they're doing! We're committed to bringing passion and customer focus to the business. OUR MISSION At Redwood Software we unleash human potential. We empower our customers with lights-out automation for their mission-critical business processes. Redwood Software is the leader in full stack...

  • Technology Manager

    2 weeks ago


    Frisco, United States Comerica Full time

    Technology Manager - Mainframe Applications. Are you the right candidate for this opportunity? Make sure to read the full description below. The Technology Manager provides thought leadership to deliver creative and efficient technical solutions. This role leads the technical development and support of the core mainframe products and/or initiatives. This role is...


  • Frisco, TX, United States HealthTexas Provider Network Full time

    Baylor Scott & White Health is seeking a Board Certified/Board Eligible General Surgeon to help expand our Acute Care and Trauma Surgery service line at our brand-new hospital campus located in Frisco, Texas. This is an employed career opportunity with a generous benefits package that offers work-life balance, a competitive salary, productivity bonus, moving...

