| Job Description: |
Must Have Technical/Functional Skills
• Strong expertise in AI Ops / MLOps / LLM Ops practices
• End-to-end model lifecycle management
• Advanced model monitoring, observability, and alerting frameworks
• Drift detection, performance tracking, and automated retraining
• CI/CD and DevSecOps for AI/ML systems
• Scalable deployment architectures for AI/ML and LLM systems
• AI governance, risk, security, and compliance frameworks
• AI platform engineering and operational tooling experience
• Performance optimization (latency, cost, scalability) for AI workloads
• Strong experience in automation of AI operations workflows
• Data pipeline integration and ML infrastructure management
• Cross-functional collaboration with engineering, data, and platform teams
Roles & Responsibilities
• Define and implement AI Ops / MLOps / LLM Ops strategy for enterprise AI platforms
• Manage the end-to-end AI operations lifecycle (deployment, monitoring, scaling, optimization)
• Establish model monitoring, observability, and alerting frameworks for production AI systems
• Implement model lifecycle management (versioning, deployment, retraining, rollback, drift detection)
• Define and track AI Ops KPIs (performance, reliability, incident reduction, automation efficiency)
• Ensure high availability, scalability, and performance of AI systems in production
• Drive adoption of CI/CD and DevSecOps practices for AI/ML systems
• Implement governance, risk, security, and compliance controls for AI systems
• Collaborate with AI engineering, data, and platform teams for seamless operations
• Manage incident response, root-cause analysis, and continuous improvement for AI systems
• Optimize cost, latency, and resource utilization of AI workloads
• Drive automation of AI operations processes and workflows
Role Descriptions: AI Ops Leader
Skills: AI for Leadership
Experience Required: 10 & Above |