An experienced Software Engineer is needed to work on the client's ML Ops team, which is responsible for managing model deployment and other operations activities:
Production Support & Monitoring: Provide ongoing monitoring and support of ML models, pipelines, and services to ensure stability and performance.
Deployment & Release Management: Manage CI/CD pipelines for ML model deployment, versioning, and production releases.
Vulnerability Remediation & Management: Identify, remediate, and track security vulnerabilities in ML infrastructure and dependencies.
Data & Model Governance: Maintain model lineage, audit trails, and compliance with data privacy and regulatory standards.
Incident & Change Management: Handle production incidents, perform root cause analysis, and manage approved changes to ML systems.
Mandatory Skills: ML Ops, Vertex AI, AWS SageMaker, Python, Airflow