Day 14 - KServe Implementation for Intent Classifier Model

Updated April 12, 2026

Part of series: MLOps

Introduction

Deploying ML models into production is where real challenges begin.

In this guide, we’ll walk through a complete end-to-end deployment of an Intent Classification model using KServe, without relying on Knative.

👉 This setup uses:

Plain Kubernetes (RawDeployment mode)
Scikit-learn model
Minimal infrastructure
Production-like approach

What You’ll Learn

By the end of this guide, you’ll be able to:

Train and package an ML model
Deploy it using KServe
Expose it through a Kubernetes Service
Perform real-time inference using the REST API

Architecture Overview

User → curl request → Kubernetes Service → KServe Predictor Pod → Model → Response

Step 1: Setup Environment

sudo apt update
sudo apt install python3.12-venv -y

python3 -m venv .venv
source .venv/bin/activate
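
The later steps also call git, kubectl, helm, curl, and jq, so it can save time to confirm they are on your PATH before going further. Here is a minimal sketch; `check_tools` is a hypothetical helper name, not something from the repository.

```shell
# Report every required command-line tool that is missing from PATH.
# Returns 0 when all tools are found, 1 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (the tools this guide relies on):
# check_tools git python3 kubectl helm curl jq
```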

Step 2: Clone and Train the Model

git clone https://github.com/sumanprasad007/Intent-classifier-model.git

cd Intent-classifier-model
git switch kserve

pip install -r requirements.txt
python3 model/train.py

Verify model artifact:

ls model/artifacts/
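
If you want the check to fail loudly instead of eyeballing the `ls` output, something like the sketch below could work. `check_artifact` is a hypothetical helper, and the file name in the usage comment is an assumption based on the release asset used later in this guide; use whatever name `ls` actually printed.

```shell
# Succeed only if the given file exists and is non-empty.
check_artifact() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Usage (file name is an assumption):
# check_artifact model/artifacts/intent_model.pkl
```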

Step 3: Install Cert Manager

KServe's admission webhooks are secured with TLS, and cert-manager issues the certificates they need, so install it before KServe.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml

Verify:

kubectl get pods -n cert-manager
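
The pods can take a little while to start. Rather than re-running `kubectl get pods`, you could block until the cert-manager Deployments report Available. This is a sketch: `wait_for_cert_manager` is a hypothetical wrapper and the 180s default timeout is an arbitrary choice.

```shell
# Block until every Deployment in the cert-manager namespace is Available,
# or fail once the timeout expires.
wait_for_cert_manager() {
  kubectl wait --for=condition=Available deployment --all \
    -n cert-manager --timeout="${1:-180s}"
}

# Usage:
# wait_for_cert_manager        # default 180s
# wait_for_cert_manager 300s   # custom timeout
```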

Step 4: Install KServe CRDs

kubectl create namespace kserve

helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd \
  --version v0.16.0 \
  -n kserve \
  --wait
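
To confirm the CRDs actually registered, you can wait for the InferenceService CRD to become Established. A minimal sketch, assuming the chart installs the CRD under its usual name `inferenceservices.serving.kserve.io`; `wait_for_kserve_crd` is a hypothetical helper.

```shell
# Wait until the API server has accepted the InferenceService CRD.
wait_for_kserve_crd() {
  kubectl wait --for=condition=Established \
    crd/inferenceservices.serving.kserve.io --timeout="${1:-60s}"
}

# Usage:
# wait_for_kserve_crd
```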

Step 5: Install KServe Controller (RawDeployment Mode)

👉 This avoids the need for Knative.

helm install kserve oci://ghcr.io/kserve/charts/kserve \
  --version v0.16.0 \
  -n kserve \
  --set kserve.controller.deploymentMode=RawDeployment \
  --wait

Verify:

kubectl get pods -n kserve
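
Running pods do not tell you which deployment mode the controller picked up. One way to check is to print the `deploy` entry of the controller's ConfigMap. This is a sketch that assumes the ConfigMap is named `inferenceservice-config` and lives in the `kserve` namespace, which may differ between KServe versions; `show_deploy_config` is a hypothetical helper.

```shell
# Print the deployment-mode section of the KServe controller config.
show_deploy_config() {
  kubectl get configmap inferenceservice-config -n kserve \
    -o jsonpath='{.data.deploy}'
}

# Usage:
# show_deploy_config
```

If the Helm flag took effect, the printed JSON should mention RawDeployment as the default mode.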

Step 6: Deploy the Intent Classifier Model

kubectl create namespace intent

cat <<EOF | kubectl apply -n intent -f -
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: intent-classifier
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "https://github.com/sumanprasad007/Intent-classifier-model/releases/download/4.0/intent_model.pkl"
      resources:
        requests:
          cpu: "100m"
          memory: "512Mi"
        limits:
          cpu: "1"
          memory: "1Gi"
EOF

Step 7: Verify Deployment

kubectl get inferenceservice intent-classifier -n intent
kubectl get pods -n intent

Expected:

intent-classifier-predictor-xxxxx   Running
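
The first start includes downloading the model, so the pod may not be Running right away. Instead of polling by hand, you could wait on the InferenceService's Ready condition. A sketch, with `wait_for_isvc` as a hypothetical helper and an arbitrary 300s default timeout:

```shell
# Wait until an InferenceService reports Ready.
# $1 = InferenceService name, $2 = namespace, $3 = optional timeout
wait_for_isvc() {
  kubectl wait --for=condition=Ready "inferenceservice/$1" \
    -n "$2" --timeout="${3:-300s}"
}

# Usage:
# wait_for_isvc intent-classifier intent
```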

Step 8: Expose the Model

kubectl -n intent port-forward svc/intent-classifier-predictor 8080:80
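
The port-forward command blocks the terminal, so either keep it running and open a second terminal, or run it in the background. The sketch below polls the model's V1 status endpoint until it reports ready. `wait_until_ready` is a hypothetical helper, and the exact shape of the status JSON is an assumption that may vary between KServe versions.

```shell
# Poll a KServe V1 model status URL until it reports ready.
# $1 = status URL, $2 = optional number of attempts (default 30)
wait_until_ready() {
  url=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    # Accept both `"ready": true` and `"ready": "True"` style responses.
    if curl -fs "$url" | grep -Eqi '"ready": *"?true'; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Usage: background the port-forward, poll, and stop it when done.
# kubectl -n intent port-forward svc/intent-classifier-predictor 8080:80 &
# pf_pid=$!
# wait_until_ready http://localhost:8080/v1/models/intent-classifier
# ... send requests ...
# kill "$pf_pid"
```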

Step 9: Perform Inference

curl -s -X POST http://localhost:8080/v1/models/intent-classifier:predict \
  -H "Content-Type: application/json" \
  -d '{"instances":["I want to cancel my subscription"]}' | jq

Expected Output

{
  "predictions": ["complaint"]
}
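
The `instances` field is a list, so you can classify several sentences in one request. Hand-writing JSON gets fragile once a sentence contains quotes, so here is a sketch that lets jq build the payload. It assumes jq 1.6 or newer (for `$ARGS`); `build_payload` and the second example sentence are hypothetical.

```shell
# Build a V1 predict payload from one or more sentences.
# jq handles the JSON quoting, so quotes inside a sentence are safe.
build_payload() {
  jq -nc '{instances: $ARGS.positional}' --args "$@"
}

# Usage:
# build_payload "I want to cancel my subscription" "Where is my order?" |
#   curl -s -X POST http://localhost:8080/v1/models/intent-classifier:predict \
#     -H "Content-Type: application/json" -d @- | jq
```

The response should then carry one prediction per input sentence, in the same order.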

⚠️ Common Pitfalls (and Fixes)

No pods created

👉 Ensure RawDeployment mode is enabled, and wait for the KServe controller pods in the kserve namespace to be up and running before applying the InferenceService
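
If pods still do not appear, the InferenceService status, the namespace events, and the controller logs usually say why. A rough diagnostic sketch follows; `diagnose_isvc` is a hypothetical helper, and the controller Deployment name `kserve-controller-manager` is an assumption about what the Helm chart creates.

```shell
# Collect the usual clues when an InferenceService produces no pods.
# $1 = InferenceService name, $2 = namespace
diagnose_isvc() {
  # Status conditions and any reconcile errors on the resource itself
  kubectl describe inferenceservice "$1" -n "$2"
  # Recent events in the namespace, oldest first
  kubectl get events -n "$2" --sort-by=.lastTimestamp
  # Last lines of the KServe controller log (Deployment name is an assumption)
  kubectl logs -n kserve deployment/kserve-controller-manager --tail=50
}

# Usage:
# diagnose_isvc intent-classifier intent
```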

Webhook errors

👉 Check that the cert-manager pods are running
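
A quick way to check both sides of the webhook setup is sketched below. It assumes the KServe chart creates its cert-manager Certificate in the `kserve` namespace; `check_webhook_certs` is a hypothetical helper.

```shell
# Show cert-manager pod status and the certificates issued for KServe.
check_webhook_certs() {
  # All cert-manager pods should be Running
  kubectl get pods -n cert-manager
  # The certificate backing the KServe webhook should show READY=True
  kubectl get certificate -n kserve
}

# Usage:
# check_webhook_certs
```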

Knative error

👉 Use:

--set kserve.controller.deploymentMode=RawDeployment
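
If KServe is already installed without that flag, running `helm install` again fails because the release name is taken. One option is to upgrade the existing release in place, sketched here with the same chart, version, and flag used in Step 5. `enable_raw_deployment` is a hypothetical wrapper.

```shell
# Re-apply the KServe chart to the existing release with RawDeployment enabled.
enable_raw_deployment() {
  helm upgrade kserve oci://ghcr.io/kserve/charts/kserve \
    --version v0.16.0 \
    -n kserve \
    --set kserve.controller.deploymentMode=RawDeployment \
    --wait
}

# Usage:
# enable_raw_deployment
```

InferenceServices created before the change may need to be deleted and re-applied to pick up the new mode.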

Key Takeaways

KServe supports multiple deployment modes
RawDeployment is ideal for:
- Simpler setups
- Learning environments
- Lightweight production use
You don’t always need:
- Knative
- Istio
- Complex networking


MLOps

Part 6 of 20

Practical MLOps series breaking down how ML systems work in production — from data pipelines to deployment, monitoring, and retraining. No buzzwords, just real-world MLOps concepts explained simply for engineers and data teams.
