Day 14 - KServe Implementation for Intent Classifier Model



Introduction

Deploying ML models into production is where the real challenges begin.

In this guide, we'll walk through a complete end-to-end deployment of an Intent Classification model using KServe, without relying on Knative.

👉 This setup uses:

  • Plain Kubernetes (RawDeployment mode)

  • Scikit-learn model

  • Minimal infrastructure

  • Production-like approach


What You’ll Learn

By the end of this guide, you’ll be able to:

  • Train and package an ML model

  • Deploy it using KServe

  • Expose it via a Kubernetes Service

  • Perform real-time inference using the REST API


Architecture Overview

User → curl request → Kubernetes Service → KServe Predictor Pod → Model → Response

Step 1: Setup Environment

sudo apt update
sudo apt install python3.12-venv -y

python3 -m venv .venv
source .venv/bin/activate

Step 2: Clone and Train the Model

git clone https://github.com/sumanprasad007/Intent-classifier-model.git

cd Intent-classifier-model
git switch kserve

pip install -r requirements.txt
python3 model/train.py

Verify model artifact:

ls model/artifacts/
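
Before deploying, it can help to confirm that the artifact actually loads and predicts on your machine. The following is a minimal sketch, not the repository's own tooling. It assumes the file is named intent_model.pkl (the name that appears in the release URL in Step 6), that it was written with pickle, and that it holds a scikit-learn pipeline accepting raw strings. If train.py saves with joblib instead, swap in joblib.load. Run it from the repository root inside the virtual environment.

```python
import pickle
from pathlib import Path

# Assumed location: the file name comes from the release URL used in Step 6,
# the directory is the one listed above. Adjust if train.py writes elsewhere.
MODEL_PATH = Path("model/artifacts/intent_model.pkl")


def load_model(path):
    """Unpickle the trained model. Only load pickle files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def predict_intents(model, texts):
    """Run the model on raw strings and return plain Python strings."""
    return [str(label) for label in model.predict(list(texts))]


if __name__ == "__main__":
    if MODEL_PATH.exists():
        model = load_model(MODEL_PATH)
        print(predict_intents(model, ["I want to cancel my subscription"]))
    else:
        print(f"No artifact found at {MODEL_PATH}")
```

If this fails locally, the KServe predictor pod will most likely fail in the same way, so it is cheaper to catch the problem here.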

Step 3: Install Cert Manager

KServe's admission webhooks are served over TLS, and the default installation relies on cert-manager to issue their certificates, so install it first.

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.yaml

Verify that the cert-manager pods are Running before continuing:

kubectl get pods -n cert-manager

Step 4: Install KServe CRDs

kubectl create namespace kserve

helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd \
  --version v0.16.0 \
  -n kserve \
  --wait

Step 5: Install KServe Controller (RawDeployment Mode)

👉 This avoids the need for Knative.

helm install kserve oci://ghcr.io/kserve/charts/kserve \
  --version v0.16.0 \
  -n kserve \
  --set kserve.controller.deploymentMode=RawDeployment \
  --wait

Verify:

kubectl get pods -n kserve

Step 6: Deploy the Intent Classifier Model

kubectl create namespace intent

cat <<EOF | kubectl apply -n intent -f -
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: intent-classifier
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "https://github.com/sumanprasad007/Intent-classifier-model/releases/download/4.0/intent_model.pkl"
      resources:
        requests:
          cpu: "100m"
          memory: "512Mi"
        limits:
          cpu: "1"
          memory: "1Gi"
EOF

Step 7: Verify Deployment

kubectl get inferenceservice intent-classifier -n intent
kubectl get pods -n intent

Expected:

intent-classifier-predictor-xxxxx   Running
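
A Running pod does not by itself mean the InferenceService is ready to serve. KServe reports readiness through a Ready condition in the resource's status. The sketch below shows one way to check it from Python. The helper names is_ready and fetch_isvc are illustrative, and fetch_isvc assumes kubectl is on your PATH and pointed at the right cluster.

```python
import json
import subprocess


def is_ready(isvc):
    """Return True if the InferenceService reports a Ready=True condition."""
    conditions = isvc.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def fetch_isvc(name, namespace):
    """Read the InferenceService as JSON via kubectl (must be on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "inferenceservice", name,
         "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For this guide's resources, the call would be is_ready(fetch_isvc("intent-classifier", "intent")). You could poll it in a loop before moving on to Step 8.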

Step 8: Expose the Model

kubectl -n intent port-forward svc/intent-classifier-predictor 8080:80

Step 9: Perform Inference

curl -s -X POST http://localhost:8080/v1/models/intent-classifier:predict \
  -H "Content-Type: application/json" \
  -d '{"instances":["I want to cancel my subscription"]}' | jq

Expected Output

{
  "predictions": ["complaint"]
}
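
The same request can be sent from application code. Here is a minimal Python sketch using only the standard library. It assumes the port-forward from Step 8 is still running on localhost:8080. The function names are illustrative, and the payload and response shapes mirror the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumes the Step 8 port-forward is active
MODEL_NAME = "intent-classifier"


def predict_url(base_url, model_name):
    """Build the V1 predict endpoint used in the curl example."""
    return f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"


def build_payload(texts):
    """Encode raw strings as a V1 'instances' request body."""
    return json.dumps({"instances": list(texts)}).encode("utf-8")


def parse_predictions(body):
    """Extract the 'predictions' list from a V1 response body."""
    return json.loads(body)["predictions"]


def predict(texts, base_url=BASE_URL, model_name=MODEL_NAME, timeout=10):
    """POST the texts to the predictor and return its predictions."""
    request = urllib.request.Request(
        predict_url(base_url, model_name),
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_predictions(response.read())
```

Calling predict(["I want to cancel my subscription"]) sends the same body as the curl command. Several texts can be passed in one call, since instances is a list.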

⚠️ Common Pitfalls (and Fixes)

No pods created

👉 Ensure RawDeployment mode is enabled, and wait for the KServe controller pods in the kserve namespace to be up and running before applying the InferenceService


Webhook errors

👉 Check that the cert-manager pods are running (kubectl get pods -n cert-manager)


Knative error

👉 Use:

--set kserve.controller.deploymentMode=RawDeployment

Key Takeaways

  • KServe supports multiple deployment modes

  • RawDeployment is ideal for:

    • Simpler setups

    • Learning environments

    • Lightweight production use

  • You don’t always need:

    • Knative

    • Istio

    • Complex networking

