Day 10 - Best Practices to Deploy ML Models Using Kubernetes



We will take the Intent Classifier model, containerize it, deploy it on an AWS EKS cluster, and expose it using an Ingress Controller. The goal of this exercise is to understand how ML models move from a local development environment to a scalable production platform.


Architecture Overview

The deployed system follows this architecture:

Client
   ↓
Ingress Controller (Traefik)
   ↓
Kubernetes Service
   ↓
Deployment (Pods)
   ↓
Gunicorn
   ↓
Flask API (/predict)
   ↓
ML Model

Key components used in this setup:

  • Docker (containerization)

  • Kubernetes Deployment

  • Kubernetes Service

  • Traefik Ingress Controller

  • AWS EKS Cluster

  • Gunicorn + Flask Model Server
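The Gunicorn to Flask hop in the diagram is a WSGI call: Gunicorn accepts the HTTP connection and invokes a Python callable with the request environment. The repository's server is a Flask app; the sketch below uses only the standard library to show that contract. The `fake_model` function and its keyword rule are hypothetical stand-ins for the trained classifier, not the project's code.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def fake_model(text):
    """Hypothetical stand-in for the trained intent classifier."""
    label = "complaint" if "cancel" in text.lower() else "question"
    return {"intent": label}


def app(environ, start_response):
    """WSGI callable: the kind of object a WSGI server such as Gunicorn invokes."""
    if environ["REQUEST_METHOD"] != "POST" or environ["PATH_INFO"] != "/predict":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
    body = json.dumps(fake_model(payload.get("text", ""))).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(wsgi_app, method, path, body=b""):
    """Invoke the app in-process, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(REQUEST_METHOD=method, PATH_INFO=path,
                   CONTENT_LENGTH=str(len(body)))
    environ["wsgi.input"] = io.BytesIO(body)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    chunks = wsgi_app(environ, start_response)
    return captured["status"], b"".join(chunks)


status, raw = call(app, "POST", "/predict",
                   b'{"text": "I want to cancel my subscription"}')
print(status, raw.decode("utf-8"))
```

A Flask application object is itself a WSGI callable, which is why Gunicorn can serve it by module and object name (for example `gunicorn app:app`, assuming the Flask object is named `app` inside `app.py`).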


Prerequisites

Before starting, ensure the following tools are installed and configured.

Required tools:

  • AWS CLI (configured with aws configure)

  • kubectl

  • eksctl

  • Docker

  • Helm

You will also need a Docker Hub account to push the container image.


Step 1 - Clone the Project Repository

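Clone the Intent Classifier repository, then change into it and switch to its Kubernetes branch. The branch name is not shown in this post; run git branch -r inside the repository to list the available branches.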

git clone https://github.com/iam-veeramalla/Intent-classifier-model.git

Step 2 - Intent Classifier Model

This project demonstrates a simple machine learning workflow.

The project includes:

  • Training a lightweight text classification model

  • Saving the trained model artifact

  • Serving predictions using a Flask API endpoint

The prediction endpoint is exposed as:

POST /predict

Running the Model Locally

Create a virtual environment and install dependencies.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Train the model:

python model/train.py

This will generate the model artifact:

model/artifacts/intent_model.pkl

Run the API server:

python app.py

The API will start on:

http://127.0.0.1:6000

Example Request

curl -X POST http://127.0.0.1:6000/predict \
-H "Content-Type: application/json" \
-d '{"text":"I want to cancel my subscription"}'

Example response:

{
  "intent": "complaint",
  "probabilities": {
    "complaint": 0.85,
    "question": 0.05
  }
}
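The same request can be made from Python, which is handy once you start scripting smoke tests. This is a minimal sketch using only the standard library; the helper names `build_predict_request` and `top_intent` are illustrative and not part of the repository. The request is only built here, not sent, so the snippet runs without the server.

```python
import json
import urllib.request


def build_predict_request(base_url, text):
    """Build the POST /predict request that the curl command sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_intent(response):
    """Return the label with the highest probability in a response body."""
    probabilities = response["probabilities"]
    return max(probabilities, key=probabilities.get)


request = build_predict_request("http://127.0.0.1:6000",
                                "I want to cancel my subscription")
print(request.get_method(), request.full_url)

# The example response shown above, reused as sample input.
sample = {"intent": "complaint",
          "probabilities": {"complaint": 0.85, "question": 0.05}}
print(top_intent(sample))

# To send it for real (requires the API server to be running):
# with urllib.request.urlopen(request, timeout=5) as resp:
#     print(json.load(resp))
```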

Step 3 - Build and Push the Docker Image

Next, we containerize the ML application.


Login to Docker Hub

docker login

Build the Docker Image

Replace <dockerhub-username> with your Docker Hub username.

docker build -t <dockerhub-username>/intent-classifier:latest .

Tag the Image

docker tag <dockerhub-username>/intent-classifier:latest <dockerhub-username>/intent-classifier:v1

Push the Image

Push the latest version:

docker push <dockerhub-username>/intent-classifier:latest

Push the versioned tag:

docker push <dockerhub-username>/intent-classifier:v1

Verify the Image

docker pull <dockerhub-username>/intent-classifier:latest
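If you rebuild often, the build, tag, and push steps above can be generated from one place so the image names never drift apart. The sketch below only constructs and prints the commands (a dry run); `docker_commands` is a hypothetical helper, and `your-username` is a placeholder.

```python
import shlex


def docker_commands(username, version):
    """Return the build/tag/push commands from this step as argument lists."""
    repo = f"{username}/intent-classifier"
    latest = f"{repo}:latest"
    versioned = f"{repo}:{version}"
    return [
        ["docker", "build", "-t", latest, "."],
        ["docker", "tag", latest, versioned],
        ["docker", "push", latest],
        ["docker", "push", versioned],
    ]


# Dry run: print the commands instead of executing them.
for command in docker_commands("your-username", "v1"):
    print(shlex.join(command))

# To execute them, pass each list to subprocess.run(command, check=True).
```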

Step 4 - Create a Kubernetes Cluster (EKS)

We will create an Amazon EKS cluster using eksctl.


Create Cluster

eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --version 1.32 \
  --nodegroup-name standard-workers \
  --node-type t3.medium \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 3 \
  --managed

This command automatically creates:

  • VPC

  • Subnets

  • EKS Control Plane

  • Managed Node Group

  • Required IAM roles
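One easy mistake with the node flags is a desired count outside the min/max range. The sketch below, with a hypothetical `eksctl_create_args` helper, rebuilds the same argument list and checks that constraint before anything is sent to AWS. Note that `--nodes-min` and `--nodes-max` only bound the node group size; they do not install an autoscaler by themselves.

```python
def eksctl_create_args(name, region, version, node_type,
                       nodes, nodes_min, nodes_max):
    """Build the eksctl arguments used above, validating the node counts."""
    if not nodes_min <= nodes <= nodes_max:
        raise ValueError(
            f"nodes={nodes} must lie between nodes_min={nodes_min} "
            f"and nodes_max={nodes_max}"
        )
    return [
        "eksctl", "create", "cluster",
        "--name", name,
        "--region", region,
        "--version", version,
        "--nodegroup-name", "standard-workers",
        "--node-type", node_type,
        "--nodes", str(nodes),
        "--nodes-min", str(nodes_min),
        "--nodes-max", str(nodes_max),
        "--managed",
    ]


args = eksctl_create_args("my-cluster", "us-east-1", "1.32",
                          "t3.medium", nodes=2, nodes_min=1, nodes_max=3)
print(" ".join(args))
```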


Configure kubeconfig

eksctl normally adds the new cluster to your kubeconfig on its own. If kubectl cannot reach the cluster, update kubeconfig manually.

aws eks update-kubeconfig --region us-east-1 --name my-cluster

Verify the cluster:

kubectl get nodes
kubectl get pods -n kube-system

Verify Cluster Details

List clusters:

eksctl get cluster

Describe cluster:

aws eks describe-cluster --name my-cluster --region us-east-1

Cleanup (Delete Cluster)

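To avoid unnecessary AWS costs, delete the cluster once you have finished the entire exercise, including the final test. Do not run this now if you plan to continue with the steps below.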

eksctl delete cluster --name my-cluster --region us-east-1

Step 5 - Kubernetes Manifests

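We now deploy the ML application to Kubernetes. Save each manifest below to its own file and apply it with kubectl apply -f <file>, starting with the namespace so the other resources have somewhere to live.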


Create Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: intent-namespace

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: intent-classifier
  namespace: intent-namespace
  labels:
    app: intent-classifier
spec:
  replicas: 2
  selector:
    matchLabels:
      app: intent-classifier
  template:
    metadata:
      labels:
        app: intent-classifier
    spec:
      containers:
        - name: intent-classifier
          image: <dockerhub-username>/intent-classifier:v1  # image pushed in Step 3
          ports:
            - containerPort: 6000
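A Deployment's selector must match the labels on its pod template, otherwise the API server rejects the manifest. The sketch below mirrors the relevant fields of the manifest above as a Python dict and checks that rule; `selector_matches_template` is a hypothetical helper, not a Kubernetes API.

```python
# The relevant fields of the Deployment above, as a plain dict.
deployment = {
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "intent-classifier"}},
        "template": {
            "metadata": {"labels": {"app": "intent-classifier"}},
            "spec": {
                "containers": [
                    {"name": "intent-classifier",
                     "ports": [{"containerPort": 6000}]}
                ]
            },
        },
    }
}


def selector_matches_template(dep):
    """True if every selector label is present on the pod template."""
    selector = dep["spec"]["selector"]["matchLabels"]
    labels = dep["spec"]["template"]["metadata"].get("labels", {})
    return all(labels.get(key) == value for key, value in selector.items())


print(selector_matches_template(deployment))
```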

Service

apiVersion: v1
kind: Service
metadata:
  name: intent-classifier
  namespace: intent-namespace
spec:
  selector:
    app: intent-classifier
  ports:
    - port: 80
      targetPort: 6000
  type: ClusterIP
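The Service forwards port 80 to targetPort 6000 on any pod carrying the selector's labels. Unlike a Deployment selector mismatch, a wrong label or port here is not rejected; the Service simply ends up with no working endpoints. The sketch below checks both links with a hypothetical `service_reaches_pods` helper, using values copied from the manifests above.

```python
# The relevant fields of the Service above, as a plain dict.
service = {
    "spec": {
        "selector": {"app": "intent-classifier"},
        "ports": [{"port": 80, "targetPort": 6000}],
        "type": "ClusterIP",
    }
}

# Values taken from the Deployment's pod template.
pod_labels = {"app": "intent-classifier"}
container_ports = [6000]


def service_reaches_pods(svc, labels, ports):
    """True if the selector matches the pod labels and every targetPort is exposed."""
    selector = svc["spec"]["selector"]
    selects = all(labels.get(key) == value for key, value in selector.items())
    targets = {entry["targetPort"] for entry in svc["spec"]["ports"]}
    return selects and targets <= set(ports)


print(service_reaches_pods(service, pod_labels, container_ports))
```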

Step 6 - Install Traefik Ingress Controller

To expose our service externally, we install the Traefik Ingress Controller.


Add Helm Repository

helm repo add traefik https://helm.traefik.io/traefik
helm repo update

Install Traefik

helm install traefik traefik/traefik \
  --namespace traefik \
  --create-namespace

Verify Installation

kubectl get pods -n traefik

The Traefik controller should now be running.


Retrieve the Load Balancer Endpoint

kubectl get svc -n traefik

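Look for the service of type LoadBalancer and note the value in its EXTERNAL-IP column. On EKS this is typically an ELB DNS name, and it can show <pending> for a short while until AWS finishes provisioning the load balancer.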
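To pick the endpoint up in a script, you can read the Service as JSON (kubectl get svc ... -o json) and look under status.loadBalancer.ingress. The sketch below parses that structure; the sample hostname is made up, and `load_balancer_endpoint` is a hypothetical helper.

```python
def load_balancer_endpoint(service):
    """Return the first hostname or IP under status.loadBalancer.ingress, or None."""
    status = service.get("status", {})
    entries = status.get("loadBalancer", {}).get("ingress") or []
    for entry in entries:
        endpoint = entry.get("hostname") or entry.get("ip")
        if endpoint:
            return endpoint
    return None


# Hypothetical sample of the JSON returned for a provisioned LoadBalancer Service.
ready = {
    "status": {
        "loadBalancer": {
            "ingress": [{"hostname": "example-123.us-east-1.elb.amazonaws.com"}]
        }
    }
}
# While the load balancer is still pending, the ingress list is absent.
pending = {"status": {"loadBalancer": {}}}

print(load_balancer_endpoint(ready))
print(load_balancer_endpoint(pending))
```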


Step 7 - Create Ingress Resource

Finally, create an Ingress rule to route traffic to the model API.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: intent-classifier-ingress
  namespace: intent-namespace
spec:
  # ingressClassName supersedes the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: traefik
  rules:
    - host: <your-domain-or-elb-dns>
      http:
        paths:
          - path: /predict
            pathType: Prefix
            backend:
              service:
                name: intent-classifier
                port:
                  number: 80
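With pathType: Prefix, Kubernetes matches paths element by element after splitting on /, not as a raw string prefix. So /predict matches /predict and /predict/anything, but not /predictions. The function below is a simplified sketch of that rule for illustration; it is not the controller's actual implementation, and it ignores edge cases such as repeated slashes.

```python
def prefix_matches(rule_path, request_path):
    """Simplified Ingress 'Prefix' matching: compare path elements, not characters."""
    rule_parts = [part for part in rule_path.split("/") if part]
    request_parts = [part for part in request_path.split("/") if part]
    return request_parts[:len(rule_parts)] == rule_parts


for path in ["/predict", "/predict/", "/predict/batch", "/predictions", "/health"]:
    print(path, prefix_matches("/predict", path))
```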

Final Test

Once the ingress is configured, test the endpoint.

curl -X POST http://<your-domain-or-elb-dns>/predict \
-H "Content-Type: application/json" \
-d '{"text":"I want to cancel my subscription"}'

If everything is configured correctly, the API will return the predicted intent.


What We Achieved

By completing this exercise, you have implemented a production-style ML deployment pipeline.

You successfully:

  • trained a machine learning model

  • containerized the model server

  • pushed the image to Docker Hub

  • deployed the model on Kubernetes

  • exposed the service through an Ingress controller

  • enabled scalable inference using Kubernetes pods

This workflow demonstrates how machine learning systems transition from local development to production infrastructure.

MLOps

Part 10 of 20

Practical MLOps series breaking down how ML systems work in production — from data pipelines to deployment, monitoring, and retraining. No buzzwords, just real-world MLOps concepts explained simply for engineers and data teams.
