We will take the Intent Classifier model, containerize it, deploy it on an AWS EKS cluster, and expose it using an Ingress Controller. The goal of this exercise is to understand how ML models move from a local development environment to a scalable production platform.

Architecture Overview

The deployed system follows this architecture:

Client
   ↓
Ingress Controller (Traefik)
   ↓
Kubernetes Service
   ↓
Deployment (Pods)
   ↓
Gunicorn
   ↓
Flask API (/predict)
   ↓
ML Model

Key components used in this setup:

Docker (containerization)
Kubernetes Deployment
Kubernetes Service
Traefik Ingress Controller
AWS EKS Cluster
Gunicorn + Flask Model Server

Prerequisites

Before starting, ensure the following tools are installed and configured.

Required tools:

AWS CLI (configured with aws configure)
kubectl
eksctl
Docker
Helm

You will also need a Docker Hub account to push the container image.

Step 1 - Clone the Project Repository

Clone the Intent Classifier repository, then switch to its Kubernetes branch. The command below only performs the clone; check the repository for the branch name.

git clone https://github.com/iam-veeramalla/Intent-classifier-model.git

Step 2 - Intent Classifier Model

This project demonstrates a simple machine learning workflow.

The project includes:

Training a lightweight text classification model
Saving the trained model artifact
Serving predictions using a Flask API endpoint

The prediction endpoint is exposed as:

POST /predict

Running the Model Locally

Create a virtual environment and install dependencies.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Train the model:

python model/train.py

This will generate the model artifact:

model/artifacts/intent_model.pkl
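
Before building the container image, it can help to confirm that training actually produced the artifact. The following is a minimal sketch using only the standard library. The helper name `check_artifact` is hypothetical and not part of the repository, and it only checks that the file exists and is non-empty; it does not load the model.

```python
from pathlib import Path

# Path taken from the training step above.
ARTIFACT_PATH = Path("model/artifacts/intent_model.pkl")


def check_artifact(path: Path = ARTIFACT_PATH) -> int:
    """Return the artifact size in bytes, or raise if it is missing or empty.

    Hypothetical helper: it never unpickles the file, so it says nothing
    about whether the contents are a valid model.
    """
    if not path.is_file():
        raise FileNotFoundError(f"model artifact not found: {path}")
    size = path.stat().st_size
    if size == 0:
        raise ValueError(f"model artifact is empty: {path}")
    return size
```

Calling `check_artifact()` from the repository root after `python model/train.py` should return a positive byte count if training wrote the file.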

Run the API server:

python app.py

The API will start on:

http://127.0.0.1:6000

Example Request

curl -X POST http://127.0.0.1:6000/predict \
-H "Content-Type: application/json" \
-d '{"text":"I want to cancel my subscription"}'

Example response:

{
  "intent": "complaint",
  "probabilities": {
    "complaint": 0.85,
    "question": 0.05
  }
}
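
The same request can be sent from Python, which is handy for scripted checks. This is a sketch using only the standard library; the function names `build_predict_request` and `predict` are illustrative and do not come from the repository. It assumes the server accepts the JSON body shown in the curl example.

```python
import json
import urllib.request

# Local address from the section above; swap in the ingress host after deploying.
BASE_URL = "http://127.0.0.1:6000"


def build_predict_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST /predict request as the curl example."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text: str, base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (needs the server running)."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API server running, `predict("I want to cancel my subscription")` returns the decoded response as a dictionary.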

Step 3 - Build and Push the Docker Image

Next, we containerize the ML application and push the image to Docker Hub.

Login to Docker Hub

docker login

Build the Docker Image

Replace <dockerhub-username> with your Docker Hub username.

docker build -t <dockerhub-username>/intent-classifier:latest .

Tag the Image

docker tag <dockerhub-username>/intent-classifier:latest <dockerhub-username>/intent-classifier:v1

Push the Image

Push the latest version:

docker push <dockerhub-username>/intent-classifier:latest

Push the versioned tag:

docker push <dockerhub-username>/intent-classifier:v1

Verify the Image

docker pull <dockerhub-username>/intent-classifier:latest
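
The build, tag, and push commands in this step can be collected into a small script so they run in a fixed order. This is a sketch, not part of the repository: `docker_commands` and `run_all` are hypothetical helper names, and the script assumes you have already run `docker login` and are in the directory that contains the Dockerfile.

```python
import subprocess

IMAGE_NAME = "intent-classifier"


def docker_commands(username: str, version: str = "v1") -> list[list[str]]:
    """Return the build, tag, and push commands from this step as argv lists."""
    latest = f"{username}/{IMAGE_NAME}:latest"
    versioned = f"{username}/{IMAGE_NAME}:{version}"
    return [
        ["docker", "build", "-t", latest, "."],
        ["docker", "tag", latest, versioned],
        ["docker", "push", latest],
        ["docker", "push", versioned],
    ]


def run_all(commands: list[list[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `run_all(docker_commands("<dockerhub-username>"))` runs the four commands in sequence. Passing arguments as lists avoids shell quoting issues.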

Step 4 - Create a Kubernetes Cluster (EKS)

We will create an Amazon EKS cluster using eksctl.

Create Cluster

eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --version 1.32 \
  --nodegroup-name standard-workers \
  --node-type t3.medium \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 3 \
  --managed

This command automatically creates:

VPC
Subnets
EKS Control Plane
Managed Node Group
Required IAM roles
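
The node group sizing flags are related: the desired node count is expected to sit between the minimum and the maximum. The sketch below builds the same eksctl invocation from parameters and checks that relationship before anything runs. The function name `eksctl_create_args` is hypothetical; the defaults mirror the command above.

```python
def eksctl_create_args(
    name: str = "my-cluster",
    region: str = "us-east-1",
    version: str = "1.32",
    node_type: str = "t3.medium",
    nodes: int = 2,
    nodes_min: int = 1,
    nodes_max: int = 3,
) -> list[str]:
    """Return the eksctl command above as an argv list."""
    # Catch an inconsistent size locally instead of part-way through provisioning.
    if not nodes_min <= nodes <= nodes_max:
        raise ValueError(
            f"nodes={nodes} must be between nodes_min={nodes_min} and nodes_max={nodes_max}"
        )
    return [
        "eksctl", "create", "cluster",
        "--name", name,
        "--region", region,
        "--version", version,
        "--nodegroup-name", "standard-workers",
        "--node-type", node_type,
        "--nodes", str(nodes),
        "--nodes-min", str(nodes_min),
        "--nodes-max", str(nodes_max),
        "--managed",
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` once the values look right.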

Configure kubeconfig

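eksctl normally writes the cluster credentials to your kubeconfig when it creates the cluster. If kubectl cannot reach the cluster, update kubeconfig manually.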

aws eks update-kubeconfig --region us-east-1 --name my-cluster

Verify the cluster:

kubectl get nodes
kubectl get pods -n kube-system
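
To script the node check instead of reading the table by eye, you can parse the JSON form of the same command. This sketch assumes `kubectl` is on the PATH and pointed at the new cluster; the function names are illustrative. It reads the `Ready` condition that each Node object reports in its status.

```python
import json
import subprocess


def ready_node_names(node_list: dict) -> list[str]:
    """Return the names of nodes whose Ready condition has status "True"."""
    ready = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True" for c in conditions):
            ready.append(node["metadata"]["name"])
    return ready


def fetch_node_list() -> dict:
    """Run kubectl and parse its JSON output (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

With the cluster from this step, `ready_node_names(fetch_node_list())` should list two node names once both workers have joined.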

Verify Cluster Details

List clusters:

eksctl get cluster

Describe cluster:

aws eks describe-cluster --name my-cluster --region us-east-1

Cleanup (Delete Cluster)

To avoid unnecessary AWS costs, delete the cluster once you have finished the remaining steps and tested the deployment.

eksctl delete cluster --name my-cluster --region us-east-1

Step 5 - Kubernetes Manifests

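We now deploy the ML application to Kubernetes. Save each manifest below to its own file and apply it with kubectl apply -f <file>, starting with the namespace. In the Deployment, replace <image-name> with the image pushed in Step 3, for example <dockerhub-username>/intent-classifier:v1.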

Create Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: intent-namespace

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: intent-classifier
  namespace: intent-namespace
  labels:
    app: intent-classifier
spec:
  replicas: 2
  selector:
    matchLabels:
      app: intent-classifier
  template:
    metadata:
      labels:
        app: intent-classifier
    spec:
      containers:
        - name: intent-classifier
          image: <image-name>
          ports:
            - containerPort: 6000

Service

apiVersion: v1
kind: Service
metadata:
  name: intent-classifier
  namespace: intent-namespace
spec:
  selector:
    app: intent-classifier
  ports:
    - port: 80
      targetPort: 6000
  type: ClusterIP
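
The Service only reaches the pods if a few fields in the two manifests agree: the namespace, the selector against the pod labels, and the targetPort against the port the container listens on. The sketch below checks those fields on dictionaries that mirror the manifests above. `wiring_problems` is a hypothetical helper, and treating an undeclared containerPort as a problem is a deliberate simplification for this check.

```python
# The two manifests above, reduced to the fields that have to agree.
DEPLOYMENT = {
    "metadata": {"name": "intent-classifier", "namespace": "intent-namespace"},
    "spec": {
        "template": {
            "metadata": {"labels": {"app": "intent-classifier"}},
            "spec": {
                "containers": [
                    {"name": "intent-classifier", "ports": [{"containerPort": 6000}]}
                ]
            },
        }
    },
}

SERVICE = {
    "metadata": {"name": "intent-classifier", "namespace": "intent-namespace"},
    "spec": {
        "selector": {"app": "intent-classifier"},
        "ports": [{"port": 80, "targetPort": 6000}],
    },
}


def wiring_problems(service: dict, deployment: dict) -> list[str]:
    """Return a list of mismatches between a Service and a Deployment."""
    problems = []
    if service["metadata"]["namespace"] != deployment["metadata"]["namespace"]:
        problems.append("Service and Deployment are in different namespaces")

    template = deployment["spec"]["template"]
    pod_labels = template["metadata"]["labels"]
    for key, value in service["spec"]["selector"].items():
        if pod_labels.get(key) != value:
            problems.append(f"selector {key}={value} does not match the pod labels")

    container_ports = {
        port["containerPort"]
        for container in template["spec"]["containers"]
        for port in container.get("ports", [])
    }
    for port in service["spec"]["ports"]:
        if port["targetPort"] not in container_ports:
            problems.append(f"targetPort {port['targetPort']} is not a declared containerPort")
    return problems
```

For the manifests in this step, `wiring_problems(SERVICE, DEPLOYMENT)` returns an empty list: traffic sent to the Service on port 80 is forwarded to port 6000 on pods labeled app: intent-classifier.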

Step 6 - Install Traefik Ingress Controller

To expose our service externally, we install the Traefik Ingress Controller.

Add Helm Repository

helm repo add traefik https://helm.traefik.io/traefik
helm repo update

Install Traefik

helm install traefik traefik/traefik \
  --namespace traefik \
  --create-namespace

Verify Installation

kubectl get pods -n traefik

The Traefik pod should be listed with a STATUS of Running.

Retrieve the Load Balancer Endpoint

kubectl get svc -n traefik

Look for the service of type LoadBalancer and note the external endpoint.
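
The endpoint can also be read programmatically from the JSON form of the same command, `kubectl get svc -n traefik -o json`. The sketch below takes the parsed output and returns the external address of each LoadBalancer service. The function name is illustrative. On AWS the address usually appears as a hostname, so the code checks `hostname` first and falls back to `ip`.

```python
def load_balancer_endpoints(service_list: dict) -> list[str]:
    """Return external hostnames or IPs of LoadBalancer services in the list."""
    endpoints = []
    for service in service_list.get("items", []):
        if service.get("spec", {}).get("type") != "LoadBalancer":
            continue
        # The ingress list stays empty until the cloud load balancer is provisioned.
        ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        for entry in ingress:
            endpoint = entry.get("hostname") or entry.get("ip")
            if endpoint:
                endpoints.append(endpoint)
    return endpoints
```

An empty result right after installation usually means the load balancer is still being provisioned, so retry after a minute or two.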

Step 7 - Create Ingress Resource

Finally, create an Ingress rule to route traffic to the model API.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: intent-classifier-ingress
  namespace: intent-namespace
spec:
  ingressClassName: traefik
  rules:
    - host: <your-domain-or-elb-dns>
      http:
        paths:
          - path: /predict
            pathType: Prefix
            backend:
              service:
                name: intent-classifier
                port:
                  number: 80
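
The rule uses pathType: Prefix, which the Kubernetes Ingress documentation describes as matching path element by path element rather than character by character. The sketch below illustrates that description so you can see which URLs the /predict rule is meant to cover. It is not Traefik's code, and a controller's actual matching may differ in edge cases. `prefix_matches` is a hypothetical name.

```python
def _elements(path: str) -> list[str]:
    """Split a URL path on "/" and drop empty parts, so trailing slashes are ignored."""
    return [part for part in path.split("/") if part]


def prefix_matches(rule_path: str, request_path: str) -> bool:
    """Return True if every element of the rule path prefixes the request path."""
    rule = _elements(rule_path)
    request = _elements(request_path)
    return request[: len(rule)] == rule
```

Under this reading, /predict and /predict/batch fall under the rule, while /predictions does not, because "predictions" is a different path element from "predict".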

Final Test

Once the ingress is configured, test the endpoint.

curl -X POST http://<your-domain-or-elb-dns>/predict \
-H "Content-Type: application/json" \
-d '{"text":"I want to cancel my subscription"}'

If everything is configured correctly, the API will return the predicted intent.
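
For a scripted smoke test, the decoded response body can be checked against the shape shown in the local example. This sketch assumes the deployed API returns the same `intent` and `probabilities` keys as that example and that the predicted intent appears among the probabilities. `check_prediction` is a hypothetical helper.

```python
def check_prediction(body: dict) -> str:
    """Return the predicted intent after basic sanity checks on the response."""
    intent = body.get("intent")
    probabilities = body.get("probabilities")
    if not isinstance(intent, str) or not isinstance(probabilities, dict):
        raise ValueError("response is missing 'intent' or 'probabilities'")
    if intent not in probabilities:
        raise ValueError(f"intent {intent!r} has no entry in 'probabilities'")
    for label, value in probabilities.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"probability for {label!r} is outside [0, 1]: {value}")
    return intent
```

A failure here points at the application or the routing rather than at the cluster itself, which narrows down where to look.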

What We Achieved

By completing this exercise, you have implemented a production-style ML deployment pipeline.

You successfully:

trained a machine learning model
containerized the model server
pushed the image to Docker Hub
deployed the model on Kubernetes
exposed the service through an Ingress controller
enabled scalable inference using Kubernetes pods

This workflow demonstrates how machine learning systems transition from local development to production infrastructure.
