Day 7 - From Model to API: How Companies Actually Deploy ML in Production


👋 Hello! I'm passionate about DevOps and have 1+ years of experience in the field. I'm proficient in a variety of cutting-edge technologies and always motivated to expand my knowledge and skills. Let's connect and grow together!

SKILLS:

🔹 Languages & Runtimes: Python, Shell Scripting, HCL, YAML 🔹 Cloud Technologies: AWS, Microsoft Azure, GCP 🔹 Infrastructure Tools: Docker, Terraform, AWS CloudFormation 🔹 Other Tools: Linux, Git and GitHub Actions, Jenkins, Jira, GitLab (beginner), Docker, AWS DevOps 🔹 Web Development: HTML, CSS, Bootstrap, Python, SQL

Job & Responsibilities:

🚀 Improved development efficiency by implementing CI/CD pipelines, resulting in a 30% reduction in deployment time on the test server. 🔒 Strengthened deployment and testing reliability by utilizing Docker containers and optimizing Dockerfile, reducing development issues on the test server by 20%. ⚙️ Automated S3 bucket log creation with Shell scripting, eliminating 100% of manual search and saving 2 hours per week. 📅 Scheduled EC2 instance start/stop using Lambda functions and Event Bridge, leading to a 25% decrease in infrastructure costs. 🔧 Utilized AWS, Linux, Python, Docker, Shell scripting, Terraform, Jenkins Pipelines, and automation to streamline workflows and improve overall system performance.

I'm very detail-oriented and possess strong written and verbal communication skills. As a high performer with a possibility mindset, I strive to solve problems using efficient approaches.

Let's Connect & Grow:

If you find my profile suitable for the role you are searching for, please feel free to reach out to me at sumanprasad9766@gmail.com.


Big Picture First

Model serving =

👉 Deploying an ML model like a production microservice

👉 Exposing it via API

👉 Scaling it

👉 Monitoring it

👉 Versioning it

👉 Rolling it out safely

Exactly what DevOps already does for apps.

The difference is:

Instead of deploying code alone, we deploy code + model artifacts + data behavior.
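
One way to picture "code + model artifacts + data behavior" is a release manifest that pins all three together. This is only a minimal sketch: the function name `build_release_manifest`, the field names, and the example values are hypothetical, not part of any specific tool.

```python
import hashlib


def build_release_manifest(code_version: str, model_bytes: bytes, data_snapshot: str) -> dict:
    """Describe one deployable release (all field names here are illustrative)."""
    return {
        # Git tag or commit of the serving code.
        "code_version": code_version,
        # Content hash of the model file, so a retrained model is a new release.
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        # Identifier of the dataset snapshot the model was trained on.
        "data_snapshot": data_snapshot,
    }


# Same code, retrained model: the manifest changes even though no code changed.
release_a = build_release_manifest("v1.4.0", b"model-weights-a", "2024-01-snapshot")
release_b = build_release_manifest("v1.4.0", b"model-weights-b", "2024-02-snapshot")
print(release_a != release_b)
```

The point of the sketch is that a model retrain alone produces a new release, which is the main difference from a code-only deployment.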


1️⃣ Flask on VM = “Classic App Deployment”

This is like deploying a web app the old-school DevOps way.

You install Python on a VM

Run a Flask app

Expose a REST endpoint

Put a load balancer in front

Scale by adding more VMs

DevOps analogy

👉 This is like deploying a Node.js or Java API directly on EC2/VMs.

Same problems:

  • OS patching

  • scaling lag

  • manual capacity planning

  • instance drift

  • configuration snowflakes

Example

A startup builds a fraud detection API.

They:

  • spin up 2 VMs

  • run Flask + Gunicorn

  • load the ML model in memory

  • expose /predict

  • add an AWS load balancer

  • autoscale VMs when CPU spikes
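
The `/predict` service in this setup can be sketched roughly as below. To keep the example dependency-free it is written as a plain WSGI callable using only the standard library rather than Flask. Flask apps are also WSGI apps, so Gunicorn would serve either one the same way. The `score` function is a hypothetical stand-in for a real trained model, and `call_app` is a helper added only for local testing.

```python
import io
import json


def score(features):
    """Hypothetical stand-in for a fraud model loaded in memory at startup."""
    amount = float(features.get("amount", 0.0))
    return max(0.0, min(amount / 10_000.0, 1.0))


def _respond(start_response, status, payload):
    body = json.dumps(payload).encode()
    headers = [("Content-Type", "application/json"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def app(environ, start_response):
    """WSGI entry point; a WSGI server such as Gunicorn can serve this callable."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        return _respond(start_response, "404 Not Found", {"error": "not found"})
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        features = json.loads(environ["wsgi.input"].read(length))
        result = {"fraud_score": score(features)}
    except (ValueError, TypeError, AttributeError):
        # Covers malformed JSON, non-object payloads, and non-numeric amounts.
        return _respond(start_response, "400 Bad Request", {"error": "invalid input"})
    return _respond(start_response, "200 OK", result)


def call_app(method, path, payload=b""):
    """Invoke the app in-process, without a server (testing helper only)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(payload)),
        "wsgi.input": io.BytesIO(payload),
    }
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


print(call_app("POST", "/predict", b'{"amount": 5000}'))
```

Every VM behind the load balancer runs its own copy of this process with its own copy of the model in memory.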

It works…

But:

  • VM startup takes minutes

  • scaling is slow

  • upgrades are risky

  • debugging infra takes time

When this makes sense

✅ POC

✅ small traffic

✅ single team

✅ simple environment

✅ GPU-heavy workloads needing host control

Manager takeaway

This is 2015-style DevOps infrastructure.

Works.

Simple.

But not cloud-native.


2️⃣ Containers + Kubernetes = “Modern DevOps Microservices”

This is the containerized version of the same idea.

You Dockerize the model server

Deploy it to Kubernetes

Use HPA to autoscale pods

Expose with Ingress
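
To make the HPA step concrete, here is a small sketch of the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. It is a simplification that ignores the tolerance band and stabilization windows the real controller applies, and the function name and default bounds are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA rule: scale in proportion to how far the metric is from target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the bounds configured on the autoscaler.
    return max(min_replicas, min(max_replicas, raw))


# 4 pods at 90% average CPU with a 60% target.
print(desired_replicas(4, 90, 60))
```

For a model server the metric is often CPU, but request rate or latency can be a better signal when inference is memory- or GPU-bound.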

DevOps analogy

👉 Exactly like deploying microservices on Kubernetes.

Nothing new conceptually.

Same workflows:

  • CI builds image

  • push to registry

  • deploy via Helm

  • rolling updates

  • autoscaling

  • observability

  • service mesh

  • canary deploys

The ML model just becomes another microservice.

Example

An e-commerce company deploys recommendation models.

Each model runs in its own container.

Traffic increases during sales events.

Kubernetes automatically:

  • scales pods

  • routes traffic

  • replaces failed pods

  • manages rollout

DevOps team treats it like any other production service.

Why this is powerful

Containers solve:

  • portability

  • reproducibility

  • fast scaling

  • infra consistency

Kubernetes adds:

  • resilience

  • rollout strategies

  • resource scheduling

  • GPU scheduling

  • cost control

Manager takeaway

This is DevOps-native ML deployment.

If your org already runs Kubernetes → this is the natural extension.


3️⃣ Amazon SageMaker = “Managed DevOps”

This is like using a fully managed platform instead of running your own cluster.

You don’t manage servers.

You don’t manage Kubernetes.

You deploy models via AWS APIs.

AWS handles:

  • provisioning

  • scaling

  • monitoring

  • model versioning

  • endpoints

  • rollbacks

DevOps analogy

👉 Like using Heroku / Cloud Run / Lambda instead of managing servers.

You trade flexibility for simplicity.

Example

A fintech company deploys loan approval models.

They:

  • upload model to S3

  • register it

  • create SageMaker endpoint

  • enable autoscaling

  • enable monitoring

No cluster management.

Just policy + automation.
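
As a rough sketch of the "deploy via AWS APIs" part, the function below builds the request parameters for the three SageMaker calls that stand up an endpoint: `create_model`, `create_endpoint_config`, and `create_endpoint`. It deliberately makes no AWS calls so it can run anywhere. The helper name, the naming scheme, the instance type default, and all example values (bucket, image URI, role ARN) are placeholders. Model registry, autoscaling, and monitoring are configured separately and are left out here.

```python
def sagemaker_requests(name, image_uri, model_data_url, role_arn,
                       instance_type="ml.m5.large", instance_count=1):
    """Build keyword arguments for the SageMaker endpoint-creation calls."""
    config_name = f"{name}-config"  # naming scheme is an assumption
    return {
        "create_model": {
            "ModelName": name,
            "PrimaryContainer": {
                "Image": image_uri,               # inference container image
                "ModelDataUrl": model_data_url,   # model archive in S3
            },
            "ExecutionRoleArn": role_arn,
        },
        "create_endpoint_config": {
            "EndpointConfigName": config_name,
            "ProductionVariants": [{
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }],
        },
        "create_endpoint": {
            "EndpointName": f"{name}-endpoint",
            "EndpointConfigName": config_name,
        },
    }


# All values below are placeholders, not real resources.
requests = sagemaker_requests(
    name="loan-approval-v1",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>",
    model_data_url="s3://example-bucket/models/loan-approval/model.tar.gz",
    role_arn="arn:aws:iam::<account>:role/<sagemaker-execution-role>",
)
print(list(requests))
```

With boto3 installed and credentials configured, each entry would be passed to the matching client method, for example `boto3.client("sagemaker").create_model(**requests["create_model"])`.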

When this shines

✅ AWS-heavy org

✅ fast-moving teams

✅ compliance requirements

✅ minimal infra ops

✅ strong audit needs

Trade-off

Less control

Vendor lock-in

Higher managed cost

But huge productivity gain.

Manager takeaway

This is platform engineering for ML.

Buy instead of build.


4️⃣ KServe = “Kubernetes but ML-native”

KServe is a model-serving layer that runs on top of Kubernetes and adds ML-specific behavior.

Instead of writing deployments manually, you declare:

“This is my model”

KServe handles:

  • model loading

  • autoscaling

  • canary rollouts

  • explainability

  • scale-to-zero

  • inference routing
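
The idea behind a canary rollout can be illustrated with a few lines of routing logic: send a small, fixed percentage of requests to the new model version and the rest to the stable one. This is a conceptual sketch only, not how KServe implements traffic splitting internally. The function name and the hash-based bucketing are assumptions chosen so that the same request ID always lands on the same version.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a request to 'canary' or 'stable' based on a stable hash bucket."""
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Roughly 10% of requests should reach the canary model.
routed = [pick_version(f"req-{i}", 10) for i in range(10_000)]
print(routed.count("canary") / len(routed))
```

If the canary's error rate or prediction quality looks worse than the stable version, the percentage goes back to 0. Otherwise it is raised step by step to 100.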

DevOps analogy

👉 Like Argo Rollouts + HPA + CI/CD + service mesh — but specialized for ML.

It abstracts ML complexity the same way Kubernetes abstracts containers.

Example

A large enterprise runs 200 ML models.

They don’t want:

200 custom deployments.

They use KServe.

Each model becomes a Kubernetes resource:

InferenceService

Platform team manages the framework.

Data teams deploy models safely.
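
As an illustration of "each model becomes a Kubernetes resource", the sketch below builds a minimal `InferenceService` manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The helper name and the example model name, namespace, format, and storage location are all placeholders.

```python
import json


def inference_service(name, namespace, model_format, storage_uri):
    """Build a minimal KServe InferenceService manifest (v1beta1 API)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": model_format},  # e.g. sklearn
                    "storageUri": storage_uri,              # where the model artifact lives
                }
            }
        },
    }


# Placeholder values; a real deployment would point at the team's own bucket.
manifest = inference_service(
    name="churn-model",
    namespace="ml-team-a",
    model_format="sklearn",
    storage_uri="s3://example-bucket/models/churn/v1",
)
print(json.dumps(manifest, indent=2))
```

With 200 models, the only things that change per model are these four values, which is why the approach standardizes well.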

Why teams love it

  • standardization

  • governance

  • repeatability

  • platform-level control

  • cost optimization

  • ML-specific rollout strategies

Manager takeaway

This is Kubernetes platform engineering for ML.

Best for orgs scaling ML seriously.


Simple mental model

Think of it as infrastructure maturity levels:

Level 1 → Flask on a VM

“Manual DevOps”

Level 2 → Containers + Kubernetes

“Cloud-native DevOps”

Level 3 → SageMaker

“Managed platform DevOps”

Level 4 → KServe

“Enterprise ML platform DevOps”


Real-world comparison

Imagine running a food delivery business:

Flask on a VM = renting a kitchen and cooking yourself

Kubernetes = automated smart kitchen

SageMaker = restaurant franchise system

KServe = global food platform infrastructure

All deliver food.

But scale and control differ.

Same with ML serving.


MLOps

Part 13 of 20

Practical MLOps series breaking down how ML systems work in production — from data pipelines to deployment, monitoring, and retraining. No buzzwords, just real-world MLOps concepts explained simply for engineers and data teams.
