Day 7 - From Model to API: How Companies Actually Deploy ML in Production

Big Picture First
Model serving =
👉 Deploying an ML model like a production microservice
👉 Exposing it via API
👉 Scaling it
👉 Monitoring it
👉 Versioning it
👉 Rolling it out safely
Exactly what DevOps already does for apps.
The difference is:
Instead of deploying only code, we deploy code + model artifacts + data-dependent behavior.
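To make "code + model artifacts + data" concrete, here is a minimal sketch of a release record that pins all three together. The function name, field names, and sample values are hypothetical, chosen only for illustration. The idea is that a deployment is identified by more than a git commit.

```python
import hashlib
import json


def build_release_record(code_version: str, model_bytes: bytes, training_data_tag: str) -> dict:
    """Describe one deployable unit: code + model artifact + data snapshot."""
    return {
        "code_version": code_version,
        # Hashing the artifact means two different model files never share an ID.
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        "training_data_tag": training_data_tag,
    }


if __name__ == "__main__":
    # All values below are made up for the example.
    record = build_release_record("git:abc1234", b"fake-model-weights", "transactions-snapshot-1")
    print(json.dumps(record, indent=2))
```

If any of the three fields changes, it is a new release, even when the application code is untouched.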
1️⃣ Flask on VM = “Classic App Deployment”
This is like deploying a web app the old-school DevOps way.
You install Python on a VM
Run a Flask app
Expose REST endpoint
Put a load balancer in front
Scale by adding more VMs
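The steps above can be sketched in a few lines. To keep the example dependency-free, this is a plain WSGI callable rather than a Flask app; Flask apps are WSGI callables too, so Gunicorn can serve either. The `score` function is a hypothetical stand-in for a real model, and the `features` payload shape is an assumption.

```python
import json
from io import BytesIO


def score(features):
    """Stand-in for model.predict(); a real service would load a trained model once at startup."""
    return {"fraud_score": min(1.0, sum(features) / 100.0)}


def respond(start_response, status, payload):
    """Send a JSON response with the given HTTP status line."""
    start_response(status, [("Content-Type", "application/json")])
    return [json.dumps(payload).encode("utf-8")]


def app(environ, start_response):
    """WSGI callable exposing POST /predict."""
    if environ.get("PATH_INFO") != "/predict" or environ.get("REQUEST_METHOD") != "POST":
        return respond(start_response, "404 Not Found", {"error": "not found"})
    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        payload = json.loads(environ["wsgi.input"].read(length))
        features = [float(x) for x in payload["features"]]
    except (ValueError, KeyError, TypeError):
        return respond(start_response, "400 Bad Request", {"error": "expected a JSON object with a 'features' list"})
    return respond(start_response, "200 OK", score(features))


def call(path, method, body=b""):
    """Tiny in-process client so the sketch can be exercised without starting a server."""
    status = {}
    environ = {
        "PATH_INFO": path,
        "REQUEST_METHOD": method,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": BytesIO(body),
    }
    chunks = app(environ, lambda s, h: status.update(code=s))
    return status["code"], json.loads(b"".join(chunks))


if __name__ == "__main__":
    print(call("/predict", "POST", b'{"features": [10, 20]}'))
```

Assuming the file is saved as `serve.py` (a hypothetical name), running `gunicorn serve:app` on each VM would serve it, with the load balancer pointing at those VMs.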
DevOps analogy
👉 This is like deploying a Node.js or Java API directly on EC2/VMs.
Same problems:
OS patching
scaling lag
manual capacity planning
instance drift
configuration snowflakes
Example
A startup builds a fraud detection API.
They:
spin up 2 VMs
run Flask + Gunicorn
load the ML model in memory
expose /predict
add an AWS load balancer
autoscale VMs when CPU spikes
It works…
But:
VM startup takes minutes
scaling is slow
upgrades are risky
debugging infra takes time
When this makes sense
✅ POC
✅ small traffic
✅ single team
✅ simple environment
✅ GPU-heavy workloads needing host control
Manager takeaway
This is 2015-style DevOps infrastructure.
Works.
Simple.
But not cloud-native.
2️⃣ Containers + Kubernetes = “Modern DevOps Microservices”
This is the containerized version of the same idea.
You Dockerize the model server
Deploy it to Kubernetes
Use the Horizontal Pod Autoscaler (HPA) to autoscale pods
Expose with Ingress
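How does the HPA decide how many pods to run? The Kubernetes docs describe the core rule as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with no change when the ratio is within a tolerance (0.1 by default). Here is a rough sketch of that rule. The function name and the min/max defaults are illustrative assumptions, and the real controller also handles details such as unready pods and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA scaling rule for a single metric."""
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go outside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

For a model server, CPU is not always the best signal. Request rate or latency can be a better trigger, but the arithmetic stays the same.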
DevOps analogy
👉 Exactly like deploying microservices on Kubernetes.
Nothing new conceptually.
Same workflows:
CI builds image
push to registry
deploy via Helm
rolling updates
autoscaling
observability
service mesh
canary deploys
The ML model just becomes another microservice.
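Canary deploys are the item on that list that matters most for models, so here is a small sketch of the idea: send a fixed percentage of requests to the new model version and the rest to the stable one. In a real cluster the ingress or service mesh does the splitting. The hash-based bucketing below is an assumption for illustration (it keeps each request ID pinned to one version), and `pick_version` is a hypothetical name.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a request to 'canary' or 'stable' based on a stable hash bucket (0-99)."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


if __name__ == "__main__":
    routed = [pick_version(f"req-{i}", canary_percent=10) for i in range(10_000)]
    share = routed.count("canary") / len(routed)
    print(f"canary share: {share:.1%}")
```

If the canary's error rate or prediction quality looks off, set the percentage back to 0. If it looks good, raise it step by step to 100.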
Example
An e-commerce company deploys recommendation models.
Each model runs in its own container.
Traffic increases during sales events.
Kubernetes automatically:
scales pods
routes traffic
replaces failed pods
manages rollout
DevOps team treats it like any other production service.
Why this is powerful
Containers solve:
portability
reproducibility
fast scaling
infra consistency
Kubernetes adds:
resilience
rollout strategies
resource scheduling
GPU scheduling
cost control
Manager takeaway
This is DevOps-native ML deployment.
If your org already runs Kubernetes → this is the natural extension.
3️⃣ Amazon SageMaker = “Managed DevOps”
This is like using a fully managed platform instead of running your own cluster.
You don’t manage servers.
You don’t manage Kubernetes.
You deploy models via AWS APIs.
AWS handles:
provisioning
scaling
monitoring
model versioning
endpoints
rollbacks
DevOps analogy
👉 Like using Heroku / Cloud Run / Lambda instead of managing servers.
You trade flexibility for simplicity.
Example
A fintech company deploys loan approval models.
They:
upload model to S3
register it
create SageMaker endpoint
enable autoscaling
enable monitoring
No cluster management.
Just policy + automation.
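The flow above maps to three SageMaker API calls: `create_model`, `create_endpoint_config`, and `create_endpoint`. Below is a hedged sketch. It treats "register it" as `create_model` (the Model Registry is a separate feature), and every name, URI, and instance type is a placeholder. So it runs without AWS credentials, it uses a fake client that only records calls; in a real account you would pass `boto3.client("sagemaker")` instead.

```python
def deploy_model(sm_client, name, image_uri, model_s3_uri, role_arn,
                 instance_type="ml.m5.large", instance_count=1):
    """Create a model, an endpoint config, and an endpoint; return the endpoint name."""
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_s3_uri},
        ExecutionRoleArn=role_arn,
    )
    sm_client.create_endpoint_config(
        EndpointConfigName=f"{name}-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    sm_client.create_endpoint(
        EndpointName=f"{name}-endpoint",
        EndpointConfigName=f"{name}-config",
    )
    return f"{name}-endpoint"


class RecordingClient:
    """Fake client that records calls instead of talking to AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, operation):
        def record(**kwargs):
            self.calls.append((operation, kwargs))
            return {}
        return record


if __name__ == "__main__":
    client = RecordingClient()  # swap for boto3.client("sagemaker") in a real account
    deploy_model(client, "loan-approval", "<inference-image-uri>",
                 "s3://example-bucket/model.tar.gz", "<execution-role-arn>")
    for operation, _ in client.calls:
        print(operation)
```

Autoscaling and monitoring are left out here. Endpoint autoscaling is configured separately through Application Auto Scaling, so it would be an extra step after the endpoint exists.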
When this shines
✅ AWS-heavy org
✅ fast-moving teams
✅ compliance requirements
✅ minimal infra ops
✅ strong audit needs
Trade-off
Less control
Vendor lock-in
Higher cost for the managed service
But huge productivity gain.
Manager takeaway
This is platform engineering for ML.
Buy instead of build.
4️⃣ KServe = “Kubernetes but ML-native”
KServe is a model-serving layer that runs on top of Kubernetes, with ML-specific behavior built in.
Instead of writing deployments manually, you declare:
“This is my model”
KServe handles:
model loading
autoscaling
canary rollouts
explainability
scale-to-zero
inference routing
DevOps analogy
👉 Like Argo Rollouts + HPA + CI/CD + service mesh, but specialized for ML.
It abstracts ML serving complexity the same way Kubernetes abstracts the machines underneath your containers.
Example
A large enterprise runs 200 ML models.
They don’t want:
200 custom deployments.
They use KServe.
Each model becomes a Kubernetes resource:
InferenceService
Platform team manages the framework.
Data teams deploy models safely.
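To show what "each model becomes a Kubernetes resource" looks like, here is a sketch that builds an `InferenceService` manifest as a Python dict and prints it as JSON. The field layout follows the KServe `serving.kserve.io/v1beta1` API as I understand it; the model name, storage URI, and defaults are made-up examples, and `inference_service` is a hypothetical helper.

```python
import json


def inference_service(name, storage_uri, model_format="sklearn",
                      min_replicas=0, canary_percent=None):
    """Build a KServe InferenceService manifest for one model."""
    predictor = {
        # minReplicas of 0 asks for scale-to-zero when the model is idle.
        "minReplicas": min_replicas,
        "model": {
            "modelFormat": {"name": model_format},
            "storageUri": storage_uri,
        },
    }
    if canary_percent is not None:
        # Share of traffic sent to the newest revision during a rollout.
        predictor["canaryTrafficPercent"] = canary_percent
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {"predictor": predictor},
    }


if __name__ == "__main__":
    manifest = inference_service("churn-model", "s3://example-bucket/churn/v2",
                                 canary_percent=10)
    print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, a platform team could generate 200 of these from a simple list of models instead of hand-writing 200 deployments.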
Why teams love it
standardization
governance
repeatability
platform-level control
cost optimization
ML-specific rollout strategies
Manager takeaway
This is Kubernetes platform engineering for ML.
Best for orgs scaling ML seriously.
Simple mental model
Think of it as infrastructure maturity levels:
Level 1 → VM Flask
“Manual DevOps”
Level 2 → Containers + Kubernetes
“Cloud-native DevOps”
Level 3 → SageMaker
“Managed platform DevOps”
Level 4 → KServe
“Enterprise ML platform DevOps”
Real-world comparison
Imagine deploying a food delivery backend:
VM Flask = renting kitchen + cooking yourself
Kubernetes = automated smart kitchen
SageMaker = restaurant franchise system
KServe = global food platform infrastructure
All deliver food.
But scale and control differ.
Same with ML serving.




