Building an End-to-End ML Deployment Pipeline with MLflow, FastAPI, and Docker
Deploying machine learning models is more than just training: it's about tracking, versioning, serving, and monitoring. In this post, I'll walk you through how I built a production-ready ML pipeline using:
- MLflow for experiment tracking and model registry
- FastAPI for serving models via REST API
- MinIO for artifact storage (S3-compatible)
- Docker Compose for orchestration
Full source code: github.com/liviaerxin/mlops-fastapi-mlflow-minio
Project Overview
This project provides:
- A structured pipeline to log, register, and serve ML models
- Docker-based setup with MLflow, FastAPI, and MinIO
- Simple training and inference workflows
Project structure:
.
├── docker-compose.yml
├── inference-server/
├── mlflow-local-train-remote-register/
├── mlflow-server/
├── train.py
└── README.md
For full instructions, check the README.
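Since Docker Compose is what ties the services together, here is a minimal sketch of how a compose file for this kind of stack could look. The images, credentials, ports, and service wiring below are illustrative assumptions, not copied from the repo's docker-compose.yml, so check the actual file for the real configuration.

services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    ports:
      - "9000:9000"   # S3 API, also reachable from the host for local training/registration
  mlflow-server:
    build: ./mlflow-server
    environment:
      MLFLOW_S3_ENDPOINT_URL: http://minio:9000
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
    ports:
      - "5001:5000"   # MLflow UI/API exposed on host port 5001
    depends_on:
      - minio
  inference-server:
    build: ./inference-server
    environment:
      MLFLOW_SERVER: http://mlflow-server:5000
      MLFLOW_S3_ENDPOINT_URL: http://minio:9000
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
    ports:
      - "8000:8000"   # FastAPI inference endpoints
    depends_on:
      - mlflow-server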
Key Components
MLflow
MLflow is a platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment.
- Experiment tracking (mlflow.log_param, mlflow.log_metric)
- Model registration and versioning
- Stage promotion (Staging, Production)
NOTE: In this setup, the MLflow server runs remotely, inside the Docker Compose environment.
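As a quick illustration of the tracking API (the experiment name, parameter, and metric below are made up for the example):

import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5001")  # the remote MLflow server, as exposed on the host
mlflow.set_experiment("demo-experiment")          # hypothetical experiment name

with mlflow.start_run():
    mlflow.log_param("lr", 1e-3)         # hyperparameters
    mlflow.log_metric("accuracy", 0.93)  # evaluation results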
MinIO
An S3-compatible artifact store that MLflow uses to save model artifacts. For production, you might migrate to AWS S3 or GCS.
NOTE: In this setup, MinIO runs remotely, inside the Docker Compose environment.
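For reference, any client (the MLflow server or a local training script) reaches MinIO through MLflow's standard S3 settings. A minimal sketch, assuming MinIO's default credentials (your compose file may define different ones):

import os

os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://127.0.0.1:9000"  # point MLflow's S3 client at MinIO instead of AWS
os.environ["AWS_ACCESS_KEY_ID"] = "minioadmin"                   # MinIO access key (assumed default)
os.environ["AWS_SECRET_ACCESS_KEY"] = "minioadmin"               # MinIO secret key (assumed default)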
FastAPI
Simple REST API to serve ML models loaded from the MLflow registry.
Loads the remote model via:
mlflow.set_tracking_uri(os.environ.get("MLFLOW_SERVER", "http://127.0.0.1:5000"))
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{MODEL_STAGE}")
NOTE: In this setup, the FastAPI inference server runs remotely, inside the Docker Compose environment.
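To make the serving side concrete, here is a minimal sketch of an inference endpoint built around that loaded model. The route name, input schema, and default model name are illustrative assumptions, not the repo's actual API:

import os

import mlflow
import numpy as np
import pandas as pd
from fastapi import FastAPI

MODEL_NAME = os.environ.get("MODEL_NAME", "demo-model")    # hypothetical default name
MODEL_STAGE = os.environ.get("MODEL_STAGE", "Production")

mlflow.set_tracking_uri(os.environ.get("MLFLOW_SERVER", "http://127.0.0.1:5000"))
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{MODEL_STAGE}")

app = FastAPI()

@app.post("/predict")
def predict(features: list[list[float]]):
    # pyfunc models accept tabular input such as a pandas DataFrame
    preds = model.predict(pd.DataFrame(features))
    return {"predictions": np.asarray(preds).tolist()}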
Training and Registering a Model
NOTE: The mlflow-local-train-remote-register workflow runs locally on the host, outside the Docker Compose environment.
Run the training script:
python train.py
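What train.py does depends on your model; as a rough sketch, a local training step for a small PyTorch model might look like this (the architecture, data, and output file are placeholders, not the repo's actual script):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(256, 4)            # dummy features
y = torch.randint(0, 2, (256,))    # dummy labels

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")

torch.save(model, "model.pt")      # persisted locally for the registration step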
Then log and register it to a remote MLflow server:
python register-remote.py
Key code snippet:
import os
import mlflow.pytorch  # importing the submodule also makes the top-level mlflow module available

os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://127.0.0.1:9000"  # MinIO is exposed on host port 9000
mlflow.set_tracking_uri("http://127.0.0.1:5001")                # the MLflow server is exposed on host port 5001
model_info = mlflow.pytorch.log_model(model, artifact_path="model", registered_model_name=MODEL_NAME)
Bonus: Updating Models and Rolling Deployment
You can update models by:
- Logging a new version to MLflow
- Promoting it to the Production stage (see the sketch below)
- Using FastAPI logic to reload the latest version without downtime
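Promotion can be done from the MLflow UI or programmatically. A minimal sketch using the client API (the model name and version number are placeholders):

import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://127.0.0.1:5001")  # the remote MLflow server

client = MlflowClient()
client.transition_model_version_stage(
    name="demo-model",               # registered model name (placeholder)
    version=2,                       # version to promote (placeholder)
    stage="Production",
    archive_existing_versions=True,  # archive whatever was previously in Production
)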
You may also support blue-green deployments using Docker or Kubernetes.
Conclusion
This setup gives you a scalable and reproducible ML deployment pipeline with clear separation of concerns:
- MLflow handles tracking and registry
- MinIO manages artifact storage
- FastAPI exposes inference endpoints
- Docker Compose glues it all together
GitHub Repo
All code and instructions: github.com/liviaerxin/mlops-fastapi-mlflow-minio