
Building an End-to-End ML Deployment Pipeline with MLflow, FastAPI, and Docker

3 min read

Deploying machine learning models is more than just training: it's about tracking, versioning, serving, and monitoring. In this post, I'll walk you through how I built a production-ready ML pipeline using:

  • MLflow for experiment tracking and model registry
  • FastAPI for serving models via REST API
  • MinIO for artifact storage (S3-compatible)
  • Docker Compose for orchestration

👉 Full source code:
🔗 github.com/liviaerxin/mlops-fastapi-mlflow-minio


⸻

Project Overview

This project provides:

  • A structured pipeline to log, register, and serve ML models
  • Docker-based setup with MLflow, FastAPI, and MinIO
  • Simple training and inference workflows

📁 Project structure:

.
β”œβ”€β”€ docker-compose.yml
β”œβ”€β”€ inference-server/
β”œβ”€β”€ mlflow-local-train-remote-register/
β”œβ”€β”€ mlflow-server/
β”œβ”€β”€ train.py
└── README.md

πŸ—‚οΈ For full instructions, check the README

Key Components

MLflow

MLflow is a platform to manage the ML lifecycle, including experimentation, reproducibility, and deployment.

  • Experiment tracking (mlflow.log_param, mlflow.log_metric)
  • Model registration and versioning
  • Stage promotion (Staging, Production)
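
For illustration, a minimal tracking-and-promotion sketch might look like the following (not the repo's exact code; the experiment name, metric values, model name, and version number are placeholders):

import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://127.0.0.1:5001")  # host port used in this setup
mlflow.set_experiment("demo-experiment")          # placeholder name

with mlflow.start_run():
    mlflow.log_param("lr", 0.01)         # hyperparameters
    mlflow.log_metric("accuracy", 0.93)  # evaluation results

# Promote a registered version through stages (version 1 is a placeholder).
client = MlflowClient()
client.transition_model_version_stage(name="my-model", version=1, stage="Production")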

NOTE: In this setup, the MLflow server runs remotely inside a Docker container.

MinIO

An S3-compatible artifact store used by MLflow to save models. For production, you might migrate to AWS S3 or GCS.

NOTE: In this setup, MinIO runs remotely inside a Docker container.
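
For reference, the client-side settings that point MLflow's S3 client at MinIO look roughly like this (the endpoint matches this setup's exposed port; the credential values are placeholders that must match your docker-compose configuration):

import os

# Endpoint and credentials picked up by MLflow's underlying boto3 client.
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://127.0.0.1:9000"
os.environ["AWS_ACCESS_KEY_ID"] = "minio-access-key"      # placeholder
os.environ["AWS_SECRET_ACCESS_KEY"] = "minio-secret-key"  # placeholder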

FastAPI

A simple REST API that serves ML models loaded from the MLflow registry.

The server loads the model from the remote registry via:

import os
import mlflow

mlflow.set_tracking_uri(os.environ.get("MLFLOW_SERVER", "http://127.0.0.1:5000"))
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{MODEL_STAGE}")

NOTE: In this setup, the FastAPI server runs remotely inside a Docker container.
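
Putting it together, a minimal serving app might look like this sketch (not the repo's exact code; the endpoint path, payload shape, and default model name are assumptions):

import os

import mlflow
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI

MODEL_NAME = os.environ.get("MODEL_NAME", "my-model")      # placeholder default
MODEL_STAGE = os.environ.get("MODEL_STAGE", "Production")

mlflow.set_tracking_uri(os.environ.get("MLFLOW_SERVER", "http://127.0.0.1:5000"))
model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/{MODEL_STAGE}")

app = FastAPI()

@app.post("/predict")
def predict(records: list[dict]):
    # pyfunc models accept a pandas DataFrame; the columns depend on your model.
    df = pd.DataFrame(records)
    return {"predictions": model.predict(df).tolist()}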

Training and Registering a Model

NOTE: The mlflow-local-train-remote-register workflow runs locally, outside the Docker container environment.

Run the training script:

python train.py
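
The actual training code lives in the repo; purely as an illustration, a minimal PyTorch training loop could look like this (the model, data, and hyperparameters are all placeholders):

import torch
import torch.nn as nn

# Placeholder model and synthetic data, standing in for the repo's train.py.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

X, y = torch.randn(64, 4), torch.randn(64, 1)
for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()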

Then log and register it to a remote MLflow server:

python register-remote.py

Key code snippet:

import os
import mlflow.pytorch

os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://127.0.0.1:9000"  # MinIO exposed on host port 9000
mlflow.set_tracking_uri("http://127.0.0.1:5001")  # MLflow server exposed on host port 5001
model_info = mlflow.pytorch.log_model(model, artifact_path="model", registered_model_name=MODEL_NAME)
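
Because registered_model_name is set, log_model both uploads the model artifact to MinIO and creates a new version in the MLflow registry, which the inference server can then load by its registry URI (models:/<name>/<stage>).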

Bonus: Updating Models and Rolling Deployment

You can update models by:

  • Logging a new version to MLflow
  • Promoting it to Production stage
  • Using FastAPI logic to reload the latest version without downtime (see the sketch at the end of this section)

You may also support blue-green deployments using Docker or Kubernetes.
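
One way to implement the zero-downtime reload is to load the new version first, then swap it in atomically. The sketch below is an illustration under assumptions (the /reload route, model name, and locking scheme are not from the repo):

import threading

import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI

MODEL_URI = "models:/my-model/Production"  # placeholder name

app = FastAPI()
_lock = threading.Lock()
_model = mlflow.pyfunc.load_model(MODEL_URI)

@app.post("/reload")
def reload_model():
    global _model
    # Load the new Production version before swapping, so in-flight
    # requests keep using the old model until the swap completes.
    new_model = mlflow.pyfunc.load_model(MODEL_URI)
    with _lock:
        _model = new_model
    return {"status": "reloaded"}

@app.post("/predict")
def predict(records: list[dict]):
    with _lock:
        m = _model  # grab a consistent reference to the current model
    return {"predictions": m.predict(pd.DataFrame(records)).tolist()}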

Conclusion

This setup gives you a scalable and reproducible ML deployment pipeline with clear separation of concerns:

  • MLflow handles tracking and registry
  • MinIO manages artifact storage
  • FastAPI exposes inference endpoints
  • Docker Compose glues it all together

⸻

🔗 GitHub Repo

All code and instructions: 🔗 github.com/liviaerxin/mlops-fastapi-mlflow-minio