
Components

mikewoodward94 edited this page Aug 29, 2024 · 2 revisions

This MLOps repo has two main components: mlflow_server, which sets up the MLOps environment, and mlops, which is used to build the csc-mlops Python package available on PyPI.

mlflow_server

The MLflow server is made up of four networked containers, each serving a specific MLOps purpose:

  • MLflow: The MLflow container hosts the MLflow server instance, which tracks and logs the MLOps events sent to it, such as experiment runs.
  • postgres: A database server container that is visible only to the MLflow container. MLflow entities such as metrics, parameters, and configuration options are logged to this database.
  • MINIO: The MINIO container hosts the MINIO server, a self-hosted S3 storage location. MLflow artefacts such as models and images are stored here.
  • NGINX: The NGINX container acts as a reverse proxy to control network traffic.
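
As a rough illustration of how a client might point at these containers, the sketch below collects the environment variables that MLflow and its S3 client read (MLFLOW_TRACKING_URI, MLFLOW_S3_ENDPOINT_URL, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY). The helper names, addresses, ports, and credentials are placeholders assumed for illustration, not values taken from this repo; in practice they depend on how the NGINX proxy and the MINIO container are configured. The postgres container does not appear because only the MLflow container talks to it.

```python
import os
from typing import Dict


def mlflow_client_env(
    tracking_uri: str = "http://localhost:5000",  # assumed MLflow address (placeholder)
    s3_endpoint: str = "http://localhost:9000",   # assumed MINIO address (placeholder)
    access_key: str = "minio-access-key",         # placeholder credential
    secret_key: str = "minio-secret-key",         # placeholder credential
) -> Dict[str, str]:
    """Return the environment variables an MLflow client reads.

    MLFLOW_TRACKING_URI tells the client where the tracking server is;
    the remaining variables let it upload artefacts to the S3-compatible store.
    """
    return {
        "MLFLOW_TRACKING_URI": tracking_uri,
        "MLFLOW_S3_ENDPOINT_URL": s3_endpoint,
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
    }


def apply_env(env: Dict[str, str]) -> None:
    """Export the variables into the current process environment."""
    os.environ.update(env)
```

Calling apply_env(mlflow_client_env()) before any MLflow logging call would route metrics and parameters to the tracking server and artefacts to the S3 endpoint.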

mlops

The csc-mlops package is generally used by developers through its CLI tool, which automates common MLOps processes.

This includes:

  • Configuring the project
  • Communicating with MLflow
  • Ensuring project code is committed and current
  • Building the Docker image
  • Configuring the project logger
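
To make the "committed" check concrete, here is a minimal sketch of one way to detect uncommitted changes using git status --porcelain. It is not the csc-mlops implementation; the function names are hypothetical, and the sketch does not cover whether the branch is current with its remote.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints one line per changed file,
    so empty output means there is nothing left to commit."""
    return porcelain_output.strip() == ""


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether the working tree is clean.

    Hypothetical helper: raises CalledProcessError if repo_dir is not a git repo.
    """
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return is_clean(result.stdout)
```

A tool could refuse to start a run when working_tree_is_clean() returns False, so that every logged experiment maps to a known commit.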

The two main components here are Experiment.py and cli.py. Experiment.py wraps mlflow.run(), creating a Docker image in which the training script runs with the settings specified in cli.py.
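
The wrapper idea can be sketched as follows. This is an assumption-laden illustration rather than the actual contents of Experiment.py: build_run_kwargs and run_project are hypothetical names, and only the uri, entry_point, parameters, and experiment_name arguments of mlflow.run() are used. MLflow builds a Docker image for the run when the project's MLproject file declares a docker_env section.

```python
from typing import Any, Dict, Optional


def build_run_kwargs(
    project_uri: str,
    experiment_name: str,
    parameters: Optional[Dict[str, Any]] = None,
    entry_point: str = "main",
) -> Dict[str, Any]:
    """Collect CLI-style settings into keyword arguments for mlflow.run()."""
    return {
        "uri": project_uri,                     # local path or git URL of the project
        "entry_point": entry_point,             # entry point named in the MLproject file
        "parameters": dict(parameters or {}),   # passed through to the training script
        "experiment_name": experiment_name,     # experiment under which the run is logged
    }


def run_project(**settings: Any):
    """Hypothetical wrapper: forward the settings to mlflow.run()."""
    import mlflow  # imported lazily so the helper above works without MLflow installed

    return mlflow.run(**build_run_kwargs(**settings))
```

For example, run_project(project_uri=".", experiment_name="demo", parameters={"epochs": 5}) would launch the project in the current directory, assuming an MLproject file is present there.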
