Open source machine learning local development environment

sebastianvermaas/mlstack

Machine Learning Stack

License: MIT

Data scientists are usually well versed in the languages and packages used for statistical analysis and modeling. They are less often equipped to implement those models properly in production pipelines.

MLStack provides a toolkit for data scientists to develop production-level modules in their local development environment.

Design




MLStack provides two toolkits with shared dependencies:

  • Conda environment - An Anaconda environment with common ML Python libraries
  • Kubernetes cluster - A Kubernetes cluster with common ML components

Getting Started

Prerequisites

MLStack assumes that you have Docker (19.03), Kubernetes (1.16), and Conda (4.7) installed. Installation instructions are not given, as different operating systems and environments require specific configuration.
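
Before installing, it can help to confirm that the assumed tools are on your PATH. The script below is a minimal sketch and not part of MLStack: `check_tool` is a hypothetical helper, and using `kubectl` as the way to reach Kubernetes is an assumption. It only reports what is installed and prints each tool's own version output; it does not enforce the versions listed above.

```shell
#!/bin/sh
# Sketch only: report whether each assumed tool is on PATH and print its version.
# check_tool is a hypothetical helper, not an MLStack command.

check_tool() {
    # $1: executable name; remaining arguments: the command that prints its version
    name="$1"
    shift
    if command -v "$name" >/dev/null 2>&1; then
        echo "found:   $name"
        # Print the tool's version; ignore a failing version command
        "$@" 2>/dev/null || true
        return 0
    fi
    echo "missing: $name"
    return 1
}

missing=0
check_tool docker  docker --version         || missing=$((missing + 1))  # README lists 19.03
check_tool kubectl kubectl version --client || missing=$((missing + 1))  # assumption: kubectl is the Kubernetes client in use (README lists 1.16)
check_tool conda   conda --version          || missing=$((missing + 1))  # README lists 4.7

echo "tools missing: $missing"
```

Compare the printed versions against the ones listed above by eye; differences in minor versions may or may not matter for your setup.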

Install

Install MLStack with the commands below. Setup takes some time because Docker images are pulled and/or built, so grab a cup of ☕ and relax (or read the logs, or both).

# Clone the repository
git clone https://github.com/sebastianvermaas/mlstack.git
cd mlstack

# Create your mlstack conda environment
conda env create -f conda.yml
conda activate mlstack

# Install the Python library and CLI
pip install -e .

# Setup command for building Docker images
mlstack setup

Usage

Build

The mlstack build command builds the Docker images in the build directory. Images that require additional Python requirements can be built with the --requirements flag. For example:

mlstack build --image airflow --requirements requirements.txt
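
As a sketch of the full round trip, the script below writes a small requirements file and then passes it to the build command shown above. The package names are placeholders chosen for illustration, not packages MLStack requires, and the PATH check is an added guard so the script exits cleanly when the CLI is not installed.

```shell
#!/bin/sh
# Sketch only: create a requirements file, then hand it to `mlstack build`.
# The package names are illustrative placeholders.
cat > requirements.txt <<'EOF'
pandas
scikit-learn
EOF

# Run the build only if the mlstack CLI is available on PATH
if command -v mlstack >/dev/null 2>&1; then
    mlstack build --image airflow --requirements requirements.txt
else
    echo "mlstack not found on PATH; activate the mlstack conda environment first"
fi
```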

Create

The mlstack create command creates a Kubernetes cluster specified in the manifests.

mlstack create
mlstack create --manifest spark --volume-mount mymount --host-path path/to/my/host

Close

The mlstack close command deletes the resources created from a Kubernetes manifest.

mlstack close
mlstack close --manifest spark

TODO

mlstack create bucket mybucket
mlstack upload data --bucket mybucket