VMware Machine Learning Platform (vMLP)

Our goal is to provide an end-to-end ML platform that helps Data Scientists work more effectively by running ML workloads on top of VMware infrastructure.

Using vMLP allows you to:

  • Save costs by enabling efficient use of shared GPUs for ML workloads
  • Reduce the risk of broken Data Science workflows by leveraging well-tested, ready-to-use demos and project templates
  • Bring ML models to market faster with end-to-end tooling, including fast and easy model deployment and serving via a standardized REST API

Quickly set up a virtualized cloud infrastructure to conduct Data Science experiments:

  • Virtualized environment based on VMware and Kubernetes
  • Familiar Jupyter Notebooks and distributed model training based on Open Source Kubeflow 1.0 GA and Horovod
  • GPU support available on Tanzu Kubernetes GRID using vGPU and the NVIDIA Kubernetes Device Plugin
  • Quick turn-around on model testing based on the built-in deployment framework
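
As an illustration of the GPU support listed above: with the NVIDIA Kubernetes Device Plugin installed, a workload asks for a GPU through the `nvidia.com/gpu` resource limit. The sketch below builds such a Pod manifest as a plain Python dictionary. The pod name, container name, and image are hypothetical placeholders, and this is not necessarily how vMLP itself schedules GPU notebooks or training jobs.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Kubernetes Pod manifest that requests NVIDIA GPUs.

    The NVIDIA device plugin exposes GPUs as the extended resource
    "nvidia.com/gpu"; everything else here is an illustrative assumption.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",  # hypothetical container name
                    "image": image,     # hypothetical image reference
                    # GPUs are requested under "limits".
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Hypothetical names, for illustration only.
manifest = gpu_pod_manifest("gpu-training-job", "example.registry.local/ml/trainer:latest")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The JSON output can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.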

Utilize a set of example Notebooks and libraries for common Data Science tasks, including:

  • Data collection (extract data from various sources, and describe the data semantics using metadata)
  • Data cleansing and transformation (clean up collected data and transform it from its raw form into a structured form more suitable for analytic processing)
  • Model training (develop predictive and optimization machine learning models)
  • Model serving (deploy a model into a runtime environment that serves online REST API requests)
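
To make the model serving step more concrete, here is a minimal sketch of how a client might prepare an online prediction request for a deployed model. The host name, the `/models/<name>/predict` path, and the `instances` payload key are assumptions made for illustration only. The actual endpoint layout and request schema are described in the README.md shipped with the download.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, model_name: str, rows: list) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request for a served model.

    The URL layout and the "instances" key are hypothetical; adapt them
    to the REST API exposed by your deployment.
    """
    payload = json.dumps({"instances": rows}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/models/{model_name}/predict"
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and model name, for illustration only.
request = build_prediction_request(
    "http://vmlp.example.local", "iris-classifier", [[5.1, 3.5, 1.4, 0.2]]
)
print(request.get_method(), request.full_url)

# Sending the request requires a reachable deployment, for example:
#   with urllib.request.urlopen(request, timeout=10) as response:
#       prediction = json.load(response)
```

Building the request separately from sending it keeps the sketch runnable without a live cluster and makes the payload easy to inspect.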

Share metadata about your Data Sources to enable team collaboration using our Data Manager component.

Store and reliably track your experiments for reproducibility using the MLflow model repository.

Leverage JupyterLab extensions for a smooth model training and deployment experience.

After unzipping the download, please consult the README.md inside the archive, in particular its Deployment Guide section.

Version 0.3.0

  • Federated ML based on FATE
  • Istio 1.4.9
  • Horovod 0.19.2
  • Upgraded major components (MLflow 1.10.0, Pandas 1.0.3 and others)
  • Important stability bug fixes
  • Added documentation

Includes contributions from: Jiahao "Luke" Chen (bug fixes and Federated ML/FATE integration),
Shan Lahiri (Getting Started Guide), Jason Hutson (relentlessly debugging Kubernetes on VMware
infra), Nick Ford (sorting out VMware NSX Advanced Load Balancer/AVI Networks configuration and issues)

Version 0.2.0

  • Added support for vSphere with Kubernetes and Tanzu Kubernetes GRID
  • Upgraded to Kubeflow 1.0 GA
  • Added Istio-based authentication
  • Added support for Jupyter Notebooks and model training on GPUs
  • Jupyter images feature Horovod support for local CPU/GPU model development
  • Upgraded major ML libraries (CUDA 10.1, TF 2.1.0, PyTorch 1.4.0, Horovod 0.19.1 and others)
  • Upgraded major components (MLflow 1.6.0 and others)
  • Updated demos and project templates, added examples of image classification, NLP, model explanation
  • Added vMLP (Data Manager, model deployments) UI
  • Introduced ml-manager Python API
  • Added an easy-to-use installer
  • Lots of bug fixes

Includes contributions from: Dave Ho (demo and template updates, model explanation ml-manager module)

Version 0.1.0

  • Initial release
  • Supports PKS/Harbor on VMware Cloud Foundation
  • Kubeflow 0.5.1
  • CPU-based model training
  • TF 1.14.0, PyTorch 1.1.0, Horovod 0.16.4
  • Introduced a JupyterLab plugin with demos/templates tiles and train/deploy dialog support
  • Introduced a new data registry component called Data Manager
  • Introduced a model repository based on MLflow

Includes contributions from: Andrii Myrgorod (PKS workload domain testbed setup)