Supernova - Accelerating Machine Learning Inference

As machine learning sees wide enterprise adoption, models are trained on big data and inference services go to production either in the cloud or on the edge.

On the edge

  • Edge devices have limited resources, space, and power supply
  • Edge servers cost much more than edge devices
  • Hardware accelerators on the edge are heterogeneous in architecture and vary in interfaces and performance

In the cloud

  • The accelerator market is dominated by Nvidia GPUs
  • Other options include AMD GPUs, Intel Habana Goya, Intel/Altera FPGAs, AWS Inferentia, Xilinx FPGAs, etc.
  • No common inference interface spanning cloud and edge has emerged
  • Tying a deployment to specific hardware accelerators or a specific cloud creates new vendor lock-in

Project Supernova aims to build a common machine learning inference service framework by enabling machine learning inference accelerators across edge endpoint devices, edge systems, and the cloud, with or without hardware accelerators.

  • Microservice-based architecture with a RESTful API (see the client sketch after this list)
  • Supports heterogeneous system architectures from leading vendors
  • Supports accelerator compilers that compile models to native code
  • Neutral to ML training framework file formats
  • Works on both edge devices and clouds
  • Hardware CPU support:
    • x86-64, ARM64
  • Hardware accelerator support:
    • Intel VPU, Google Edge TPU, Nvidia GPU, AMD GPU
  • Software
    • Inference toolkit support: OpenVINO, TensorRT & TensorFlow Lite
    • Training framework data formats: TensorFlow, Caffe, ONNX, MXNet
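
As a concrete illustration of the microservice model, the following Python sketch posts an image to a running Supernova inference endpoint over the RESTful API. The host, port, endpoint path, and JSON field names are assumptions for illustration only; the actual API is documented in the Supernova Quickstart Guide.

    # Minimal client sketch for a REST-based inference service.
    # The URL and JSON field names below are assumptions; consult
    # "Supernova Quickstart Guide.docx" for the real API contract.
    import base64
    import requests

    SERVICE_URL = "http://localhost:8080/v1/inference"  # hypothetical endpoint

    def classify(image_path: str) -> dict:
        """Send an image to the inference service and return its JSON reply."""
        with open(image_path, "rb") as f:
            payload = {"image": base64.b64encode(f.read()).decode("ascii")}
        resp = requests.post(SERVICE_URL, json=payload, timeout=30)
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        # e.g. {"label": "cat", "confidence": 0.97}, depending on the model
        print(classify("cat.jpg"))

Because the service boundary is plain HTTP, the same client works whether the backend runs on a CPU-only edge device or on a GPU-equipped cloud server.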

Supernova runs on common computing platforms, from resource-constrained edge systems to PCs and servers, anywhere Linux/Docker can be deployed.
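
Since the service is packaged for Linux/Docker, bringing it up is a one-container operation. Below is a minimal sketch using the Docker SDK for Python; the image name and port mapping are hypothetical placeholders, and the actual image and run options are described in the Quickstart Guide.

    # Sketch: launch the inference service container via the Docker SDK
    # for Python (pip install docker). The image name and port below are
    # hypothetical placeholders, not the official Supernova artifacts.
    import docker

    client = docker.from_env()
    container = client.containers.run(
        "supernova/inference-service:latest",  # hypothetical image name
        detach=True,
        ports={"8080/tcp": 8080},  # expose the REST API on the host
    )
    print(f"service running in container {container.short_id}")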

Instructions are included in the download directory as "Supernova Quickstart Guide.docx".

Version 1.0 Update 

Compared to the previous Fling release, version 0.0.1 (https://flings.vmware.com/supernova-accelerating-machine-learning-inference), this release supports:

  1. New HW accelerators: AMD GPU and Xilinx FPGA
  2. CPU acceleration with OpenVINO, based on AVX/SSE
  3. Basic K8S deployment
  4. A more versatile API set
  5. vSphere Bitfusion support
  6. More use cases, such as facial mask detection
 