Sample Data Platform on VMware Cloud Foundation with VMware Tanzu for Kubernetes Provisioning

version 1.1 — November 11, 2020

Summary

Data is king and your users need a modern data platform quickly.

With this Fling, you will leverage your VMware Cloud Foundation 4.0 deployment and stand up a sample data platform, comprising Kafka, Spark, Solr, and ELK, on a Tanzu Kubernetes Grid guest cluster in less than 20 minutes.

Additionally, this Fling comes with a market data sample application (using real market data from dxFeed) that shows how all these data platform components work together.

The diagram below depicts the final outcome of the Data Platform on VCF 4.0:

The sample market data app:

The diagram below depicts the final outcome of the Market Data Sample App:

  1. Data Platform Provisioning: Automation instantiates Kafka, Spark, Solr, and ELK in under 20 minutes.
  2. Kafka Publisher: Retrieves delayed market data from the publicly available Devexperts dxFeed and publishes it to multiple Kafka topics.
  3. Spark Subscriber: Subscribes to the market data on the Kafka topics, transforms it, and writes the result to Solr.
  4. Logstash Subscriber: Subscribes to the market data on the Kafka topics, writes it to Elasticsearch through Logstash, and creates a Kibana dashboard.

About dxFeed (https://www.dxfeed.com): dxFeed is a subsidiary of Devexperts that focuses on delivering financial market information and services to buy-side and sell-side institutions in the global financial industry, specifically traders, data analysts, quants, and portfolio managers.

Sample Trades data from dxFeed:

  {"Trade" : {
    "MSFT" : {
      "eventSymbol" : "MSFT",
      "eventTime" : 0,
      "time" : 1598385600578,
      "timeNanoPart" : 0,
      "sequence" : 200,
      "exchangeCode" : "Q",
      "price" : 216.47,
      "change" : 0.0,
      "size" : 2197641.0,
      "dayVolume" : 8833.0,
      "dayTurnover" : 1917851.3,
      "tickDirection" : "ZERO_UP",
      "extendedTradingHours" : false
    }}} 

Sample Quotes data from dxFeed:

 {"Quote" : {
    "MSFT" : {
      "eventSymbol" : "MSFT",
      "eventTime" : 0,
      "sequence" : 0,
      "timeNanoPart" : 0,
      "bidTime" : 1598433071000,
      "bidExchangeCode" : "P",
      "bidPrice" : 217.17,
      "bidSize" : 3.0,
      "askTime" : 1598433300000,
      "askExchangeCode" : "P",
      "askPrice" : 217.39,
      "askSize" : 1.0
    }}} 

Final Output: Kibana Dashboard

 

Stay tuned for future VMware Tanzu data platform initiatives.

Requirements

  • Kubernetes cluster with 1 control plane and 8 worker nodes.
  • Control Plane: 1 Small VM (2 CPUs, 2GB RAM)
  • Worker Nodes: 8 Large VMs (4 CPUs, 16GB RAM)
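
As a quick sizing check, the totals implied by the list above can be worked out with shell arithmetic. This is only a sketch: the variable names are illustrative, and the figures cover the guest cluster VMs alone, not any overhead from the underlying platform.

```shell
#!/bin/sh
# Node counts and sizes copied from the requirements list above.
control_plane_nodes=1; control_plane_cpus=2; control_plane_ram_gb=2
worker_nodes=8;        worker_cpus=4;        worker_ram_gb=16

# Sum CPUs and RAM across the control plane and all worker nodes.
total_cpus=$(( control_plane_nodes * control_plane_cpus + worker_nodes * worker_cpus ))
total_ram_gb=$(( control_plane_nodes * control_plane_ram_gb + worker_nodes * worker_ram_gb ))

echo "Total CPUs: ${total_cpus}"       # 2 + 32 = 34
echo "Total RAM:  ${total_ram_gb} GB"  # 2 + 128 = 130
```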

Instructions

Sample Data Platform Fling

Setup Instructions

Run the following commands to extract dataplatforms-fling-package-kubernetes.tar.gz and set BASE_DIR:

tar -xvf dataplatforms-fling-package-kubernetes.tar.gz

cd kubernetes

BASE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null 2>&1 && cd .. && pwd)"

We have divided the Data Platform automation into three parts:

  • Infrastructure: Setup of the Tanzu Kubernetes Grid guest cluster (with VMware Cloud Foundation configured as per the VMware Validated Design (VVD)). The final outcome is a Kubernetes cluster with 8 workers, as defined in the YAML file.
  • Data Platform Components: Installation of Kafka, Spark, Solr, and Elasticsearch.
  • Market Data Sample App: Installation of the Market Data Sample App, which reads from an external source (market data from dxFeed) and writes the data to the data platform.

Infrastructure: Tanzu Kubernetes Grid (TKG) Guest Cluster on VMware Cloud Foundation

You will need to have VMware Cloud Foundation configured according to the practices described in the VMware Validated Design (VVD).

We added some pre-flight checks to run on your VMware Cloud Foundation cluster before installing the Data Platform. See the README in $BASE_DIR/kubernetes/infrastructure/vvd.

At a very high level, you will need to apply a TanzuKubernetesCluster manifest file and generate a kubeconfig.

cd $BASE_DIR/kubernetes/infrastructure/vvd

# Log in to the Supervisor Cluster before running the following command
kubectl apply -f dp-k8s-cluster-tkc.yaml

Sample Data Platform Components

Once the Tanzu Kubernetes Cluster is running with at least 8 workers in Ready status, you can install the data platform components with the install script, datacomponents/setup-dp-components.sh. See the README in $BASE_DIR/kubernetes/datacomponents for more details on validating the data platform setup.
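
One way to confirm the worker count before installing is to count Ready workers in the node listing. The sketch below is not part of the Fling package: count_ready_workers is a hypothetical helper, and it assumes that control plane nodes report a "master" or "control-plane" role while workers do not. The node names and versions in the canned input are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints how many non-control-plane nodes have the exact status "Ready".
# Assumption: control plane nodes carry a "master" or "control-plane" role.
count_ready_workers() {
  awk '$2 == "Ready" && $3 !~ /master|control-plane/ { n++ } END { print n + 0 }'
}

# Canned input standing in for real kubectl output (names are illustrative).
sample_nodes='dp-cp-0   Ready      master   10m   v1.17.8
dp-wk-0   Ready      <none>   8m    v1.17.8
dp-wk-1   NotReady   <none>   8m    v1.17.8'

printf '%s\n' "$sample_nodes" | count_ready_workers   # prints 1
```

Against a live guest cluster, the same function could be fed with `kubectl get nodes --no-headers | count_ready_workers`, and the install started once it reports 8 or more.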

Dependencies: Ensure the following command-line utilities are present on your system before you proceed with the installation:

  1. kubectl
  2. git

cd $BASE_DIR/kubernetes
cp $BASE_DIR/kubernetes/datacomponents/setup-dp-components.sh .
sh setup-dp-components.sh -c all

By the end of this execution, you will have Bitnami Kafka (default), Spark, Solr, and Elasticsearch installed.
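
The dependency check for kubectl and git can be automated before the install script is run. The following is a minimal sketch, assuming a POSIX shell; missing_tools is a hypothetical helper and is not shipped with the Fling package.

```shell
#!/bin/sh
# Hypothetical helper: prints each named tool that cannot be found on PATH,
# one per line; prints nothing when every tool is present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two dependencies named above and report the result.
missing=$(missing_tools kubectl git)
if [ -n "$missing" ]; then
  # $missing is left unquoted so the names are joined onto one line.
  echo "Missing dependencies:" $missing
else
  echo "All dependencies found."
fi
```

A wrapper script would probably stop at this point when anything is missing, rather than continue to setup-dp-components.sh.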

Market Data Sample App

Once the sample data platform components are up, you can install the market data sample app, which consumes market data from dxfeed.com, publishes it to Kafka topics, and writes it to Solr. Use the install script, $BASE_DIR/kubernetes/marketdata-sampleapp/setup-marketdata-sampleapp.sh. See the README in $BASE_DIR/kubernetes/marketdata-sampleapp for more details on validating the market data sample app.

cd $BASE_DIR/kubernetes
cp $BASE_DIR/kubernetes/marketdata-sampleapp/setup-marketdata-sampleapp.sh .
sh setup-marketdata-sampleapp.sh

Refer to the READMEs in the datacomponents and marketdata-sampleapp directories to validate the final outcome.

 

Changelog

Version Update 1.1

  • Bug fix for the storage class used by Bitnami Kafka
