
Mjolnir : Automation Library for VMware Mangle

version 1.0 — April 27, 2022

Summary

Mjolnir is a Python utility package for performing fault injections on remote hosts. It is a lightweight Python wrapper that calls the VMware Mangle REST APIs in the backend to inject or remediate a fault on a host machine.


Mangle is an open-source product developed by VMware that enables end users to run chaos engineering experiments. It can be used through either its UI or its API (learn more in the Mangle Documentation). However, integrating Mangle into an automation framework or development pipeline can be tedious, because end users have to write integration code specific to their framework. With Mjolnir, end users can perform any fault injection from within their framework in just 3 steps.


This approach offers the following benefits:
  1. We can thoroughly test our applications and systems in our automation.
  2. We can better identify the nature and cause of production failures.
  3. We can prepare for the unexpected.

This utility is platform independent (tested on Linux and Windows) and is distributed as a Python wheel package, which can be installed with Python's native pip3 command.

Requirements

  • Python 3.6 or above
  • The paramiko, requests and urllib3 Python packages must be installed
  • Deploy and install Mangle server (Deployment Guide)
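
The first two requirements can be checked from Python before installing the wheel. The following is a minimal sketch, not part of Mjolnir: the helper name `missing_requirements` is hypothetical, and it only tests that the interpreter version and the three listed packages are present, not that their versions are compatible.

```python
import importlib.util
import sys

# Minimum interpreter version and third-party packages named in the requirements list.
MIN_PYTHON = (3, 6)
REQUIRED_PACKAGES = ("paramiko", "requests", "urllib3")


def missing_requirements(version_info=None, packages=REQUIRED_PACKAGES):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    if version_info is None:
        version_info = sys.version_info
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d or above is required" % MIN_PYTHON)
    for name in packages:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("missing package: " + name)
    return problems
```

Calling `missing_requirements()` with no arguments checks the running interpreter; any entries in the returned list can be fixed with pip before continuing.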

Instructions

Installation instructions:

  • Make sure the requirements are met.
  • Download the Mjolnir Python package (.whl) from flings.vmware.com.
  • It's recommended to use a Python virtual environment for the wheel package installation (learn more about Python venv at https://docs.python.org/3.7/library/venv.html?highlight=pyvenv):
    1. Create a Python virtual environment using the ‘venv’ module (use Python 3.6 or above)
    2. Activate the Python virtual environment

      # Create virtual environment
      $ python3.6 -m venv mjolnir-env
      $ ls mjolnir-env/
      bin  include  lib  lib64  pyvenv.cfg  share
      
      # Activate mjolnir virtual environment
      $ source mjolnir-env/bin/activate
      
      # Check pip version in virtual environment
      (mjolnir-env) $ pip --version
      pip 9.0.1 from /root/mjolnir-env/lib/python3.6/site-packages (python 3.6)
      (mjolnir-env) $
                  
                


  • Install the wheel package using Python's native pip (make sure pip points to Python 3.6 or above)
        
        # Install Mjolnir wheel package
        (mjolnir-env) $ pip3 install vmware_mjolnir-1.0.19649893-py2.py3-none-any.whl
        
    



Mjolnir Usage:

  • Once the package is installed, follow the steps below to inject a fault:
    1. Import the mjolnir package
    2. Configure the Mangle server
    3. Inject faults on the desired machines
    4. Remediate the injected faults

  • Sample snippet:
        
        import logging
        import sys
        import time
        # import mjolnir package
        import vmware.mjolnir as mjolnir

        logging.root.setLevel(logging.INFO)
        logger = logging.getLogger()
        logger.addHandler(logging.StreamHandler(sys.stdout))
        logger.info("configuring mangle server")
        mjolnir.configure_mangle_server(MANGLE_IP, MANGLE_USERNAME, PASSWORD, PORT)
        # Specify host machines on which faults are to be injected
        machines = [{'ip': HOST_IP, 'username': USERNAME, 'password': PASSWORD, 'ssh_port': PORT}]
        logger.info("injecting faults in the above machines")
        task_id_lst = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="CPU", cpuload=60, timeout=120)
        # Placeholder for your code (whatever you want to run while the machine has the fault)
        time.sleep(50)
        logger.info("clearing faults based on the task ids returned")
        mjolnir.clear_faults(task_id_lst)
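
In the snippet above, an exception raised while the fault is active would skip the remediation call. One way to guard against that is a small context manager, sketched below. This is not part of Mjolnir: the name `injected_fault` is hypothetical, and the inject and clear functions are passed in as arguments so the helper assumes only what the sample shows (the inject call returns task ids, and the clear call accepts them).

```python
from contextlib import contextmanager


@contextmanager
def injected_fault(inject, clear, machines, **fault_kwargs):
    """Inject a fault on entry and clear it on exit, even if the body raises.

    `inject` and `clear` are supplied by the caller, for example
    mjolnir.inject_generic_fault and mjolnir.clear_faults.
    """
    task_ids = inject(machines, **fault_kwargs)
    try:
        # Hand the task ids to the caller in case they are needed for logging.
        yield task_ids
    finally:
        # Runs on normal exit and on exceptions alike.
        clear(task_ids)


# Intended usage (requires a configured Mangle server, so it is left commented out):
# with injected_fault(mjolnir.inject_generic_fault, mjolnir.clear_faults, machines,
#                     fault_type="INFRA", fault_sub_type="CPU", cpuload=60, timeout=120):
#     run_my_checks()
```

This pattern only suits faults that support remediation; for faults documented as not supporting Clear Fault, the plain inject call is the better fit.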

  • INFRA Fault types:

    1. MEMORY Fault
                      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="MEMORY", memoryload=60, timeout=120)
                      
                  
    2. DISKIO Fault
                      
      # iosize: (bytes) To write in blocks of 5 KB to the disk of the specified VSMS, specify the IO size as 5120 (5 KB = 5120 bytes).
      #         Max supported value is 5 MB (5 MB = 5120 * 1024 bytes)
      # target_dir: specific directory location or partition to write to for simulating the DISK IO
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="DISKIO", iosize=5120, target_dir="/config", timeout=120)
                      
                  
    3. KILL PROCESS Fault
                      
      # pid
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="PROCESSKILL", process_id=5161)
      
      # process descriptor
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="PROCESSKILL", process_descriptor="com.vmware.nsx.cbm.Main")
                      
                  
    4. STOP SERVICE Fault
                      
      # INFO:: enables graceful shutdown of any process that is running on the specified VSMS using the appropriate stop commands
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="STOPSVC", svc_name="corfu-server", timeout=120)
                      
                  
    5. FILE HANDLER LEAK Fault
                      
      # INFO:: enables you to simulate conditions where a program requests a handle to a resource but does not release it when the resource is no longer in use.
      #        If left in place for an extended period, this condition leads to "Too many open file handles" errors and causes performance degradation or crashes
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="FILEHANDLERLEAK", timeout=120)
      
      # NOTE:
      # a) Clear Fault(Remediation) not supported for this fault.
                      
                  
    6. DISK SPACE Fault
                      
      # diskload:   80 to simulate a Disk usage of 80% of the total disk size or space allocated for a partition
      # target_dir: specific directory location or partition to write to for simulating the DISK FAULT
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="DISKSPACE", diskload=80, target_dir="/config", timeout=120)
                      
                  
    7. KERNEL PANIC Fault
                      
      # INFO:: simulates conditions where the operating system abruptly stops to prevent further damages, security breaches or data corruption
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="KERNELPANIC", timeout=120)
      
      # NOTE:
      # a) Clear Fault(Remediation) not supported for this fault.
      # b) After a kernel panic fault is injected, the end user must power the machine back on manually.
                      
                  
    8. CLOCK SKEW Fault
                      
      # INFO:: simulates conditions where the endpoint time is distorted and doesn't align with the standard NTP time.
      #        The skew can be in 'seconds', 'minutes', 'hours' or 'days' as specified at the time of running the fault
      
      # clock_skew_oper:: PAST
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="CLOCKSKEW", clock_skew_oper="PAST", seconds=120, minutes=0, hours=0, days=0, timeout=120)
      
      # clock_skew_oper:: FUTURE
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="CLOCKSKEW", clock_skew_oper="FUTURE", seconds=60, minutes=0, hours=0, days=0, timeout=120)
                      
                  
    9. NETWORK PARTITION Fault
                      
      # hosts: host IP or a list of host IPs to which the endpoint should lose network connectivity due to network partition
      task_id = mjolnir.inject_fault(topology, [topology.testbed.vsms[0]], fault_type="INFRA", fault_sub_type="NWPARTITION", hosts=[topology.testbed.vsms[1].ip, topology.testbed.vsms[2].ip], timeout=120)
                      
                  
    10. NETWORK: PACKET DELAY Fault
                      
      # latency: (milliseconds) for example, 150 to simulate a packet delay of 150 ms on a particular network interface of VSMS
      # nicname: could be eth0, eth1, br0 etc., depending on which adapter you want to target for the fault
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="NETWORK", nw_fault_type='DELAY', latency=150, nicname='eth0', timeout=120)
                      
                  
    11. NETWORK: PACKET DUPLICATE Fault
                      
      # percentage: (%) value to specify what percentage of the packets should be duplicated
      #             For example, 10 to simulate a packet duplication of 10 percent on a particular network interface of VSM
      # nicname: could be eth0, eth1, br0 etc., depending on which adapter you want to target for the fault
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="NETWORK", nw_fault_type='DUPLICATE', percentage=10, nicname='eth0', timeout=120)
                      
                  
    12. NETWORK: PACKET LOSS Fault
                      
      # percentage: (%) value to specify what percentage of the packets should be dropped
      #             For example, 10 to simulate a packet drop of 10 percent on a particular network interface of VSM
      # nicname: could be eth0, eth1, br0 etc., depending on which adapter you want to target for the fault
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="NETWORK", nw_fault_type='LOSS', percentage=10, nicname='eth0', timeout=120)
                      
                  
    13. NETWORK: PACKET CORRUPT Fault
                      
      # percentage: (%) value to specify what percentage of the packets should be corrupted
      #             For example, 10 to simulate a packet corruption of 10 percent on a particular network interface of VSM
      # nicname: could be eth0, eth1, br0 etc., depending on which adapter you want to target for the fault
      
      task_id = mjolnir.inject_generic_fault(machines, fault_type="INFRA", fault_sub_type="NETWORK", nw_fault_type='CORRUPT', percentage=10, nicname='eth0', timeout=120)
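
Several of the faults above differ only in a few keyword arguments, so it can help to build and sanity-check those arguments before calling the library. The sketch below is an illustration under stated assumptions, not part of Mjolnir: the helper names `diskio_kwargs` and `network_kwargs` are hypothetical, the 5 MB limit comes from the DISKIO comment above, and the 0 to 100 range for `percentage` is an assumption rather than a documented limit.

```python
MAX_IOSIZE_BYTES = 5120 * 1024  # 5 MB, the maximum stated in the DISKIO comment


def diskio_kwargs(iosize, target_dir, timeout=120):
    """Keyword arguments for a DISKIO fault, rejecting block sizes above 5 MB."""
    if not 0 < iosize <= MAX_IOSIZE_BYTES:
        raise ValueError("iosize must be between 1 and %d bytes" % MAX_IOSIZE_BYTES)
    return {"fault_type": "INFRA", "fault_sub_type": "DISKIO",
            "iosize": iosize, "target_dir": target_dir, "timeout": timeout}


def network_kwargs(nw_fault_type, value, nicname="eth0", timeout=120):
    """Keyword arguments for a NETWORK fault.

    DELAY takes a latency in milliseconds; DUPLICATE, LOSS and CORRUPT
    take a percentage (assumed here to lie between 0 and 100).
    """
    kind = nw_fault_type.upper()
    if kind == "DELAY":
        extra = {"latency": value}
    elif kind in ("DUPLICATE", "LOSS", "CORRUPT"):
        if not 0 <= value <= 100:
            raise ValueError("percentage must be between 0 and 100")
        extra = {"percentage": value}
    else:
        raise ValueError("unknown nw_fault_type: " + nw_fault_type)
    kwargs = {"fault_type": "INFRA", "fault_sub_type": "NETWORK",
              "nw_fault_type": kind, "nicname": nicname, "timeout": timeout}
    kwargs.update(extra)
    return kwargs
```

The resulting dictionary would then be unpacked into the call, for example `mjolnir.inject_generic_fault(machines, **network_kwargs("LOSS", 10))`, which matches the keyword arguments used in the PACKET LOSS example.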
                      
                  

Changelog

Changed instruction 2 image
