Aug 11, 2021

Hi Chen/Charles(and the other Devs)!

I tried to test vSAN performance with HCIBench 2.6 and it performed very well.

But my client tried to use fio directly, since he saw that HCIBench calls fio. At first it also ran well, until he set fsync=1 instead of fsync=0, which is the default value.

The sequential-write and random-write rates dropped from about 20K IOPS to about 1,300 IOPS with fsync=1.

Is this normal on vSAN? Is it possible to get some documentation explaining why we should use fsync=0?

Thank you very much.
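For reference, fio's fsync=N option issues an fsync() after every N writes; the default, fsync=0, never issues fsync, so writes can stay queued in flight. Below is a minimal, illustrative job-file sketch (not a tuned profile, and not HCIBench's actual profile) showing where the setting goes:

```ini
; Hypothetical fio job sketch, illustrative values only.
; fsync=1 forces an fsync() after every single write, which
; effectively serializes the I/O queue regardless of iodepth.
[global]
ioengine=libaio
direct=1
bs=4k
size=1g
runtime=60
time_based

[seq-write-fsync]
rw=write
iodepth=8
fsync=1   ; compare against fsync=0 (the default) to isolate the effect
```

Running the same job with fsync=0 and fsync=1 is the cleanest way to confirm whether the flush, rather than some other parameter, accounts for the drop.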

Aug 12, 2021

That would be unexpected. I suspect another parameter is the source of the change.

Since you are running a custom test profile, can you send us the complete file at

Sep 13, 2021

I hit the same issue when I used Vdbench with custom parameters.

At first the rate was about 400,000 IOPS, but when I reused the same config it dropped to about 200,000 IOPS.
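For context, custom Vdbench parameters are normally supplied in a parameter file along these lines (a hypothetical sketch with illustrative device names and values, not the poster's actual config); sharing the real file is what the developers would need to diagnose a result change between runs:

```ini
* Hypothetical Vdbench parameter file, illustrative values only.
* sd = storage definition, wd = workload definition, rd = run definition.
sd=sd1,lun=/dev/sdb,openflags=o_direct
wd=wd1,sd=sd1,xfersize=4k,rdpct=70,seekpct=100
rd=run1,wd=wd1,iorate=max,elapsed=600,interval=1,threads=8
```

Note that rerunning the same file against disks that have since been written to (rather than freshly prepared ones) can itself change results, so the preparation state matters as much as the parameters.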

Jul 22, 2021

Hi Chen/Charles(and the other Devs)!

First of all I want to thank you for this GREAT tool. It helps me accurately test the performance of vSAN building blocks in an automation-friendly way.
This tool really shows the power of what Flings can be. It should be an official, fully supported VMware tool and developed even further!

I am currently doing tests on our new VxRail vSAN building blocks. As I am also going to do vSAN resilience tests (host failures etc.), I have provisioned quite a lot of data for my tests (around 66% of the datastore's raw capacity is filled).
As I'm using Prepare Virtual Disks with the RANDOM option (compression is enabled on the cluster), it took hours to complete this action. After that, the test ran.

But then on consecutive tests, it destroys my VMs (and all the hard work of RANDOM-written data on the datastore :( ) and deploys new ones. I'm unsure why this happens. I have verified that:

- I'm using the exact same HCIBench config as the initial test
- I'm using the exact same Vdbench config as the initial test
- I've tried "Prepare Virtual Disk before Testing" with both NONE and RANDOM for consecutive tests. Which setting is right here if you re-use VMs and already did the RANDOM writes during the initial deployment?

The vm-health-check.log shows:

2021-07-21 23:58:43 +0000: Verifying If Folder Exists...
2021-07-21 23:58:43 +0000: Folder Verified...
2021-07-21 23:58:43 +0000: Moving all vms to the current folder
2021-07-21 23:58:44 +0000: no matches for "temp/*"
2021-07-21 23:58:44 +0000: There are 100 VMs in the Folder, 100 out of 100 will be used
2021-07-21 23:58:45 +0000: no matches for "temp/*"
MoveIntoFolder temp: running
MoveIntoFolder temp: success
2021-07-21 23:58:45 +0000: [ERROR] Not enough proper VMs in vSAN_R1PAFHYP0306
2021-07-21 23:58:45 +0000: [ERROR] Existing VMs not Compatible
2021-07-21 23:58:45 +0000: ABORT: VMs are not existing or Existing VMs are not Compatible

Here is my prevalidation check (anonymized the data):

--- ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 58ms
rtt min/avg/max/mdev = 0.460/0.491/0.544/0.039 ms
2021-07-21 11:16:29 +0000: VC IP and Credential Validated
2021-07-21 11:16:29 +0000: Validating Datacenter XX-VCF-XX-XXX...
2021-07-21 11:16:29 +0000: Datacenter XX-VCF-XX-XXX Validated
2021-07-21 11:16:29 +0000: Validating Cluster XXXXXXXXX...
2021-07-21 11:16:29 +0000: Cluster XXXXXXXXX Validated
2021-07-21 11:16:29 +0000: Cluster XXXXXXXXX has DRS mode: fullyAutomated
2021-07-21 11:16:29 +0000: Validating If Any Hosts in Cluster XXXXXXXXX for deployment is in Maintainance Mode...
2021-07-21 11:16:33 +0000: All the Hosts in Cluster XXXXXXXXX are not in Maitainance Mode
2021-07-21 11:16:33 +0000: Validating Network LS-AF-P-S-HCIBench_DI01_Workers...
2021-07-21 11:16:33 +0000: ------------------------------------------------------------------------------
2021-07-21 11:16:33 +0000: Found 4 LS-AF-P-S-HCIBench_DI01_Workers
2021-07-21 11:16:33 +0000: ------------------------------------------------------------------------------

The only "strange" thing I see here is that it finds 4 networks. However, that is expected, as this is an NSX-T overlay-backed segment which spans multiple VDSes (we have 1 NSX VDS per cluster), so this shouldn't be an issue.

Do you guys have any suggestions on how to fix this? For my use cases I really need to re-use my VMs for consecutive tests.

Which logs should I forward to you for further troubleshooting? I hope you can help me out with this one guys.

P.S. I've also sent you an e-mail, as then I can attach logs.

Thanks and regards!


Jul 16, 2021

Toward the end of a test validation I am getting the following error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xbcae4b]

Aug 09, 2021

What version of HCIBench?

What version of vCenter?

Can you send us the log files at (all files under '/opt/automation/logs')?

Jul 12, 2021

Using version 2.6.0, I deployed 10 guest VMs with DHCP enabled and got IPs from ["", "", "", "", "", "", "", "", "", ""].

Some VMs respond to ping:
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.386 ms
64 bytes from icmp_seq=2 ttl=64 time=0.200 ms
64 bytes from icmp_seq=3 ttl=64 time=0.205 ms
64 bytes from icmp_seq=4 ttl=64 time=0.178 ms
64 bytes from icmp_seq=5 ttl=64 time=0.207 ms

--- ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 81ms

but some VMs are unreachable:

PING ( 56(84) bytes of data.
From icmp_seq=1 Destination Host Unreachable
From icmp_seq=2 Destination Host Unreachable
From icmp_seq=3 Destination Host Unreachable
From icmp_seq=4 Destination Host Unreachable
From icmp_seq=5 Destination Host Unreachable

--- ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 101ms
pipe 4
Not able to ping VMs ["hci-fio-datastore-180789-0-2"], try another time...
Can't Ping VMs ["hci-fio-datastore-180789-0-2"] by their IPs [""]

Tried in a different vCenter several times; I always see this issue.
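The "Not able to ping VMs" check above boils down to pinging each worker VM and reading the ping summary line. A rough, hypothetical sketch of that parsing step in Python (the function names are mine, not HCIBench's):

```python
import re

def parse_ping_summary(line):
    """Parse a ping summary line such as
    '5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 101ms'
    and return (transmitted, received, loss_percent)."""
    m = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?([\d.]+)% packet loss",
        line,
    )
    if not m:
        raise ValueError(f"not a ping summary line: {line!r}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))

def reachable(summary_line):
    """A VM counts as reachable only if at least one reply came back."""
    _, received, _ = parse_ping_summary(summary_line)
    return received > 0
```

Applied to the two summaries above, the first VM parses as fully reachable (5 of 5 received) and the second as unreachable (0 of 5), which is why HCIBench aborts on that one VM.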

Jul 12, 2021

Hi Zhong, you can try looking at the user guide for some good pointers on network troubleshooting. Here are my top questions:

- If you are using standard vSwitches, are all the ports tagged with the same/correct VLAN?
- Are all the physical switch trunk ports tagged with the same/correct VLAN?
- Are the unpingable VMs always on the same hosts? If you take a VM with an IP that you can ping and vMotion it to a host with a VM that you can't ping, does it cease to be reachable?
- If you are using NSX you can create an overlay backed segment for the workers. It only needs to be L2.

If you still can't locate the issue, send us an email and we can set up a call to help you out.

Jul 14, 2021

Looks like an environment issue; it works in the newly deployed environment.

Jul 15, 2021

Okay, that's great! Reach out if you need anything else.

Jul 13, 2021

The environment has been re-installed; I will retry and send logs. We used a VDS; the unpingable VMs were on the same host, but some VMs on that host are pingable.

Jul 12, 2021

Could you check whether the VM that HCIBench can ping is on the same ESXi host as HCIBench?
This is usually caused by the ESXi hosts being unable to talk to each other through the VLAN you specified.

Jul 13, 2021

The environment has been re-installed; I will retry and send logs. The hosts can ping each other through the VLAN.