May 09, 2020

Are there any known issues with v2.3.1 not working with vSAN 7? I have been trying to get a test run on a new cluster, but I get an error on the validate config check that says:
2020-05-09 17:21:31 -0700: Validating Vdbench binary and the workload profiles...
2020-05-09 17:21:32 -0700: Validating VC IP and Credential...
2020-05-09 17:21:33 -0700: VC IP and Credential Validated
2020-05-09 17:21:33 -0700: Validating Datacenter xxxxxx...
2020-05-09 17:21:33 -0700: ------------------------------------------------------------------------------
2020-05-09 17:21:33 -0700: Datacenter xxxxxxx doesn't exist!

I've deployed this OVA multiple times and even deleted and recreated the datacenter and cluster in hopes it would resolve the issue, but no luck.
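One thing worth ruling out with this error is a literal name mismatch: the check appears to be a straight lookup of the datacenter name, so a stray space or a case difference between what's in the HCIBench config and what vCenter reports would produce exactly this failure. Here's a minimal pyVmomi sketch to list the names vCenter actually returns (hypothetical, not part of HCIBench; VC_IP, USER, and PWD are placeholders):

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only: skips certificate checks
si = SmartConnect(host="VC_IP", user="USER", pwd="PWD", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datacenter], True)
    for dc in view.view:
        print(repr(dc.name))  # repr() makes hidden whitespace visible
    view.Destroy()
finally:
    Disconnect(si)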

The automation logs do not help much, as all I get is this error:
2020-05-09 17:23:04 -0700: Checking Existing VMs...
2020-05-09 17:23:05 -0700: Deployment Started.
2020-05-09 17:23:06 -0700: [ERROR] Unknown Failed...
2020-05-09 17:23:06 -0700: VMs Preserved for Debugging Purpose, You Can Run /opt/automation/cleanup-vm.sh to Delete VMs Manually
2020-05-09 17:23:06 -0700: Testing Failed, For Details Please Find Logs in /opt/automation/logs
2020-05-09 17:23:06 -0700: Please Cancel And Re-Run The Testing

Any ideas would be appreciated, thanks.

May 11, 2020

I get a VMware email rejection trying to send the log output to your email address, saying that the recipient is not authorized to accept external email. Both log outputs are in the body of my message above. Here is the vm-health-check.log output:
2020-05-09 17:23:04 -0700: Verifying If Folder Exists...
2020-05-09 17:23:05 -0700: [ERROR] No VM Folder Found
2020-05-09 17:23:05 -0700: [ERROR] Existing VMs not Compatible
2020-05-09 17:23:05 -0700: ABORT: VMs are not existing or Existing VMs are not Compatible

And the Telegraf.log:
# Logfile created on 2020-05-09 17:23:09 -0700 by logger.rb/61378
I, [2020-05-09T17:23:09.996107 #4632] INFO -- : Get the container id of telegraf_vsan: 167b1a65df76f332b215be28e442ef339bcf85a30aa2230b2ee59086ef68f6fa.
I, [2020-05-09T17:23:10.073283 #4632] INFO -- : Get ids of all running telegraf processes.

Apr 29, 2020

I downloaded the latest version and was able to import it successfully. On my 3-host cluster, the validation was successful, but the test never finishes. I have included the log that I can see below; if there is a better log for this stage, I'm happy to grab that as well.

I can see that there are 6 VMs spread across my 3 hosts.

2020-04-29 22:32:52 -0700: Checking Existing VMs...
2020-04-29 22:35:11 -0700: Existing VMs are Successfully Verified.
2020-04-29 22:35:12 -0700: Virtual Disk Preparation ZERO Started.(May take half to couple of hours depending on the size of VMs deployed)
2020-04-29 23:47:14 -0700: Disk Preparation Finished: 4/6 VMs

Apr 30, 2020

Four of the VMs are around 230GB each (of the allocated 128GB); the other two are only 50 and 60-ish GB. The two that are only 50 and 60GB are on the same ESXi host.

All hosts are identical: Supermicro, Xeon 8c/16t, 64GB RAM, 1TB Samsung 970 Pro NVMe (vSAN cache), 6.4TB Intel DC P4600 (vSAN capacity).

Apr 30, 2020

OK, so I restarted one of the hci-fil VMs and the HCIBench tool continued.

The sizes are all still 4x 230GB, 1x 64GB, 1x 49GB.

Apr 30, 2020

I was able to finish the tests after 8 hours. I wanted to see if somebody could help me interpret the results and maybe provide some thoughts on why it's performing so poorly.

Thanks

Apr 30, 2020

If you can upload your results (including the vSAN Observer data) to a shared drive and send us an email at vsanperformance@vmware.com, I'll take a quick look.

Apr 29, 2020

I am having the same issue as Robert below: HCIBench on segment 1, test VMs on segment 2. DHCP is enabled on both segments, and the HCIBench appliance is receiving a DHCP address on both. Test VMs deployed on either segment aren't receiving any IP addresses, regardless of whether they use the segment's DHCP or the HCIBench appliance as the DHCP server.

Apr 29, 2020

Furthermore, if both of the NICs on the HCIBench appliance are connected, I can't browse to the configuration page. Once I disconnect vmnic2 (the VM network NIC), I can browse to it just fine.

If both vmnics are connected, I can't reliably ping either, but if either one is disconnected, ping requests are returned reliably... I think there is a problem with the internal gateway settings...
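For what it's worth, one classic cause of exactly this symptom is two default routes: with both NICs taking DHCP leases, the appliance can end up with a default gateway on each interface, and replies may egress the wrong NIC. A quick way to see what the routing table holds, as a hypothetical Python one-off (assuming iproute2 is available on the appliance):

import subprocess

# Print the default route(s); two entries here would mean traffic can
# leave through either NIC, which matches the unreliable-ping symptom.
print(subprocess.run(["ip", "route", "show", "default"],
                     capture_output=True, text=True).stdout)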

Apr 29, 2020

Hi Tom, this is definitely a configuration that we use in our lab so this behavior is unexpected. Can you reach out to me by email at vsanperformance@vmware.com?

Apr 29, 2020

The issue was a blocked port between HCIBench and the ESXi hosts, which prevented the deployment of the worker VMDK.

Adding a firewall rule to permit ICMP and HTTPS (443) from HCIBench to the ESXi hosts solved the issue.
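For anyone who hits the same thing: a quick probe of TCP 443 from the appliance to each host confirms whether the port is open, since ESXi serves the datastore upload path over HTTPS. A minimal sketch, assuming Python is available on the HCIBench appliance (the host list is a placeholder):

import socket

ESXI_HOSTS = ["esxi-01", "esxi-02", "esxi-03"]  # placeholders

for host in ESXI_HOSTS:
    try:
        # ESXi accepts the worker-VMDK upload over HTTPS on 443
        with socket.create_connection((host, 443), timeout=5):
            print(f"{host}: 443 open")
    except OSError as err:
        print(f"{host}: 443 blocked ({err})")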

Apr 24, 2020

I'm trying to use HCIBench in a v6.7 environment with vSAN, and every time I run the pre-validation I get failed IP tests of one kind or another. It looks like there's always one of the test VMs that fails its ping tests.

2020-04-24 14:43:28 -0700: IP Assignment failed or IP not reachable

From one of the tvm logs:
networks: Pub-Net-1284 = Bench-VLAN
Deploying VM on 10.252.3.64...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 14.4M 100 14.4M 0 0 36.8M 0 --:--:-- --:--:-- --:--:-- 36.8M
Powering on VMs ...
PowerOnVM hci-tvm-vsanDatastore-6-1: success
Waiting for VMs to boot ...
green
hci-tvm-vsanDatastore-6-1: 192.168.3.6
PING 192.168.3.6 (192.168.3.6) 56(84) bytes of data.
From 192.168.2.1 icmp_seq=1 Destination Host Unreachable
From 192.168.2.1 icmp_seq=2 Destination Host Unreachable
... several failed ping attempts...
PowerOffVM hci-tvm-vsanDatastore-6-1: success
Destroy hci-tvm-vsanDatastore-6-1: success
Can't Ping VM VirtualMachine("vm-858") by IP 192.168.3.6
254
2020-04-24 14:43:28 -0700: IP Assignment failed or IP not reachable

Apr 27, 2020

Usually this type of problem is a host-to-host configuration difference (if using standard virtual switches) and/or a misconfigured trunk port on the physical switch (e.g., VLAN not allowed).

Create two VMs: one on host 10.252.3.64 and one on another host. Configure static IPs on those VMs (e.g., 192.168.3.1/24 and 192.168.3.2/24). If the VMs cannot ping each other, but the pings succeed once you vMotion the VM from host 10.252.3.64 to another host, that is strong confirmation of a network configuration problem.
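To make the before-and-after comparison quick to repeat, a small helper like this works (a hypothetical Python sketch run from any Linux machine on the test VLAN, using the example addresses above):

import subprocess

TEST_IPS = ["192.168.3.1", "192.168.3.2"]  # the static test addresses above

for ip in TEST_IPS:
    # Three pings with a 2-second timeout each; returncode 0 means reachable
    ok = subprocess.run(["ping", "-c", "3", "-W", "2", ip],
                        capture_output=True).returncode == 0
    print(f"{ip}: {'reachable' if ok else 'UNREACHABLE'}")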

Apr 24, 2020

The other test VMs have ping results like this:

hci-tvm-vsanDatastore-7-1: 192.168.3.7
deploy-hcitvm-7.log-PING 192.168.3.7 (192.168.3.7) 56(84) bytes of data.
deploy-hcitvm-7.log-64 bytes from 192.168.3.7: icmp_seq=1 ttl=64 time=0.291 ms
deploy-hcitvm-7.log-64 bytes from 192.168.3.7: icmp_seq=2 ttl=64 time=0.161 ms
deploy-hcitvm-7.log-64 bytes from 192.168.3.7: icmp_seq=3 ttl=64 time=0.161 ms
deploy-hcitvm-7.log-64 bytes from 192.168.3.7: icmp_seq=4 ttl=64 time=0.196 ms
deploy-hcitvm-7.log-64 bytes from 192.168.3.7: icmp_seq=5 ttl=64 time=0.163 ms
deploy-hcitvm-7.log-
deploy-hcitvm-7.log:--- 192.168.3.7 ping statistics ---
deploy-hcitvm-7.log-5 packets transmitted, 5 received, 0% packet loss, time 22ms