Introduction:
Inspired by Mike Da Costa’s NSX/NSX-T Troubleshooting Scenarios, I’m putting out an NSX-T troubleshooting challenge to those who know me on Twitter as @spillthensxt. In this scenario, a specific Guest VM has no Distributed Firewall (DFW) rules applied. It’s now time for some NSX-T DFW troubleshooting!
This is a Spill the NSX-T Reader Challenge:
Bragging rights and a cup of tea for the first correct answer posted here or to Twitter!
Serious bonus marks if you can find a VMware reference to back up your answer!
The troubleshooting scenario is as follows: two nearly identical Guest VMs reside on the same ESXi host, but only one of them has DFW rules applied:
Let’s start with a review of the DFW Configuration:
The setup is relatively straightforward. Here we have a single rule dropping traffic from the Guest-VM-Group to the Google-DNS group:
The Guest-VM-Group consists of VMs that start with the name ‘VM’:
The Guest-VM-Group correctly consists of VM1 and VM2. So far so good:
The Google-DNS group consists of IP addresses 8.8.4.4 and 8.8.8.8:
The two Photon Guest VMs, VM1, and VM2 are similar:
root@VM1 [ ~ ]# cat /etc/photon-release
VMware Photon OS 3.0
PHOTON_BUILD_NUMBER=49d932d
root@VM2 [ ~ ]# cat /etc/photon-release
VMware Photon OS 3.0
PHOTON_BUILD_NUMBER=49d932d
Both Guest VMs reside on the same ESXi host:
- both Guest VMs are on ESXi host esxcna01-s1:
[root@esxcna01-s1:~] net-stats -l
PortNum   Type  SubType  SwitchName    MACAddress         ClientName
50331650  4     0        DvsPortset-0  00:50:56:01:44:05  vmnic0
50331652  4     0        DvsPortset-0  00:50:56:01:10:b9  vmnic1
50331654  3     0        DvsPortset-0  00:50:56:01:44:05  vmk0
50331655  3     0        DvsPortset-0  00:50:56:6a:5e:cf  vmk1
50331656  5     9        DvsPortset-0  00:50:56:96:c5:31  VM2.eth0
67108866  4     0        DvsPortset-1  00:50:56:01:10:bb  vmnic2
67108868  3     0        DvsPortset-1  00:50:56:6a:99:31  vmk10
67108869  3     0        DvsPortset-1  00:50:56:66:0f:29  vmk50
67108871  5     9        DvsPortset-1  00:50:56:96:98:58  VM1.eth0
[root@esxcna01-s1:~] esxtop n
12:43:47am up 4:22, 754 worlds, 2 VMs, 2 vCPUs; CPU load average: 0.01, 0.01, 0.01
PORT-ID   USED-BY           TEAM-PNIC  DNAME         PKTTX/s  MbTX/s  PSZTX   PKTRX/s  MbRX/s  PSZRX   %DRPTX  %DRPRX
33554433  Management        n/a        vSwitch0      0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
50331649  Management        n/a        DvsPortset-0  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
50331650  vmnic0            -          DvsPortset-0  1.31     0.00    73.00   90.51    0.21    302.00  0.00    0.00
50331651  Shadow of vmnic0  n/a        DvsPortset-0  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
50331652  vmnic1            -          DvsPortset-0  8.42     0.04    591.00  81.72    0.17    274.00  0.00    0.00
50331653  Shadow of vmnic1  n/a        DvsPortset-0  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
50331654  vmk0              vmnic1     DvsPortset-0  2.81     0.01    289.00  4.49     0.00    62.00   0.00    0.00
50331655  vmk1              vmnic1     DvsPortset-0  5.61     0.03    742.00  9.54     0.01    117.00  0.00    0.00
50331656  71259:VM2.eth0    vmnic0     DvsPortset-0  1.31     0.00    70.00   4.49     0.00    60.00   0.00    0.00
67108865  Management        n/a        DvsPortset-1  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
67108866  vmnic2            -          DvsPortset-1  2.24     0.00    116.00  89.58    0.21    304.00  0.00    0.00
67108867  Shadow of vmnic2  n/a        DvsPortset-1  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
67108868  vmk10             vmnic2     DvsPortset-1  2.24     0.00    66.00   4.30     0.00    60.00   0.00    0.00
67108869  vmk50             void       DvsPortset-1  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
67108870  vdr-vdrPort       vmnic2     DvsPortset-1  0.00     0.00    0.00    0.00     0.00    0.00    0.00    0.00
67108871  69749:VM1.eth0    vmnic2     DvsPortset-1  0.37     0.00    74.00   0.00     0.00    0.00    0.00    0.00
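When comparing two VMs on a busy host, it can help to group the `net-stats -l` rows by switch rather than reading them by eye. The following is only a minimal sketch: the sample text is a subset of the rows shown above, and the names `NET_STATS` and `clients_by_switch` are made up for illustration. It draws no conclusion by itself; it just lines the ports up per switch.

```python
from collections import defaultdict

# Subset of the `net-stats -l` rows shown above (vmk and VM ports only).
NET_STATS = """\
PortNum Type SubType SwitchName MACAddress ClientName
50331654 3 0 DvsPortset-0 00:50:56:01:44:05 vmk0
50331656 5 9 DvsPortset-0 00:50:56:96:c5:31 VM2.eth0
67108868 3 0 DvsPortset-1 00:50:56:6a:99:31 vmk10
67108871 5 9 DvsPortset-1 00:50:56:96:98:58 VM1.eth0
"""


def clients_by_switch(text):
    """Group ClientName values by the SwitchName column (hypothetical helper)."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields and a numeric port number; this skips the header.
        if len(fields) == 6 and fields[0].isdigit():
            groups[fields[3]].append(fields[5])
    return dict(groups)


if __name__ == "__main__":
    for switch, clients in clients_by_switch(NET_STATS).items():
        print(switch, clients)
```

On a live host the same function could be fed the captured output of `net-stats -l` instead of the embedded sample.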
The ESXi host is NSX-T Prepared:
NSX-T VIBs are installed on the Compute cluster ESXi host:
[root@esxcna01-s1:~] esxcli software vib list | grep -iE 'nsx'
nsx-adf              2.5.0.0.0-6.5.14664066   VMware  VMwareCertified  2019-11-18
nsx-aggservice       2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-cli-libs         2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-common-libs      2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-context-mux      2.5.0.0.0esx65-14664127  VMware  VMwareCertified  2019-11-18
nsx-esx-datapath     2.5.0.0.0-6.5.14663999   VMware  VMwareCertified  2019-11-18
nsx-exporter         2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-host             2.5.0.0.0-6.5.14663975   VMware  VMwareCertified  2019-11-18
nsx-metrics-libs     2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-mpa              2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-nestdb-libs      2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-nestdb           2.5.0.0.0-6.5.14664049   VMware  VMwareCertified  2019-11-18
nsx-netcpa           2.5.0.0.0-6.5.14664119   VMware  VMwareCertified  2019-11-18
nsx-netopa           2.5.0.0.0-6.5.14664039   VMware  VMwareCertified  2019-11-18
nsx-opsagent         2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-platform-client  2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-profiling-libs   2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-proxy            2.5.0.0.0-6.5.14664115   VMware  VMwareCertified  2019-11-18
nsx-python-gevent    1.1.0-9273114            VMware  VMwareCertified  2019-11-18
nsx-python-greenlet  0.4.9-12819723           VMware  VMwareCertified  2019-11-18
nsx-python-logging   2.5.0.0.0-6.5.14664066   VMware  VMwareCertified  2019-11-18
nsx-python-protobuf  2.6.1-12818951           VMware  VMwareCertified  2019-11-18
nsx-rpc-libs         2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-sfhc             2.5.0.0.0-6.5.14664082   VMware  VMwareCertified  2019-11-18
nsx-shared-libs      2.5.0.0.0-6.5.14100719   VMware  VMwareCertified  2019-11-18
nsx-upm-libs         2.5.0.0.0-6.5.14664158   VMware  VMwareCertified  2019-11-18
nsx-vdpi             2.5.0.0.0-6.5.14664090   VMware  VMwareCertified  2019-11-18
nsxcli               2.5.0.0.0-6.5.14663983   VMware  VMwareCertified  2019-11-18
[root@esxcna01-s1:~]
The ESXi host’s overall NSX-T status is completely healthy:
[root@esxcna01-s1:~] /etc/init.d/nsxa status
opsAgent is running
[root@esxcna01-s1:~] /etc/init.d/nsx-mpa status
NSX-Management-Plane-Agent is not running
[root@esxcna01-s1:~] /etc/init.d/nsx-proxy status
nsx-proxy agent service is running
[root@esxcna01-s1:~] /etc/init.d/nsx-nestdb status
NSX-NESTDB is running
[root@esxcna01-s1:~] /etc/init.d/netcpad status
netCP agent service is running
[root@esxcna01-s1:~] nsxcli -c get managers
192.168.110.17 Connected (NSX-RPC)
192.168.110.19 Connected (NSX-RPC)
192.168.110.18 Connected (NSX-RPC) *
[root@esxcna01-s1:~] nsxcli -c get controllers
Controller IP   Port  SSL      Status     Is Physical Master  Session State  Controller FQDN
192.168.110.18  1235  enabled  not used   false               null           NA
192.168.110.19  1235  enabled  connected  true                up             NA
192.168.110.17  1235  enabled  not used   false               null           NA
Both Guests have IP connectivity to the Internet:
- let's test to Internet IP address 9.9.9.9:
- both VM1 and VM2 test OK
root@VM1 [ / ]# ping -c 2 9.9.9.9
PING 9.9.9.9 (9.9.9.9) 56(84) bytes of data.
64 bytes from 9.9.9.9: icmp_seq=1 ttl=41 time=34.9 ms
64 bytes from 9.9.9.9: icmp_seq=2 ttl=41 time=35.1 ms
--- 9.9.9.9 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 34.901/35.016/35.132/0.219 ms
root@pVM2 [ / ]# ping -c 2 9.9.9.9
PING 9.9.9.9 (9.9.9.9) 56(84) bytes of data.
64 bytes from 9.9.9.9: icmp_seq=1 ttl=41 time=34.9 ms
64 bytes from 9.9.9.9: icmp_seq=2 ttl=41 time=35.1 ms
--- 9.9.9.9 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 34.901/35.016/35.132/0.219 ms
Both Guest VMs appear in the NSX-T Inventory:
VM1 and VM2 appear in the NSX-T inventory:
Here are VM1’s Inventory details:
Here are VM2’s Inventory details:
There is no DFW exclusion list:
[root@esxcna01-s1:~] nsxcli -c get firewall exclusion
Firewall Exclusion
None
But only VM1 appears in the Guest-VM-Group address set:
esxcna01-s1.core.hypervizor.com> get firewall vifs
Firewall VIFs
VIF count: 1
1 83003831-3d64-4dd1-a1ce-c24b132c51bc
esxcna01-s1.core.hypervizor.com> get firewall 83003831-3d64-4dd1-a1ce-c24b132c51bc addrsets
Firewall Address Sets
Address set count : 2
UUID : 27b8439e-c4b4-44ea-a4d0-27fc133983b0
Address count : 2
ip 8.8.4.4/32 <--- Both Google DNS are in the IP Address set as expected
ip 8.8.8.8/32
UUID : 949d95ed-3009-4834-b26c-f26b0d6f3607
Address count : 2
ip 192.168.70.100 <--- Only VM1 appears in the Guest-VM-Group
mac 00:50:56:96:98:58
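To check address-set membership without eyeballing UUIDs, the `addrsets` output above can be parsed into a small dictionary. This is a rough sketch, not an NSX-T tool: `ADDRSETS` is the output from this post pasted in as text, `parse_addrsets` is a made-up helper name, and the line layout (`UUID : …`, `ip …`, `mac …`) is assumed to match what is shown here.

```python
# Address-set portion of the `get firewall <vif> addrsets` output shown above.
ADDRSETS = """\
Address set count : 2
UUID : 27b8439e-c4b4-44ea-a4d0-27fc133983b0
Address count : 2
ip 8.8.4.4/32
ip 8.8.8.8/32
UUID : 949d95ed-3009-4834-b26c-f26b0d6f3607
Address count : 2
ip 192.168.70.100
mac 00:50:56:96:98:58
"""


def parse_addrsets(text):
    """Return {uuid: [(kind, value), ...]} where kind is 'ip' or 'mac'."""
    sets = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "UUID" and len(fields) >= 3:
            # "UUID : <id>" starts a new address set.
            current = fields[2]
            sets[current] = []
        elif fields[0] in ("ip", "mac") and current is not None and len(fields) >= 2:
            # Anything after the value (such as a trailing annotation) is ignored.
            sets[current].append((fields[0], fields[1]))
    return sets


if __name__ == "__main__":
    vm2_mac = "00:50:56:96:c5:31"  # VM2.eth0 MAC from the net-stats output
    sets = parse_addrsets(ADDRSETS)
    present = any(("mac", vm2_mac) in members for members in sets.values())
    print("VM2 MAC present in an address set:", present)
```

The MAC addresses used for the lookup come from the `net-stats -l` listing earlier in the post.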
VM1 has DFW rules applied, but VM2 does not:
- determine the DFW filter names for each VM:
[root@esxcna01-s1:~] summarize-dvfilter | grep -A 2 VM
world 69749 vmm0:VM1 vcUuid:'50 16 ec 3c e5 9b 78 15-35 af 0c 18 d3 ea a9 19' <--- VM1
 port 67108871 VM1.eth0
  vNic slot 2
   name: nic-69749-eth0-vmware-sfw.2 <--- VM1 DFW filter name
world 71259 vmm0:VM2 vcUuid:'50 16 9b dd 3b 77 a2 6f-f6 4d 62 fc 38 aa 81 ec' <--- VM2
 port 50331656 VM2.eth0
  vNic slot 2
   name: nic-71259-eth0-vmware-sfw.2 <--- VM2 DFW filter name
[root@esxcna01-s1:~]
[root@esxcna01-s1:~] vsipioctl getrules -f nic-69749-eth0-vmware-sfw.2 <--- VM1 has DFW rules
ruleset mainrs {
  # generation number: 0
  # realization time : 2019-12-12T21:07:57
  rule 2049 at 1 inout protocol any from addrset 949d95ed-3009-4834-b26c-f26b0d6f3607 to addrset 27b8439e-c4b4-44ea-a4d0-27fc133983b0 drop;
  rule 2048 at 2 inout protocol any from any to any accept;
  rule 2 at 3 inout protocol any from any to any accept;
}
ruleset mainrs_L2 {
  # generation number: 0
  # realization time : 2019-12-12T21:07:57
  rule 1 at 1 inout ethertype any stateless from any to any accept;
}
[root@esxcna01-s1:~] vsipioctl getrules -f nic-71259-eth0-vmware-sfw.2
No rules. <--- VM2 has no DFW rules!
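On a host with many VMs, picking the filter names out of `summarize-dvfilter` by hand gets tedious. As a hedged illustration, the sketch below pulls the per-VM filter names out of text shaped like the output above. `DVFILTER` is a trimmed copy of that output and `filter_names` is a hypothetical helper; the `vmm0:<name>` and `name: <filter>` patterns are assumed from this post only.

```python
import re

# Trimmed copy of the `summarize-dvfilter | grep -A 2 VM` output shown above.
DVFILTER = """\
world 69749 vmm0:VM1 vcUuid:'50 16 ec 3c e5 9b 78 15-35 af 0c 18 d3 ea a9 19'
 port 67108871 VM1.eth0
  vNic slot 2
   name: nic-69749-eth0-vmware-sfw.2
world 71259 vmm0:VM2 vcUuid:'50 16 9b dd 3b 77 a2 6f-f6 4d 62 fc 38 aa 81 ec'
 port 50331656 VM2.eth0
  vNic slot 2
   name: nic-71259-eth0-vmware-sfw.2
"""


def filter_names(text):
    """Map each VM name to the dvfilter names listed under its world entry."""
    names = {}
    vm = None
    for line in text.splitlines():
        world = re.search(r"\bvmm0:(\S+)", line)
        if world:
            # A "world ... vmm0:<vm>" line starts a new VM section.
            vm = world.group(1)
            continue
        name = re.search(r"\bname:\s*(\S+)", line)
        if name and vm is not None:
            names.setdefault(vm, []).append(name.group(1))
    return names


if __name__ == "__main__":
    for vm, filters in filter_names(DVFILTER).items():
        for f in filters:
            # Print the command used in this post to dump each filter's rules.
            print(f"{vm}: vsipioctl getrules -f {f}")
```

Each printed line is the same `vsipioctl getrules -f <filter>` invocation used in the transcript, so the output can be pasted straight into the ESXi shell.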
Without rules applied, VM2 can ping Google-DNS, which in this scenario is undesired:
root@VM1 [ ~ ]# ping -c 2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 2ms <--- desired result on VM1
root@VM2 [ ~ ]# ping -c 2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=41 time=34.2 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=41 time=37.7 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1ms <--- undesired result on VM2
rtt min/avg/max/mdev = 34.232/35.987/37.743/1.765 ms
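If you want to re-run this check after each change, the pass/fail condition can be read from the ping summary line. The sketch below is only an illustration: `packet_loss` is a made-up helper, the two sample strings are the summary lines from the transcript above, and the regular expression assumes the Linux `ping` summary wording shown there.

```python
import re

# Summary lines copied from the ping transcript above.
VM1_PING = "2 packets transmitted, 0 received, 100% packet loss, time 2ms"
VM2_PING = "2 packets transmitted, 2 received, 0% packet loss, time 1ms"

SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) received, ([\d.]+)% packet loss")


def packet_loss(text):
    """Return the packet-loss percentage from a ping summary as a float."""
    match = SUMMARY.search(text)
    if match is None:
        raise ValueError("no ping summary line found")
    return float(match.group(3))


if __name__ == "__main__":
    # In this scenario the drop rule is working when loss to Google-DNS is 100%.
    for vm, text in (("VM1", VM1_PING), ("VM2", VM2_PING)):
        state = "blocked" if packet_loss(text) == 100.0 else "NOT blocked"
        print(vm, state)
```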
Why does VM2’s DFW Filter have “No rules.”?
How would you handle this NSX-T with jam scenario?
The solution can be found here: https://spillthensxt.com/nsx-t-with-jam-trouble-with-dfw-solved/
Sounds like VM Tools may not be running in VM2 and the alternative IP discovery profiles have not been configured.
VM2 has no VIF attached. Try restarting: /etc/init.d/nsx-opsagent
VM2 is connected to a DVS (DvsPortset-0), so it is out of scope for the DFW, hence “No rules.”, whereas VM1 is connected to an N-VDS (DvsPortset-1), which makes it a candidate for DFW rules. 🙂