New enhanced connection troubleshoot for Azure Networking

On 1 March 2023, Microsoft announced that “New enhanced connection troubleshoot” for Azure Network Watcher had reached general availability. Previously, Azure Network Watcher provided specialised standalone tools for network troubleshooting; these have now been consolidated into one place, with additional tests and actionable insights to assist with troubleshooting.

Complex network paths
Network troubleshooting can be difficult and time consuming.

With customers migrating advanced, high-performance workloads to Azure, it’s essential to have better oversight and management of the intricate networks that support them. A lack of visibility can make issues difficult to diagnose, leaving customers with limited control and feeling trapped in a “black box.” To improve the troubleshooting experience, Azure Network Watcher consolidates its existing tools and adds the following features:

  • A unified solution for troubleshooting NSGs, user-defined routes, and blocked ports
  • Actionable insights with step-by-step guidance to resolve issues
  • Identification of configuration issues impacting connectivity, such as:
      • NSG rules that are blocking traffic
      • Inability to open a socket at the specified source port
      • No servers listening on designated destination ports
      • Misconfigured or missing routes

These new features are not available via the portal at the moment:

(Image: Connection Troubleshoot via the portal, which does not display the enhanced connection troubleshoot results.)

The portal will display that there are connectivity issues, but it will not provide the enhanced information. That information is accessible via PowerShell, the Azure CLI, and the REST API, so let’s use those to uncover the real reason the connection is failing.
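The Azure CLI offers an equivalent check via az network watcher test-connectivity, and the REST route can be exercised directly from PowerShell. Below is a minimal sketch using Invoke-AzRestMethod against the connectivityCheck operation; the subscription ID, Network Watcher name and api-version are placeholder assumptions, so substitute your own values:

# Sketch only: invoke the Network Watcher connectivityCheck REST operation.
# The subscription ID, resource names and api-version are placeholders.
$body = @{
    source      = @{ resourceId = "/subscriptions/<sub-id>/resourceGroups/ConnectivityTest/providers/Microsoft.Compute/virtualMachines/Machine1" }
    destination = @{ resourceId = "/subscriptions/<sub-id>/resourceGroups/ConnectivityTest/providers/Microsoft.Compute/virtualMachines/Machine2"; port = 445 }
} | ConvertTo-Json -Depth 5

# connectivityCheck is a long-running operation; the initial response is a
# 202 with a Location header that must be polled for the final result.
Invoke-AzRestMethod -Method POST -Payload $body `
    -Path "/subscriptions/<sub-id>/resourceGroups/NetworkWatcherRG/providers/Microsoft.Network/networkWatchers/NetworkWatcher_australiaeast/connectivityCheck?api-version=2022-09-01"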

Accessing “enhanced connection troubleshoot” output via PowerShell

I am using the following PowerShell to test the connection between the two machines:

# Retrieve the regional Network Watcher instance and the two test VMs.
$nw  = Get-AzNetworkWatcher -Location australiaeast
$svm = Get-AzVM -Name Machine1
$dvm = Get-AzVM -Name Machine2

# Test SMB (TCP 445) connectivity from Machine1 to Machine2.
Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw -SourceId $svm.Id -DestinationId $dvm.Id -DestinationPort 445

This returns the following result, with the hops rendered as JSON:

ConnectionStatus : Unreachable
AvgLatencyInMs   :
MinLatencyInMs   :
MaxLatencyInMs   :
ProbesSent       : 30
ProbesFailed     : 30
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "a49b4961-b82f-49da-ae2c-8470a9f4c8a6",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "6c6f06de-ea3c-45e3-8a1d-372624475ced"
                       ],
                       "Issues": [
                         {
                           "Origin": "Local",
                           "Severity": "Error",
                           "Type": "GuestFirewall",
                           "Context": []
                         }
                       ]
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "6c6f06de-ea3c-45e3-8a1d-372624475ced",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": []
                     }
                   ]

As you can see, the issues discovered are explained in more detail; in this case, the local guest firewall is affecting the communication. If we check the local Defender firewall on Machine1, we can see there is a specific rule blocking this traffic:

(Image: the blocked outbound protocols rule in Windows Defender Firewall.)
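The same rule can be surfaced from inside the guest with the built-in NetSecurity cmdlets. A minimal sketch, assuming we are hunting for enabled outbound block rules that cover port 445:

# Run inside Machine1: list enabled outbound block rules and flag any
# whose port filter covers TCP 445 (or all ports).
Get-NetFirewallRule -Direction Outbound -Action Block -Enabled True | ForEach-Object {
    $portFilter = $_ | Get-NetFirewallPortFilter
    if ($portFilter.RemotePort -contains '445' -or $portFilter.RemotePort -eq 'Any') {
        [pscustomobject]@{
            Name       = $_.DisplayName
            Protocol   = $portFilter.Protocol
            RemotePort = $portFilter.RemotePort
        }
    }
}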

If we remove the local firewall rule, connectivity is restored.
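A minimal sketch of the removal, again run inside the guest; the display name here is hypothetical, so match it to whatever the previous query returned:

# Hypothetical rule name; substitute the rule found on your machine.
Remove-NetFirewallRule -DisplayName "Block Outbound 445"

Re-running Test-AzNetworkWatcherConnectivity then shows the path as reachable: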

ConnectionStatus : Reachable
AvgLatencyInMs   : 1
MinLatencyInMs   : 1
MaxLatencyInMs   : 2
ProbesSent       : 66
ProbesFailed     : 0
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "f1b763a1-f7cc-48b6-aec7-f132d3fdadf8",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "7c9c103c-44ab-4fd8-9444-22354e5f9672"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "7c9c103c-44ab-4fd8-9444-22354e5f9672",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": []
                     }
                   ]

The enhanced connection troubleshoot can detect six fault types (a sketch for surfacing them programmatically follows this list):

  • Source high CPU utilisation
  • Source high memory utilisation
  • Source Guest firewall
  • DNS resolution
  • Network security rule configuration
  • User defined route configuration
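Because each fault is reported as an entry in a hop’s Issues collection, the results can also be inspected programmatically. A minimal sketch, assuming the Hops, Issues, Origin, Severity and Type property names shown in the output above:

# Capture the test result and flatten any reported issues, one row per issue.
$result = Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw `
    -SourceId $svm.Id -DestinationId $dvm.Id -DestinationPort 445

foreach ($hop in $result.Hops) {
    foreach ($issue in $hop.Issues) {
        [pscustomobject]@{
            HopAddress = $hop.Address
            Origin     = $issue.Origin
            Severity   = $issue.Severity
            IssueType  = $issue.Type
        }
    }
}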

The first four fault types are returned by the Network Watcher Agent VM extension for Windows, as demonstrated above; the remaining two come from the Azure fabric. When a network security group on the source or destination is misconfigured, our issue returns, but the output clearly identifies where, and which network security group rule, is at fault.
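A sketch of how such a fault might be reproduced in this lab, adding an inbound deny rule for port 445 to Machine2’s network security group (the NSG and rule names match the output below):

# Add an inbound deny rule for TCP 445 to Machine2's NSG and push the change.
Get-AzNetworkSecurityGroup -Name "Machine2-nsg" -ResourceGroupName "ConnectivityTest" |
    Add-AzNetworkSecurityRuleConfig -Name "DenyAnyCustom445Inbound" `
        -Direction Inbound -Access Deny -Priority 100 -Protocol "*" `
        -SourceAddressPrefix "*" -SourcePortRange "*" `
        -DestinationAddressPrefix "*" -DestinationPortRange 445 |
    Set-AzNetworkSecurityGroup

With that rule in place, the connectivity test fails again, and the output pinpoints the offending rule: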

ConnectionStatus : Unreachable
AvgLatencyInMs   :
MinLatencyInMs   :
MaxLatencyInMs   :
ProbesSent       : 30
ProbesFailed     : 30
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "3cbcbdbe-a6ec-454f-ad2e-946d6731278a",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "29e33dac-45ae-4ea3-8a9d-83dccddcc0eb"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "29e33dac-45ae-4ea3-8a9d-83dccddcc0eb",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": [
                         {
                           "Origin": "Inbound",
                           "Severity": "Error",
                           "Type": "NetworkSecurityRule",
                           "Context": [
                             {
                               "key": "RuleName",
                               "value": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/ConnectivityTest/providers/Microsoft.Network/networkSecurityGroups/Ma
                   chine2-nsg/SecurityRules/DenyAnyCustom445Inbound"
                             }
                           ]
                         }
                       ]
                     }
                   ]
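Because the output names the exact rule, remediation is straightforward. A minimal sketch removing it with the same Az.Network cmdlets:

# Remove the deny rule identified in the troubleshoot output and push the change.
Get-AzNetworkSecurityGroup -Name "Machine2-nsg" -ResourceGroupName "ConnectivityTest" |
    Remove-AzNetworkSecurityRuleConfig -Name "DenyAnyCustom445Inbound" |
    Set-AzNetworkSecurityGroup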

In addition to fault detection, IP Flow is also part of the enhanced connection troubleshoot, providing a list of hops to a service. An excerpt of a trace to a public storage account is shown below:

PS C:\temp> Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw -SourceId $svm.Id -DestinationAddress https://announcementtest.blob.core.windows.net/test1 -DestinationPort 443

ConnectionStatus : Reachable
AvgLatencyInMs   : 1
MinLatencyInMs   : 1
MaxLatencyInMs   : 1
ProbesSent       : 66
ProbesFailed     : 0
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "23eb09fd-b5fa-4be1-83f2-caf09d18ada0",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "78f3961c-9937-4679-97a7-4a19f4d1232a"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "PublicLoadBalancer",
                       "Id": "78f3961c-9937-4679-97a7-4a19f4d1232a",
                       "Address": "20.157.155.128",
                       "NextHopIds": [
                         "574ad521-7ab7-470c-b5aa-f1b4e6088888",
                         "e717c4bd-7916-45bd-b3d1-f8eecc7ed1e3",
                         "cbe6f6a6-4281-402c-a81d-c4e3d30d2247",
                         "84769cde-3f92-4134-8d48-82141f2d9bfd",
                         "aa7c2b73-0892-4d15-96c6-45b9b033829c",
                         "1c3e3043-98f2-4510-b37f-307d3a98a55b",
                         "b97778cb-9ece-4e87-bf6d-71b90fac3847",
                         "cb92d16d-d4fe-4233-b958-a4d3dbe78303",
                         "ec9a2753-3a60-4fce-9d92-7dbbc0d0219d",
                         "df2b1a3e-6555-424c-8e48-5cc0feba3623"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualNetwork",
                       "Id": "574ad521-7ab7-470c-b5aa-f1b4e6088888",
                       "Address": "10.124.144.2",
                       "NextHopIds": [],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualNetwork",
                       "Id": "e717c4bd-7916-45bd-b3d1-f8eecc7ed1e3",
                       "Address": "10.124.146.2",
                       "NextHopIds": [],
                       "Issues": []
                      },
                      ...
                    ]

Centralising the troubleshooting tools under one command is obviously a great enhancement, but the added visibility into configuration and system performance makes this a great update for your troubleshooting toolbox.
