iSCSI connection issues with Hyper-V Server

If you have a Hyper-V cluster connected through iSCSI to a storage solution, you may sometimes get errors reporting that your CSV volumes went offline. In some cases, the Windows Failover Cluster recovers and brings the volume back online automatically, and your virtual machines do not stop. In other cases, if the CSV volume is not recovered, you have to bring the volume online and then start all the virtual machines manually.

You may see the following sequence of events in Event Viewer or in the Failover Cluster Manager events:

Error – iScsiPrt 20

Connection to the target was lost. The initiator will attempt to retry the connection.

This means that the initiator lost its connection to the target and will try to reconnect.

This event is logged when the initiator loses its connection to the target while the connection is in the iSCSI Full Feature Phase. It typically happens when there are network problems, a network cable is removed, a network switch is shut down, or the target resets the connection. In all cases the initiator attempts to re-establish the TCP connection.

Error – iScsiPrt 7

The initiator could not send an iSCSI PDU. Error status is given in the dump data.

This event is logged when the initiator could not send an iSCSI PDU to the target. The error status is recorded in the dump data of the event.

Information – iScsiPrt 34

A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.

This means that iSCSI connectivity was restored.
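To look for this sequence without clicking through Event Viewer, the System log can be queried from PowerShell. The following is a minimal sketch, assuming the events are written to the System log under the provider name iScsiPrt as listed above. The 24-hour look-back window is an arbitrary value chosen for illustration.

```powershell
# Assumption: the events are written to the System log by the iScsiPrt provider.
# The 24-hour look-back window is an arbitrary value chosen for illustration.
$since = (Get-Date).AddHours(-24)

# Get-WinEvent raises an error when nothing matches, so that case is silenced.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'iScsiPrt'
    Id           = 20, 7, 34
    StartTime    = $since
} -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize -Wrap
```

Sorting by time makes it easier to see whether each event 20 and 7 is followed by an event 34, meaning the connection came back on its own.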

How to solve it

These are the steps that stopped these events from appearing in Event Viewer and, of course, solved the iSCSI issues within my Hyper-V cluster:

1. Install the following KBs

2. Configure Network Prioritization

Network Prioritization (NP) can be customized if the cluster does not automatically assign networks to the traffic pattern that you want. Changing a network's metric changes its position in the ranked order, and hence its function. For example, you may want Cluster Network 3 to be used for “Live Migration Traffic” because it is the fastest, so you would change its Metric to a value between 1000 and 1100, such as 1050, so that it is ranked second on the list. Once Cluster Network 3 has the second-lowest metric, it will be used for Live Migration Traffic.

To change the value of a network metric, run:
$n = Get-ClusterNetwork "Cluster Network 3"
$n.Metric = 1050

This will change the metric of Cluster Network 3 to 1050.

Now you get the following output from running
Get-ClusterNetwork | ft Name, Metric, AutoMetric

Name                       Metric     AutoMetric
----                       ------     ----------
Cluster Network 1          1000       True
Cluster Network 3          1050       False
Cluster Network 2          1100       True
Cluster Network 4          10000      True
Cluster Network 5          10100      True

You may have noticed that there is a property associated with each network called AutoMetric. This indicates whether the Metric was set using the default values (True) or was later adjusted by an admin (False). This gives insight into whether NP has been configured on the cluster. Using this flag, it is possible to change the metric of a network back to its original, automatically assigned value by running:
$n = Get-ClusterNetwork "Cluster Network 3"
$n.AutoMetric = $true

3. Disable the TRIM feature

Run the following command to turn off the Windows TRIM feature:

fsutil behavior set disabledeletenotify 1
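To confirm that the change took effect, the same fsutil subcommand has a query form. This is a small sketch to be run from an elevated prompt. The exact wording of the output differs between Windows versions, so it is not reproduced here.

```powershell
# Query the current setting. A value of 1 means delete notifications (TRIM)
# are disabled; a value of 0 means they are enabled.
fsutil behavior query disabledeletenotify

# To turn TRIM back on later, set the value back to 0 (left commented out here).
# fsutil behavior set disabledeletenotify 0
```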

4. Disable ODX

To disable ODX on the Hyper-V Server, follow these steps:

1. Open a Windows PowerShell session as an administrator.

2. Check whether ODX is currently enabled (it is by default) by verifying that the FilterSupportedFeaturesMode value in the registry equals 0. To do so, type the following command:

Get-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode"

3. Disable ODX support. To do so, type the following command:

Set-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode" -Value 1
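The check and the change can be combined into a short script that reports the current state in plain words. This is only a sketch under the assumption stated in the steps above, namely that 0 means ODX is enabled and any other value means it is disabled. The variable names are illustrative.

```powershell
# Registry location and value name taken from the steps above.
$path = 'hklm:\system\currentcontrolset\control\filesystem'
$name = 'FilterSupportedFeaturesMode'

# Read the value without failing if it does not exist on this system.
$prop = Get-ItemProperty -Path $path -Name $name -ErrorAction SilentlyContinue

if ($null -eq $prop) {
    "$name was not found under $path"
}
elseif ($prop.$name -eq 0) {
    'ODX is currently enabled'
}
else {
    "ODX is currently disabled ($name = $($prop.$name))"
}

# To re-enable ODX later, set the value back to 0 (left commented out here).
# Set-ItemProperty -Path $path -Name $name -Value 0
```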

Cluster Shared Volumes (CSV) errors on Hyper-V Cluster

In a failover cluster, virtual machines can use Cluster Shared Volumes that are on the same LUN (disk), while still being able to fail over (or move from node to node) independently of one another. Virtual machines can use a Cluster Shared Volume only when communication between the cluster nodes and the volume is functioning correctly, including network connectivity, access, drivers, and other factors.

You probably didn’t notice any issues with your VMs, but you may be getting the following events on your Hyper-V cluster nodes regarding the CSV volume:

Warning – Disk 153

“The description for Event ID 153 from source disk cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

The following information was included with the event:

\Device\Harddisk5\DR5

the message resource is present but the message is not found in the string/message table

Information – Microsoft-Windows-FailoverClustering 5121

“The description for Event ID 5121 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Volume2

Cluster Disk 2 – Volume2

Error – Microsoft-Windows-FailoverClustering 5120

“The description for Event ID 5120 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Volume1

Cluster Disk 1 – Volume1

STATUS_DEVICE_BUSY(80000011)

These events mean that communication between a cluster node and a volume in Cluster Shared Volumes was interrupted. The interruption may be short enough that it is not noticeable, or long enough that it interferes with services and applications using the volume.

How to resolve it

CSV – Review events related to communication with the volume

To perform the following procedure, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

To open Event Viewer and view events related to failover clustering:

1. If Server Manager is not already open, click Start, click Administrative Tools, and then click Server Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

2. In the console tree, expand Diagnostics, expand Event Viewer, expand Windows Logs, and then click System.

3. To filter the events so that only events with a Source of FailoverClustering are shown, in the Actions pane, click Filter Current Log. On the Filter tab, in the Event sources box, select FailoverClustering. Select other options as appropriate, and then click OK.

4. To sort the displayed events by date and time, in the center pane, click the Date and Time column heading.
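The same filtering can be done from PowerShell, which is handy on a Hyper-V Server installation without the graphical tools. The following is a minimal sketch, assuming the provider names match the event sources shown earlier (Microsoft-Windows-FailoverClustering and disk). The limit of 50 events is an arbitrary choice.

```powershell
# CSV-related cluster events (5120 and 5121) from the System log, newest first.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-FailoverClustering'
    Id           = 5120, 5121
} -MaxEvents 50 -ErrorAction SilentlyContinue |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize -Wrap

# Disk warnings (event 153) that often appear around the same time.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'disk'
    Id           = 153
} -MaxEvents 50 -ErrorAction SilentlyContinue |
    Format-Table TimeCreated, Id, Message -AutoSize -Wrap
```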

CSV – Check storage and network configuration

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Gathering information about the condition and configuration of a disk in Cluster Shared Volumes

To gather information about the condition and configuration of a disk in Cluster Shared Volumes:

1. Scan appropriate event logs for errors that are related to the disk.

2. Review information available in the interface for the storage and if needed, contact the vendor for information about the storage.

3. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

4. In the Failover Cluster Manager snap-in, expand the console tree and click Cluster Shared Volumes. In the center pane, expand the listing for the volume that you are gathering information about. View the status of the volume.

5. Still in the center pane, to prepare for testing a disk in Cluster Shared Volumes, right-click the disk, click Take this resource offline, and then if prompted, confirm your choice. Repeat this action for any other disks that you want to test.

6. Right-click the cluster containing the Cluster Shared Volumes, and then click Validate This Cluster.

7. On the Testing Options page, select Run only tests I select.

8. On the Test Selection page, clear the check boxes for System Configuration and Network. This leaves the tests for Cluster Configuration, Inventory, and Storage. You can run all these tests, or you can select only the specific tests that appear relevant to your situation.

NOTE: If you run the Storage tests, you will have downtime in your cluster. This is not recommended if you are troubleshooting in a production environment.

9. Follow the instructions in the wizard to run the tests.

10. On the Summary page, click View Report.

11. Under Results by Category, click Storage, click any test that is not labelled as Success, and then view the results.

12. Scroll back to the top of the report, and under Results by Category, click Cluster Configuration, and then click List Cluster Network Information. Confirm that any network that you intend for communication between nodes and Cluster Shared Volumes is labelled either Internal use or Internal and client use. Confirm that other networks (for example, networks used only for iSCSI and not for cluster network communication) do not have these labels.

13. If the information in the report shows that one or more networks are not configured correctly, return to the Failover Cluster Manager snap-in and expand Networks. Right-click the network that you want to modify, click Properties, and then make sure that the settings for Allow the cluster to use this network and Allow clients to connect through this network are configured as intended.

14. To bring disks back online, click Cluster Shared Volumes and, in the center pane, right-click a disk, and then click Bring this resource online. Repeat this action for any other disks that you want to bring online again.
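Parts of this procedure can also be run from PowerShell. The sketch below assumes the FailoverClusters module is available on the node. It runs only the Inventory and Cluster Configuration validation categories, so the Storage tests and the downtime they cause are skipped, and then lists the role of each cluster network.

```powershell
Import-Module FailoverClusters

# Validate the cluster, limited to categories that do not take disks offline.
Test-Cluster -Include 'Inventory', 'Cluster Configuration'

# List how each network is used by the cluster.
# Role 0 (None)             = not used for cluster communication (for example, iSCSI only)
# Role 1 (Cluster)          = internal cluster use
# Role 3 (ClusterAndClient) = internal and client use
Get-ClusterNetwork | Format-Table Name, Role, Address -AutoSize
```

Test-Cluster writes its report to a file and returns the file object, so the full results can still be opened in a browser afterwards.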

Verifying settings for a network designated for network communication with Cluster Shared Volumes

To verify settings for a network designated for network communication with Cluster Shared Volumes:

1. Click Start, click Control Panel, click Network and Internet, and then click Network and Sharing Center.

2. In the Tasks pane, click Change adapter settings.

3. Right-click the connection you want, and then click Properties.

4. Make sure that the following check boxes are selected:

  • Client for Microsoft Networks
  • File and Printer Sharing for Microsoft Networks
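On Windows Server 2012 and later, the same two bindings can be checked with the NetAdapter cmdlets. This is a sketch only: 'CSV Network' is a hypothetical adapter name and should be replaced with the name of the connection you are verifying.

```powershell
# 'CSV Network' is a hypothetical adapter name used for illustration.
$adapter = 'CSV Network'

# ms_msclient = Client for Microsoft Networks
# ms_server   = File and Printer Sharing for Microsoft Networks
Get-NetAdapterBinding -Name $adapter -ComponentID ms_msclient, ms_server |
    Format-Table Name, DisplayName, ComponentID, Enabled -AutoSize

# To enable a binding that shows Enabled = False (left commented out here):
# Enable-NetAdapterBinding -Name $adapter -ComponentID ms_msclient
```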

Verifying that the required NTLM authentication is allowed

1. On a node in the cluster, to see the security policies that are in effect locally, click Start, click Administrative Tools, and then click Local Security Policy.

2. Navigate to Security Settings\Local Policies\Security Options.

3. In the center pane, click the Policy heading to sort the policies alphabetically.

4. Review Network security: Restrict NTLM: Add remote server exceptions for NTLM authentication and the items that follow it. If items related to “server exceptions” are marked Disabled, or other items have specific settings, a policy may be in place that is interfering with NTLM authentication on this server. If this is the case, contact an appropriate administrator (for example, your administrator for Active Directory or security) to ensure that NTLM authentication is allowed for cluster nodes that are using Cluster Shared Volumes.

Opening Event Viewer and viewing events related to failover clustering

To open Event Viewer and view events related to failover clustering:

1. If Server Manager is not already open, click Start, click Administrative Tools, and then click Server Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

2. In the console tree, expand Diagnostics, expand Event Viewer, expand Windows Logs, and then click System.

3. To filter the events so that only events with a Source of FailoverClustering are shown, in the Actions pane, click Filter Current Log. On the Filter tab, in the Event sources box, select FailoverClustering. Select other options as appropriate, and then click OK.

4. To sort the displayed events by date and time, in the center pane, click the Date and Time column heading.

Finding more information about the error codes that some event messages contain

To find more information about the error codes that some event messages contain:

1. View the event, and note the error code.

2. Look up more information about the error code, for example by typing the following at a command prompt:

NET HELPMSG errorcode
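A comparable lookup is possible from PowerShell through the .NET Win32Exception class, which turns a numeric Win32 error code into its message text. This is a small sketch. The code 5 is only an example value chosen for illustration.

```powershell
# 5 is an example Win32 error code chosen for illustration.
$code = 5

# Win32Exception resolves the numeric code to the system's message text.
(New-Object System.ComponentModel.Win32Exception $code).Message
```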

How to verify it

Confirm that the Cluster Shared Volume can come online. If there have been recent problems with writing to the volume, it can be appropriate to monitor event logs and monitor the function of the corresponding clustered virtual machine, to confirm that the problems have been resolved.

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Confirming that a Cluster Shared Volume can come online

To confirm that a Cluster Shared Volume can come online:

1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

2. In the Failover Cluster Manager snap-in, if the cluster you want to manage is not displayed, in the console tree, right-click Failover Cluster Manager, click Manage a Cluster, and then select or specify the cluster that you want.

3. If the console tree is collapsed, expand the tree under the cluster you want to manage, and then click Cluster Shared Volumes.

4. In the center pane, expand the listing for the volume that you are verifying. View the status of the volume.

5. If a volume is offline, to bring it online, right-click the volume and then click Bring this resource online.

Using a Windows PowerShell command to check the status of a resource in a failover cluster

To use a Windows PowerShell command to check the status of a resource in a failover cluster:

1. On a node in the cluster, click Start, point to Administrative Tools, and then click Windows PowerShell Modules. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

2. Type: Get-ClusterSharedVolume

If you run the preceding command without specifying a resource name, status is displayed for all Cluster Shared Volumes in the cluster.
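For a quicker overview, the output can be trimmed to the columns that matter and filtered down to the volumes that need attention. A minimal sketch, assuming the FailoverClusters module is available on the node:

```powershell
Import-Module FailoverClusters

# Show name, state and owner node for every Cluster Shared Volume.
Get-ClusterSharedVolume | Format-Table Name, State, OwnerNode -AutoSize

# List only the volumes that are not online.
Get-ClusterSharedVolume | Where-Object { $_.State -ne 'Online' }
```

If the second command returns nothing, all Cluster Shared Volumes are online.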

Updates on Hyper-V Server? How to see what has been installed?

When you need to see which updates have been installed on your Hyper-V server, you can use this PowerShell cmdlet to list all the updates.

Get-WmiObject -Class Win32_QuickFixEngineering | select description,hotfixid,installedon


Then, if you need to check whether all the nodes of your cluster, or all your Hyper-V servers, have the same patch level, just run this cmdlet remotely through PowerShell ISE.

If you need help running PowerShell cmdlets remotely, see one of my previous posts (Managing Hyper-V Server remotely through PowerShell).
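One way to do that comparison is to collect the hotfix list from two nodes and let Compare-Object report the differences. This is only a sketch: 'HV-NODE1' and 'HV-NODE2' are hypothetical computer names, and it assumes WMI access to both nodes is allowed from the machine where you run it.

```powershell
# 'HV-NODE1' and 'HV-NODE2' are hypothetical node names used for illustration.
$reference  = Get-WmiObject -Class Win32_QuickFixEngineering -ComputerName 'HV-NODE1'
$difference = Get-WmiObject -Class Win32_QuickFixEngineering -ComputerName 'HV-NODE2'

# Show only the updates that are present on one node but missing on the other.
# SideIndicator '<=' means only on HV-NODE1, '=>' means only on HV-NODE2.
Compare-Object -ReferenceObject $reference -DifferenceObject $difference -Property HotFixID
```

No output means both nodes report the same set of hotfix IDs.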

Integration between Hyper-V Network Virtualization and Windows Server Gateway

One of the features in Windows Server 2012 R2 that was improved since the previous version is network virtualization. Virtual networks are created by using Hyper-V Network Virtualization, which is a technology that was introduced in Windows Server 2012.

Windows Server 2012 R2 now has a service that helps datacenters and clouds route network traffic between virtual and physical networks, including the Internet. The service responsible for routing all this traffic is Windows Server Gateway. Windows Server Gateway is a VM-based software router that is able to route network traffic effectively between different datacenters or between datacenters and the cloud.

How it works

Hyper-V Network Virtualization provides the concept of a virtual machine (VM) network that is independent of the underlying physical network. With this concept of VM networks, which are composed of one or more virtual subnets, the exact physical location of an IP subnet is decoupled from the virtual network topology. As a result, organizations can easily move their subnets to the cloud while preserving their existing IP addresses and topology in the cloud. This ability to preserve infrastructure allows existing services to continue to work, unaware of the physical location of the subnets. That is, Hyper-V Network Virtualization enables a seamless hybrid cloud.

In both private and hybrid cloud environments using Windows Server 2012, however, it was difficult to provide connectivity between VMs on the virtual network and resources on physical networks at local and remote sites, creating a circumstance where virtual subnets were islands separated from the rest of the network.

In Windows Server 2012 R2, Windows Server Gateway routes network traffic between the physical network and VM network resources, regardless of where the resources are located. You can use Windows Server Gateway to route network traffic between physical and virtual networks at the same physical location or at many different physical locations.

One example is, if you have both a physical network and a virtual network at the same physical location, you can deploy a server running Hyper-V that is configured with a Windows Server Gateway VM to act as a forwarding gateway and route traffic between the virtual and physical networks.

Another example is, if your virtual networks exist in the cloud, your cloud can deploy a Windows Server Gateway so that you can create a virtual private network (VPN) site-to-site connection between your VPN server and the cloud’s Windows Server Gateway; when this link is established you can connect to your virtual resources in the cloud over the VPN connection.

Integration between Hyper-V Network Virtualization and Windows Server Gateway

Windows Server Gateway is integrated with Hyper-V Network Virtualization, and is able to route network traffic effectively in circumstances where there are many different tenants – who have isolated virtual networks in the same datacenter.

Multi-tenancy is the ability of a cloud infrastructure to support the virtual machine workloads of multiple tenants, but isolate them from each other, while all of the workloads run on the same infrastructure. The multiple workloads of an individual tenant can interconnect and be managed remotely, but these systems do not interconnect with the workloads of other tenants, nor can other tenants remotely manage them.

How to use

There are different ways that you can use Windows Server Gateway in your organization, depending on the overall solution that you want to achieve. You can use Windows Server Gateway in these situations:

  • Windows Server Gateway as a forwarding gateway for private cloud environments
  • Windows Server Gateway as a site-to-site VPN gateway for hybrid cloud environments
  • Multitenant Network Address Translation (NAT) for VM Internet access
  • Multitenant remote access VPN connections

Windows Server Gateway as a forwarding gateway for private cloud environments

For enterprises that deploy an on-premises private cloud, Windows Server Gateway can act as a forwarding gateway and route traffic between virtual networks and the physical network.

If you have created virtual networks for one or more of your clouds, but many of your key resources (such as Active Directory Domain Services, SharePoint, or DNS) are on your physical network, Windows Server Gateway can route traffic between the virtual network and the physical network to provide users working on the virtual network with all of the services that they need.

In the illustration below, the physical and virtual networks are at the same physical location. Windows Server Gateway is used to route traffic between the physical network and virtual networks.

[Illustration: Windows Server Gateway as a forwarding gateway between the physical network and virtual networks]

Windows Server Gateway as a site-to-site VPN gateway for hybrid cloud environments

If your infrastructure is a hybrid cloud, Windows Server Gateway provides a multitenant gateway solution that allows your tenants to access and manage their resources over site-to-site VPN connections from remote sites, and that allows network traffic flow between virtual resources in your datacenter and their physical network.

In the illustration below, a Cloud Service Provider (CSP), for example Azure, provides datacenter network access to multiple tenants, some of whom have multiple sites across the Internet. In this example, tenants use third-party VPN servers at their corporate sites, while the CSP uses Windows Server Gateway for the site-to-site VPN connections.

[Illustration: Windows Server Gateway as a site-to-site VPN gateway]

Multitenant Network Address Translation (NAT) for VM Internet access

In the illustration below, a home user running a Web browser on their computer makes a purchase on the Internet from a Contoso Web server that is a VM on the Contoso Virtual Network. During the purchasing process, the Web app verifies the credit card information provided by the home user by connecting to a Financial Services company on the Internet. This ability to connect from the virtual network to Internet resources is provided when NAT is enabled on the CSP Windows Server Gateway.

[Illustration: multitenant NAT for VM Internet access]

Multitenant remote access VPN connections

In the illustration below, Administrators use VPN dial-in connections to administer VMs on their corporate virtual networks. The Administrator from Contoso initiates the VPN connection from an Internet-enabled branch office, and connects through the CSP Windows Server Gateway to the Contoso Virtual Network.

Similarly, the Northwind Traders Administrator establishes a VPN connection from a residence office to manage VMs on the Northwind Traders Virtual Network.

[Illustration: multitenant remote access VPN connections]