SCOM Alert: WinRM is not functional or not the right version

On System Center 2012 R2 Operations Manager (SCOM) with UR2, Critical alerts from System Center Advisor (SCA) started to appear, reporting “WinRM is not functional or not the right version” for my System Center 2012 R2 Virtual Machine Manager (SCVMM) server (also with UR2).

The alert description is:

An issue was found with Windows Remote Management (WinRM) on the specified server.
The specified server cannot be used for VMM server roles such as host/library/PXE server/WSUS server/VMM management server until the issue is resolved. See the troubleshooting article for more information.
Path: /VMM Server/NAME_OF_SERVER
Details: Exception attempting to access WinRM service on agent
Agent FQDN: NAME_OF_SERVER

After reviewing the KB article that SCA links to and following its troubleshooting steps, I concluded that the WinRM version on my SCVMM server was newer than the version the SCA alert reports.

In this environment I have clustered Hyper-V 2012 and Hyper-V 2012 R2 hosts managed by SCVMM 2012 R2. The version on the SCVMM server and on the Hyper-V 2012 R2 hosts is the same (see picture below).

clip_image003

The version on the Hyper-V 2012 hosts, on the other hand, is different because of the OS (see picture below).

clip_image006
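The same comparison can be made from PowerShell instead of screenshots. This is a minimal sketch, assuming PowerShell 3.0 or later and that the account running it has rights on each server. The server names are hypothetical placeholders, not names from this environment.

```powershell
# Hypothetical server names; replace with your SCVMM server and Hyper-V hosts.
$servers = 'VMM01', 'HV2012-01', 'HV2012R2-01'

foreach ($server in $servers) {
    # Test-WSMan returns the WS-Management identity of the remote service.
    # With -Authentication Default the response also includes the OS version.
    $identity = Test-WSMan -ComputerName $server -Authentication Default

    [pscustomobject]@{
        Server         = $server
        ProductVersion = $identity.ProductVersion
    }
}
```

Comparing the ProductVersion strings side by side shows whether the difference follows the operating system, as it did here.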

In this case I went ahead and safely added this alert to the ignore list in SCA. I recommend ignoring the alert only for the specific Hyper-V servers affected, not the entire alert. That way the rule continues to be monitored, just not for those Hyper-V servers.

iSCSI connection issues with Hyper-V Server

If you have a Hyper-V cluster connected to a storage solution through iSCSI, you may sometimes get errors reporting that your CSV volumes went offline. In some cases Windows Failover Clustering recovers and brings the volumes back online automatically, and your virtual machines keep running. In other cases the CSV volume is not recovered, and you have to bring the volume online and then start all the virtual machines manually.

You may see the following sequence of events in Event Viewer or in the Failover Cluster Manager events:

Error – iScsiPrt 20

Connection to the target was lost. The initiator will attempt to retry the connection.

This event is logged when the initiator loses its connection to the target while the connection was in the iSCSI Full Feature Phase. It typically happens when there are network problems, a network cable is removed, a network switch is shut down, or the target resets the connection. In all cases the initiator will attempt to re-establish the TCP connection.

Error – iScsiPrt 7

The initiator could not send an iSCSI PDU. Error status is given in the dump data.

This event is logged when the initiator could not send an iSCSI PDU to the target.

Information – iScsiPrt 34

A connection to the target was lost, but Initiator successfully reconnected to the target. Dump data contains the target name.

This means that iSCSI connectivity was restored.
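To see whether a node is logging this sequence, the three event IDs can be pulled from the System log in one query. This is a minimal sketch, assuming the events are written by the iScsiPrt source to the System log. The 24-hour window is an arbitrary choice for illustration.

```powershell
# Query the System log for the iScsiPrt events described above (IDs 7, 20 and 34).
# -ErrorAction SilentlyContinue suppresses the error Get-WinEvent raises when nothing matches.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'iScsiPrt'
    Id           = 7, 20, 34
    StartTime    = (Get-Date).AddDays(-1)   # assumption: look back 24 hours
} -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName, Message -AutoSize -Wrap
```

Sorting by time makes the lost, failed-to-send, reconnected pattern easy to spot.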

How to solve it

The following steps stopped these events from appearing in Event Viewer and resolved the iSCSI issues in my Hyper-V cluster:

1. Install following KBs

2. Configuring Network Prioritization

You can customize Network Prioritization (NP) if the cluster does not automatically assign networks to the traffic pattern that you want. Changing a network’s metric changes its position in the ranked order, and hence its function. For example, you may want Cluster Network 3 to be used for “Live Migration Traffic” because it is the fastest, so you would change its Metric to a value between 1000 and 1100, such as 1050, so that it is ranked second on the list. Once Cluster Network 3 has the second-lowest metric, it will be used for Live Migration Traffic.

To change the value of a network metric, run:
$n = Get-ClusterNetwork "Cluster Network 3"
$n.Metric = 1050

This will change the metric of Cluster Network 3 to 1050.

Now you get the following output from running
Get-ClusterNetwork | ft Name, Metric, AutoMetric

Name                       Metric     AutoMetric
----                       ------     ----------
Cluster Network 1          1000       True
Cluster Network 3          1050       False
Cluster Network 2          1100       True
Cluster Network 4          10000      True
Cluster Network 5          10100      True

You may have noticed that there is a property associated with each network called AutoMetric. This indicates whether the Metric was set using the default values (True) or was later adjusted by an admin (False). This gives insight into whether NP has been configured on the cluster. Using this flag, it is possible to change the value of a network back to its original, automatically assigned value by running:
$n = Get-ClusterNetwork "Cluster Network 3"
$n.AutoMetric = $true

3. Disable the TRIM feature

Run the following command to turn off the Windows TRIM feature:

fsutil behavior set disabledeletenotify 1
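A quick way to confirm the change, or to undo it later, is to query the same setting. This is a small sketch to run from an elevated prompt. The exact wording of the query output varies between Windows versions, so it is not reproduced here.

```powershell
# Show the current value: 1 means delete notifications (TRIM) are disabled, 0 means enabled.
fsutil behavior query DisableDeleteNotify

# To turn TRIM back on later, set the value back to 0:
# fsutil behavior set DisableDeleteNotify 0
```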

4. Disable ODX

To disable ODX on the Hyper-V server, follow these steps:

1. Open a Windows PowerShell session as an administrator.

2. Check whether ODX is currently enabled (it is by default) by verifying that the FilterSupportedFeaturesMode value in the registry equals 0. To do so, type the following command:

Get-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode"

3. Disable ODX support. To do so, type the following command:

Set-ItemProperty hklm:\system\currentcontrolset\control\filesystem -Name "FilterSupportedFeaturesMode" -Value 1
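The check and the change can be combined into one short script that reports the state in plain words. This is a minimal sketch, assuming the FilterSupportedFeaturesMode value already exists under the FileSystem key. The $path and $mode variable names are only illustrative.

```powershell
# Registry key that holds the ODX switch described in the steps above.
$path = 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem'

# Read the current value: 0 means ODX is enabled, 1 means it is disabled.
$mode = (Get-ItemProperty -Path $path -Name 'FilterSupportedFeaturesMode').FilterSupportedFeaturesMode

if ($mode -eq 0) {
    'ODX is currently enabled'
} else {
    'ODX is currently disabled'
}

# To re-enable ODX later, set the value back to 0:
# Set-ItemProperty -Path $path -Name 'FilterSupportedFeaturesMode' -Value 0
```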

Cluster Shared Volumes (CSV) errors on Hyper-V Cluster

In a failover cluster, virtual machines can use Cluster Shared Volumes that are on the same LUN (disk), while still being able to fail over (or move from node to node) independently of one another. Virtual machines can use a Cluster Shared Volume only when communication between the cluster nodes and the volume is functioning correctly, including network connectivity, access, drivers, and other factors.

You probably didn’t notice any issues with your VMs, but you may be getting the following events on your Hyper-V cluster nodes regarding the CSV volume:

Warning – Disk 153

The description for Event ID 153 from source disk cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

The following information was included with the event:

\Device\Harddisk5\DR5

the message resource is present but the message is not found in the string/message table

Information – Microsoft-Windows-FailoverClustering 5121

The description for Event ID 5121 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Volume2

Cluster Disk 2 – Volume2

Error – Microsoft-Windows-FailoverClustering 5120

The description for Event ID 5120 from source Microsoft-Windows-FailoverClustering cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

Volume1

Cluster Disk 1 – Volume1

STATUS_DEVICE_BUSY(80000011)

This means there has been an interruption to communication between a cluster node and a volume in Cluster Shared Volumes. The interruption may be short enough that it is not noticeable, or long enough that it interferes with services and applications using the volume.
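The three events above can be collected from a node in one pass. This is a minimal sketch, assuming the provider names match the event sources quoted above (disk and Microsoft-Windows-FailoverClustering). The seven-day window is an arbitrary choice.

```powershell
$since = (Get-Date).AddDays(-7)   # assumption: look back one week

# One filter per event source, using the event IDs listed above.
$filters = @(
    @{ LogName = 'System'; ProviderName = 'Microsoft-Windows-FailoverClustering'; Id = 5120, 5121; StartTime = $since },
    @{ LogName = 'System'; ProviderName = 'disk'; Id = 153; StartTime = $since }
)

# Run each query, ignore "no events found" errors, and show the results in time order.
$filters |
    ForEach-Object { Get-WinEvent -FilterHashtable $_ -ErrorAction SilentlyContinue } |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, ProviderName, Id, Message -Wrap
```

Seeing disk 153 warnings close in time to 5120 errors suggests the interruption is on the storage path rather than in the cluster itself.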

How to resolve it

CSV – Review events related to communication with the volume

To perform the following procedure, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

To open Event Viewer and view events related to failover clustering:

1. If Server Manager is not already open, click Start, click Administrative Tools, and then click Server Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

2. In the console tree, expand Diagnostics, expand Event Viewer, expand Windows Logs, and then click System.

3. To filter the events so that only events with a Source of FailoverClustering are shown, in the Actions pane, click Filter Current Log. On the Filter tab, in the Event sources box, select FailoverClustering. Select other options as appropriate, and then click OK.

4. To sort the displayed events by date and time, in the center pane, click the Date and Time column heading.

CSV – Check storage and network configuration

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Gathering information about the condition and configuration of a disk in Cluster Shared Volumes

To gather information about the condition and configuration of a disk in Cluster Shared Volumes:

1. Scan appropriate event logs for errors that are related to the disk.

2. Review information available in the interface for the storage and if needed, contact the vendor for information about the storage.

3. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

4. In the Failover Cluster Manager snap-in, expand the console tree and click Cluster Shared Volumes. In the center pane, expand the listing for the volume that you are gathering information about. View the status of the volume.

5. Still in the center pane, to prepare for testing a disk in Cluster Shared Volumes, right-click the disk, click Take this resource offline, and then if prompted, confirm your choice. Repeat this action for any other disks that you want to test.

6. Right-click the cluster containing the Cluster Shared Volumes, and then click Validate This Cluster.

7. On the Testing Options page, select Run only tests I select.

8. On the Test Selection page, clear the check boxes for System Configuration and Network. This leaves the tests for Cluster Configuration, Inventory, and Storage. You can run all these tests, or you can select only the specific tests that appear relevant to your situation.

NOTE: Running the Storage tests causes downtime in your cluster. This is not recommended if you are troubleshooting a production environment.

9. Follow the instructions in the wizard to run the tests.

10. On the Summary page, click View Report.

11. Under Results by Category, click Storage, click any test that is not labelled as Success, and then view the results.

12. Scroll back to the top of the report, and under Results by Category, click Cluster Configuration, and then click List Cluster Network Information. Confirm that any network that you intend for communication between nodes and Cluster Shared Volumes is labelled either Internal use or Internal and client use. Confirm that other networks (for example, networks used only for iSCSI and not for cluster network communication) do not have these labels.

13. If the information in the report shows that one or more networks are not configured correctly, return to the Failover Cluster Manager snap-in and expand Networks. Right-click the network that you want to modify, click Properties, and then make sure that the settings for Allow the cluster to use this network and Allow clients to connect through this network are configured as intended.

14. To bring disks back online, click Cluster Shared Volumes and, in the center pane, right-click a disk, and then click Bring this resource online. Repeat this action for any other disks that you want to bring online again.
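The validation part of this procedure can also be started from PowerShell. This is a minimal sketch, assuming the FailoverClusters module is available on the node and that the category names passed to -Include match the ones shown in the wizard. Running Test-Cluster -List first is a way to confirm the names on your system.

```powershell
Import-Module FailoverClusters

# List the available validation tests and their categories.
Test-Cluster -List

# Run only the Inventory and Cluster Configuration categories.
# Storage tests are deliberately left out here, because they require the disks to be offline.
Test-Cluster -Include 'Inventory', 'Cluster Configuration'
```

Test-Cluster writes the same validation report that the wizard offers through View Report.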

Verifying settings for a network designated for network communication with Cluster Shared Volumes

To verify settings for a network designated for network communication with Cluster Shared Volumes:

1. Click Start, click Control Panel, click Network and Internet, and then click Network and Sharing Center.

2. In the Tasks pane, click Change adapter settings.

3. Right-click the connection you want, and then click Properties.

4. Make sure that the following check boxes are selected:

  • Client for Microsoft Networks
  • File and Printer Sharing for Microsoft Networks

Verifying that the required NTLM authentication is allowed

1. On a node in the cluster, to see the security policies that are in effect locally, click Start, click Administrative Tools, and then click Local Security Policy.

2. Navigate to Security Settings\Local Policies\Security Options.

3. In the center pane, click the Policy heading to sort the policies alphabetically.

4. Review Network security: Restrict NTLM: Add remote server exceptions for NTLM authentication and the items that follow it. If items related to “server exceptions” are marked Disabled, or other items have specific settings, a policy may be in place that is interfering with NTLM authentication on this server. If this is the case, contact an appropriate administrator (for example, your administrator for Active Directory or security) to ensure that NTLM authentication is allowed for cluster nodes that are using Cluster Shared Volumes.

Finding more information about the error codes that some event messages contain

To find more information about the error codes that some event messages contain:

1. View the event, and note the error code.

2. Look up more information about the error code in one of two ways:

NET HELPMSG errorcode

How to verify it

Confirm that the Cluster Shared Volume can come online. If there have been recent problems with writing to the volume, it can be appropriate to monitor event logs and monitor the function of the corresponding clustered virtual machine, to confirm that the problems have been resolved.

To perform the following procedures, you must be a member of the local Administrators group on each clustered server, and the account you use must be a domain account, or you must have been delegated the equivalent authority.

Confirming that a Cluster Shared Volume can come online

To confirm that a Cluster Shared Volume can come online:

1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

2. In the Failover Cluster Manager snap-in, if the cluster you want to manage is not displayed, in the console tree, right-click Failover Cluster Manager, click Manage a Cluster, and then select or specify the cluster that you want.

3. If the console tree is collapsed, expand the tree under the cluster you want to manage, and then click Cluster Shared Volumes.

4. In the center pane, expand the listing for the volume that you are verifying. View the status of the volume.

5. If a volume is offline, to bring it online, right-click the volume and then click Bring this resource online.

Using a Windows PowerShell command to check the status of a resource in a failover cluster

To use a Windows PowerShell command to check the status of a resource in a failover cluster:

1. On a node in the cluster, click Start, point to Administrative Tools, and then click Windows PowerShell Modules. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

2. Type: Get-ClusterSharedVolume

If you run the preceding command without specifying a resource name, status is displayed for all Cluster Shared Volumes in the cluster.
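For a cluster with many volumes, it can help to show only the relevant columns and then list just the volumes that are not online. This is a minimal sketch using the same cmdlet. The property names Name, State and OwnerNode are the ones the cmdlet displays by default.

```powershell
# Show every Cluster Shared Volume with its state and current owner node.
Get-ClusterSharedVolume | Format-Table Name, State, OwnerNode -AutoSize

# List only the volumes that are not online.
Get-ClusterSharedVolume | Where-Object { $_.State -ne 'Online' }
```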

MPIO on Hyper-V Server

In versions of Windows Server prior to Windows Server 2012, you had to download and install Multipath I/O (MPIO). Since Windows Server 2012, MPIO is a feature that you can simply enable. Because the feature ships with the server, the corresponding PowerShell cmdlets are available as well.

Use of the MPIO module in Windows PowerShell requires an “elevated” PowerShell window, opened with Administrator privileges.

How to do it

 

Installing MPIO using the GUI

Hyper-V Server has no GUI, but you can install the feature remotely from another server, or from RSAT installed on Windows 8.1, using the Server Manager console. Follow these steps:

1. Open Server Manager Console

2. Browse to the Hyper-V server on which you want to enable MPIO. To do that, click All Servers and then click the Hyper-V server.

image

3. Right-Click on the Hyper-V Server and click on Add Roles and Features

4. Click Next four times (to reach the Select features window).

image

5. In the Select features window, select Multipath I/O and click Next.

image

6. Click Install to enable the feature.

Installing and Managing MPIO using PowerShell

Enable or Disable the MPIO Feature

If the MPIO feature is not currently installed, use the following command to enable the MPIO feature:

Enable-WindowsOptionalFeature -Online -FeatureName MultiPathIO

clip_image007

To disable the MPIO feature, use the following command:

Disable-WindowsOptionalFeature -Online -FeatureName MultiPathIO

Listing commands available in the MPIO module

The commands available in the MPIO module can be listed using Get-Command, as shown below:

clip_image009
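In text form, the listing in the screenshot can be produced with the following command. It assumes the MPIO feature is already enabled, so that the module is present.

```powershell
# List every cmdlet and function exported by the MPIO module.
Get-Command -Module MPIO
```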

Full help and example content for the MPIO module is available via the following method:

  • In PowerShell, after importing the MPIO module or using any MPIO cmdlet, updated help can be downloaded from the internet by running the following command:
    • Update-Help

Tips and Tricks

Configuring MPIO using PowerShell

If these steps are performed prior to connecting devices of the desired BusType, you can typically avoid the need for a restart.

  • Install the MPIO feature on a new Windows Server 2012 installation.
  • Configure MPIO to automatically claim all iSCSI devices.
  • Configure the default Load Balance policy for Round Robin.
  • Set the Windows Disk timeout to 60 seconds.

Here is what this script would look like:

# Enable the MPIO Feature

Enable-WindowsOptionalFeature -Online -FeatureName MultiPathIO

# Enable automatic claiming of ISCSI devices for MPIO

Enable-MSDSMAutomaticClaim -BusType iSCSI

# Set the default load balance policy of all newly claimed devices to Round Robin

Set-MSDSMGlobalLoadBalancePolicy -Policy RR

# Set the Windows Disk timeout to 60 seconds

Set-MPIOSetting -NewDiskTimeout 60
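After running the script, the resulting configuration can be read back with the matching Get cmdlets from the same module. This is a minimal verification sketch to run in an elevated session. The comments describe what to look for and are not captured output.

```powershell
# Show which bus types are claimed automatically; iSCSI should now be enabled.
Get-MSDSMAutomaticClaimSettings

# Show the default load balance policy for newly claimed devices; RR is Round Robin.
Get-MSDSMGlobalDefaultLoadBalancePolicy

# Show the MPIO timer settings, including the disk timeout set to 60 seconds above.
Get-MPIOSetting
```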

Hyper-V Best Practices Analyzer

Sometimes when you deploy a Hyper-V server, you don’t know whether you missed a configuration step or whether you are following best practices for security, configuration, or even supportability, in case you need help from Microsoft Support. To help with this, Microsoft has created a set of rules for improving our environments, referred to as best practices. However, it is not easy to know all of them or to make sure your Hyper-V servers comply with all of these practices.

To make this job easier, Windows Server comes with the Best Practices Analyzer (BPA). It has a set of best practices and rules which it will compare against all the components of your server and it will then generate a report with all the problems that are found during the scan. The report will provide helpful details such as problems, impact, and resolutions for possible issues.

Windows Server comes with best practices for almost all the roles as well as a specific one only for Hyper-V with all the practices to analyze your host server, configuration, and virtual machines.

The Hyper-V Best Practices Analyzer works only with the pre-installed Hyper-V Role. Make sure that Hyper-V is installed and as a best practice, run the BPA after every server installation and configuration is performed.

How to do it

By following these steps, you will see how to run the best practices analyzer for Hyper-V and explore its results:

1. Open the Server Manager from the Windows Taskbar.

2. From the Server Manager window, click on Hyper-V on the pane on the left-hand side. Then use the scroll bar on the right-hand side to scroll down until the best practices analyzer option can be seen.

3. Under Best Practices Analyzer, navigate to Tasks | Start BPA Scan, as shown in the following screenshot:

clip_image002

4. In the Select Servers window, select the Hyper-V servers that you want to scan and click on Start Scan.

5. The scan will start on all the selected servers. When the scan has finished, the BPA results will be shown in Server Manager, under Best Practices Analyzer.

6. When completed, the scan results will be listed in three columns—Server Name, Severity, and Title. Use the filters above each column to organize the information based on your queries.

7. Click on one of the results to see the information provided by BPA. The following screenshot shows an example of a warning scan result and its description:

image

8. Open the results and analyze the problem, impact, and resolution for each server.

9. Use the filter at the top to find only warnings and errors.

10. After identifying the results, you can apply the resolutions provided by the Hyper-V BPA.

BPA on PowerShell

All of the Windows best practices are available through PowerShell as well. You can scan, filter, get the results, and extract reports using PowerShell cmdlets. To start a scan using the Hyper-V BPA, type the following command:

Invoke-BpaModel -BestPracticesModelId Microsoft/Windows/Hyper-V

After invoking the Hyper-V BPA, you can use the Get-BPAResult command to analyze the results. The following command shows the BPA scan results:

Get-BpaResult -BestPracticesModelId Microsoft/Windows/Hyper-V

The following screenshot is an example of how the Get-BPAResult output could look:

clip_image006

If you want to filter only the warnings and the errors by using PowerShell, you can also use the following command:

Get-BpaResult -BestPracticesModelId Microsoft/Windows/Hyper-V | Where-Object {$_.Severity -eq "Warning" -or $_.Severity -eq "Error"}
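Before reading individual results, it can be useful to see how many findings fall into each severity. This is a minimal sketch that groups the same Get-BpaResult output by its Severity property. It assumes a scan has already been run with Invoke-BpaModel.

```powershell
# Count the BPA results per severity and show the most frequent severity first.
Get-BpaResult -BestPracticesModelId Microsoft/Windows/Hyper-V |
    Group-Object Severity |
    Sort-Object Count -Descending |
    Format-Table Name, Count -AutoSize
```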

Summary

The Best Practice Analyzer for Hyper-V has 74 scans to identify which settings are not configured, based on the Microsoft documentation and practices. It is enabled automatically when the Hyper-V role is installed.

When BPA scans the servers, it shows the results for every scan, providing helpful details about what was scanned, the impact, and even how to resolve any problems it finds. It will also give you the option to apply the necessary changes for your server in compliance with the best practices.

BPA is available through Server Manager and can be used at any time. The recommendation is to scan every server after their final configurations and also on a monthly basis after that.

Hyper-V BPA will also display information about Microsoft Support. If the server has a configuration that is not supported by Microsoft, it will inform you of this through the reports.

After running and applying the recommended settings, you can then be sure that your servers have all the best practices, currently recommended by Microsoft.

 

Tips and Tricks

Using PowerShell to create HTML reports with the BPA results

To make the PowerShell results easier to read, you can produce a BPA HTML report. The following script uses the previous Get-BpaResult filter example to show only the warning and error results:

# CSS applied to the generated report

$head = '<style>

BODY{font-family:Verdana; background-color:lightblue;} TABLE{border-width: 1px;border-style: solid;border-color: black;border-collapse: collapse;} TH{font-size:1.3em; border-width: 1px;padding: 2px;border-style: solid;border-color: black;background-color:#FFCCCC} TD{border-width: 1px;padding: 2px;border-style: solid;border-color: black;background-color:yellow}

</style>'

$header = "<H1>Hyper-V BPA Errors and Warnings Results</H1>"

$title = "Hyper-V BPA"

# Keep only errors and warnings, convert them to HTML, and save the report in the current folder

Get-BpaResult -BestPracticesModelId Microsoft/Windows/Hyper-V | Where-Object {$_.Severity -eq "Error" -or $_.Severity -eq "Warning"} | ConvertTo-HTML -head $head -body $header -title $title | Out-File report.htm

# Open the report in the default browser

.\report.htm

The following screenshot shows the output file that is created after running the script:

clip_image008