Knowledge Base#

January 23, 2023
in Troubleshooting, VM
1 min read

Windows is Unable to Detect a Virtual Disk Drive

Virtual Disk Drive not detected in Windows

In most cases this is because the disk drive Interface Type is set to virtIO-SCSI (which is the default value) on any virtual machine that is being created. If that is the case, you will need to load the virtIO drivers during the Windows installation process. Windows does not natively recognize virtIO interfaces, so it cannot see the virtual disk.

Open Source virtIO drivers can be downloaded from here courtesy of Fedora Linux.

If Redhat virtIO drivers are required due to signed driver needs within Windows, a Redhat account will need to be made on their website and the drivers downloaded directly from them.

Installing virtIO Drivers during Windows Install

During a Windows installation, users can use the console interface tools to change the CD-ROM image to the newly downloaded virtIO drivers iso. Information on using the console interface tools can be found in the inline help within the category titled VDI under the section Using the Console.

Alternatively, you can choose to change the virtual disk drive Interface type to SATA and Windows will find the virtual disk and continue with the installation. !! info "This will come with a performance impact due to SATA drivers being software emulated."

Document Information

Last Updated: 2024-08-29
vergeOS Version: 4.12.6

January 23, 2023
in Troubleshooting, VM
1 min read

Windows - Slow to Format a New Disk

Formatting a Virtual Disk with Windows 2012 (and later) Hosts May Take Longer Than Expected

Windows Server 2012 (and later) hosts will, by default, issue SCSI TRIM and Unmap commands equivalent to the entire size of the virtual disk. This behavior is the same even if the "Perform a quick format" option is checked, which significantly slows down the format process.

It is possible to disable the SCSI TRIM and Unmap feature on the host for the duration of the format.

To Disable TRIM and Unmap

Using a Windows CMD window on the host, issue the following command:

fsutil behavior set DisableDeleteNotify 1

To Re-enable the Feature

Use the following command to re-enable the Trim and Unmap feature:

fsutil behavior set DisableDeleteNotify 0

To Verify the Current Setting

You can verify the current Trim and Unmap setting by issuing the following command:

fsutil behavior query DisableDeleteNotify

The output will show one of the following:

DisableDeleteNotify=0 - The 'Trim and Unmap' feature is on (enabled).
DisableDeleteNotify=1 - The 'Trim and Unmap' feature is off (disabled).

Affected Versions

Only Windows Server 2012 and later hosts are affected. All earlier versions (e.g., Windows 2008) do not exhibit the same issue.

Non-server Versions

Non-server versions of Windows (e.g., Windows 8.x and 10.x) do not support the DisableDeleteNotify parameter.

Document Information

Last Updated: 2024-08-29
VergeOS Version: 4.12.6

January 23, 2023
in Troubleshooting, VM
3 min read

Windows Restored VM Not Bootable

After restoring a copy of a virtual machine from a recent snapshot, the restored copy may fail to boot properly. The VM may stop with a blue screen message which reads:

Your PC ran into a problem and needs to restart. We're just collecting some error info, and then we'll restart for you.

There are several guest-level issues that can cause a VM running Windows to not start successfully. Below are the most common causes and their corresponding solutions.

Common Causes and Solutions

1. Non-Quiesced Snapshots

One of the most frequent causes of a restored VM failing to boot is that the snapshot was not taken in a clean (Quiesced) state. A quiesced snapshot ensures that the VM's memory and disk I/O are in a stable state, making the restored VM more likely to boot successfully. Without quiescing, the snapshot could have captured an unstable or inconsistent state.

Solution:

Ensure that when snapshots are taken, they are quiesced. Quiescing allows the OS to pause I/O operations, flush memory, and ensure that no incomplete transactions are saved in the snapshot.
For future backups, enable the Quiesce Snapshots option when scheduling backups for Windows VMs. This feature ensures that the system is in a stable state before taking a snapshot.

2. Pending or Partially Installed Windows Updates

If Windows updates were partially installed or in progress when the snapshot was taken, the restored VM might experience issues booting due to an incomplete or corrupted update state.

Solution:

Boot the VM into Safe Mode and complete any pending updates.
You can also attempt to disable the Windows Update service temporarily to allow the VM to boot without applying incomplete updates. Once booted, manually re-enable and check for updates.
Review the Windows Update Logs using Event Viewer to identify any problematic updates that might need to be rolled back or reinstalled.

3. Driver Incompatibility or Missing Drivers

Sometimes, the VM's hardware configuration in VergeOS (e.g., disk controllers, network adapters) may differ from the original environment, causing issues with booting due to incompatible or missing drivers. This is especially common when restoring VMs from a different hypervisor.

Solution:

Boot the VM using Windows Recovery and attempt to repair the system automatically.
Verify that the appropriate Virtio or SCSI drivers are installed, especially if the VM is using Virtio interfaces for storage or networking.
If the issue persists, boot into Safe Mode and manually update the VM's drivers from the Device Manager.

4. Corrupted Boot Loader

If the Windows bootloader was corrupted in the snapshot, the restored VM will not boot properly. This could happen if the system was performing a critical task related to the boot process (like an update or disk operation) when the snapshot was taken.

Solution:

Use the Windows Recovery Environment (WinRE) to repair the bootloader: 1. Boot the VM using a Windows installation disk or recovery media. 2. Select Repair your computer > Troubleshoot > Advanced options > Startup Repair. 3. If Startup Repair doesn’t work, open a Command Prompt and run the following commands:
```
bootrec /fixmbr
bootrec /fixboot
bootrec /rebuildbcd
```
These commands will repair the Master Boot Record (MBR) and rebuild the Boot Configuration Data (BCD).

5. Hardware Configuration Changes

Changes to the VM’s hardware configuration, such as CPU count, memory allocation, or disk type, may cause instability or prevent the VM from booting.

Solution:

Verify that the VM’s hardware configuration in VergeOS matches the original configuration from when the snapshot was taken.
If you made any changes, such as increasing memory or changing the number of CPUs, try reverting to the original configuration to see if the VM boots properly.

Best Practice: Manage Windows Updates in Guest VMs

A guest VM running Windows OS, and experiencing an unexpected restart, is often found to be caused by the Microsoft Windows Update service being configured to automatically apply updates that frequently require a restart.

Recommendations: - Schedule snapshot creation during maintenance windows when Windows updates are not being applied. - Configure Windows Update settings to avoid automatic installations or reboots, especially on critical VMs. Instead, use a manual update process during scheduled maintenance periods. - Regularly review the Windows Update logs in Event Viewer to detect potential issues related to updates that could affect the stability of the VM.

Document Information

Last Updated: 2024-09-03
VergeOS Version: 4.12.6

January 23, 2023
in Troubleshooting, Migration
1 min read

Workloads Failing to Migrate

Reasons That Workloads May Fail to Migrate

A workload is any process that is running on a node. Common workloads include Virtual Machines (VM), NAS Services, Networking, and Tenant Nodes.

The main reasons a workload fails to migrate from one node to another in the system are:

Insufficient available resources: There may not be enough resources (such as RAM) on the target node to run the workload you're trying to migrate. Check the amount of RAM consumed by the workload (VM or Tenant node), then review the resources available on the target node.
Pinned VM configuration: A VM may be pinned to a specific node. Review the VM’s settings and check the CPU Type setting. If the CPU Type is set to Host Processor, the VM will be unable to migrate. In this case, the VM must be powered off before it can be migrated successfully.
Tenant node migration issues: Tenant nodes may also face migration issues for the same reasons as listed above. Log into the Tenant User Interface, and check the following:
Verify that each Tenant node has sufficient available resources to host the migrating tenant workloads.
Verify that each Tenant VM is not configured with the CPU Type set to Host Processor.

Document Information

Last Updated: 2024-08-29
vergeOS Version: 4.12.6

January 23, 2023
in Best Practices, Troubleshooting
2 min read

Proper Power Sequence

Proper Shutdown Sequence for a VergeOS Environment

To power off a cluster (a collection of two or more nodes) follow these steps:

Check any running workloads on each node of the cluster. Navigate to the node dashboard for each node and review the Running Machines section.
If there are tenants running on any of the nodes, log into those tenant environments and gracefully shut down all running workloads.
Power off all running workloads on each node, including VMs, tenant nodes, VMware backup services, and NAS services (if applicable).

vNet Containers

There is no need to manually stop any running vNet containers; they will be gracefully stopped automatically in the subsequent steps.

After stopping all running workloads, navigate to the Cluster dashboard for the cluster you wish to power off.
Select Power Off from the left-hand menu to begin shutting down each node in the cluster.
Finally, navigate to System -> Clusters and select Power Off in the left menu to power off the entire cluster.

IMPORTANT

If an environment contains multiple clusters, ALWAYS shut down the cluster containing the controller nodes (Node1 & Node2) LAST.

Proper Power On Sequence for a VergeOS Environment

To properly power on a VergeOS environment, perform the following steps:

Power on Node1.
Once Node1 is online, power on Node2.
Power on all other nodes, waiting approximately 1 minute between power actions.
On the main dashboard, verify that the environment is Green and Online.

Document Information

Last Updated: 2024-08-29
vergeOS Version: 4.12.6