Knowledge Base

Windows - Slow to Format a New Disk

Formatting a Virtual Disk with Windows 2012 (and later) Hosts May Take Longer Than Expected

By default, Windows Server 2012 (and later) hosts issue SCSI TRIM and Unmap commands covering the entire size of the virtual disk when formatting it. This happens even if the "Perform a quick format" option is checked, and it significantly slows down the format process.

It is possible to disable the SCSI TRIM and Unmap feature on the host for the duration of the format.

To Disable TRIM and Unmap

In an elevated Windows CMD window (Run as administrator) on the host, issue the following command:

fsutil behavior set DisableDeleteNotify 1

To Re-enable the Feature

Once the format has completed, use the following command to re-enable the TRIM and Unmap feature:

fsutil behavior set DisableDeleteNotify 0
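
The disable, format, re-enable sequence above can be wrapped so that the feature is turned back on even if the format step fails. The following is a minimal sketch, not an official tool: the helper name trim_disabled is hypothetical, it assumes Python is available on the host and that the script runs from an elevated session, and it always restores the value to 0 rather than remembering the previous setting.

```python
import subprocess
from contextlib import contextmanager

# Base command taken from the article; the trailing value (1 or 0) is appended.
FSUTIL_SET = ["fsutil", "behavior", "set", "DisableDeleteNotify"]


@contextmanager
def trim_disabled(run=subprocess.run):
    """Disable TRIM/Unmap on entry and re-enable it on exit, even on error.

    `run` is injectable so the sequence can be exercised without fsutil.
    """
    run(FSUTIL_SET + ["1"], check=True)  # 1 = TRIM and Unmap off
    try:
        yield
    finally:
        run(FSUTIL_SET + ["0"], check=True)  # 0 = TRIM and Unmap back on
```

A caller would place the format step inside a `with trim_disabled():` block. If the host was deliberately running with the feature disabled beforehand, this sketch would not preserve that choice.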

To Verify the Current Setting

You can verify the current TRIM and Unmap setting by issuing the following command:

fsutil behavior query DisableDeleteNotify

The output will show one of the following:

  • DisableDeleteNotify=0 - The TRIM and Unmap feature is on (enabled).
  • DisableDeleteNotify=1 - The TRIM and Unmap feature is off (disabled).
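
Because the setting is inverted (0 means enabled), it is easy to misread the query result in a script. The sketch below interprets captured output text. The function name parse_delete_notify is hypothetical, and the tolerance for spaces around "=" and for an optional file system label in front of the setting (such as NTFS) is an assumption about how output may vary between Windows builds, not something stated in this article.

```python
import re

# Optional leading label (e.g. "NTFS"), then the setting name and its value.
_PATTERN = re.compile(r"(?:(\w+)\s+)?DisableDeleteNotify\s*=\s*([01])")


def parse_delete_notify(output):
    """Map each reported label to True when TRIM/Unmap is enabled (value 0)."""
    result = {}
    for line in output.splitlines():
        match = _PATTERN.search(line)
        if match:
            label = match.group(1) or "default"  # no label on the line
            result[label] = match.group(2) == "0"
    return result
```

For example, passing the text "DisableDeleteNotify=1" yields a single entry whose value is False, meaning the feature is currently off.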

Affected Versions

  • Only Windows Server 2012 and later hosts are affected. All earlier versions (e.g., Windows 2008) do not exhibit the same issue.

Non-server Versions

  • Non-server versions of Windows (e.g., Windows 8.x and 10.x) do not support the DisableDeleteNotify parameter.

Document Information

  • Last Updated: 2024-08-29
  • VergeOS Version: 4.12.6

Windows Restored VM Not Bootable

After restoring a copy of a virtual machine from a recent snapshot, the restored copy may fail to boot properly. The VM may stop with a blue screen message which reads:

Your PC ran into a problem and needs to restart. We're just collecting some error info, and then we'll restart for you.

There are several guest-level issues that can cause a VM running Windows to not start successfully. Below are the most common causes and their corresponding solutions.

Common Causes and Solutions

1. Non-Quiesced Snapshots

One of the most frequent causes of a restored VM failing to boot is that the snapshot was not taken in a clean (quiesced) state. A quiesced snapshot ensures that the VM's memory and disk I/O are in a stable state, making the restored VM more likely to boot successfully. Without quiescing, the snapshot may capture an unstable or inconsistent state.

Solution:
  • Ensure that when snapshots are taken, they are quiesced. Quiescing allows the OS to pause I/O operations, flush memory, and ensure that no incomplete transactions are saved in the snapshot.
  • For future backups, enable the Quiesce Snapshots option when scheduling backups for Windows VMs. This feature ensures that the system is in a stable state before taking a snapshot.

2. Pending or Partially Installed Windows Updates

If Windows updates were partially installed or in progress when the snapshot was taken, the restored VM might experience issues booting due to an incomplete or corrupted update state.

Solution:
  • Boot the VM into Safe Mode and complete any pending updates.
  • You can also attempt to disable the Windows Update service temporarily to allow the VM to boot without applying incomplete updates. Once booted, manually re-enable and check for updates.
  • Review the Windows Update Logs using Event Viewer to identify any problematic updates that might need to be rolled back or reinstalled.

3. Driver Incompatibility or Missing Drivers

Sometimes, the VM's hardware configuration in VergeOS (e.g., disk controllers, network adapters) may differ from the original environment, causing issues with booting due to incompatible or missing drivers. This is especially common when restoring VMs from a different hypervisor.

Solution:
  • Boot the VM using Windows Recovery and attempt to repair the system automatically.
  • Verify that the appropriate Virtio or SCSI drivers are installed, especially if the VM is using Virtio interfaces for storage or networking.
  • If the issue persists, boot into Safe Mode and manually update the VM's drivers from the Device Manager.

4. Corrupted Boot Loader

If the Windows bootloader was corrupted in the snapshot, the restored VM will not boot properly. This could happen if the system was performing a critical task related to the boot process (like an update or disk operation) when the snapshot was taken.

Solution:
  • Use the Windows Recovery Environment (WinRE) to repair the bootloader:
    1. Boot the VM using a Windows installation disk or recovery media.
    2. Select Repair your computer > Troubleshoot > Advanced options > Startup Repair.
    3. If Startup Repair doesn’t work, open a Command Prompt and run the following commands:
       bootrec /fixmbr
       bootrec /fixboot
       bootrec /rebuildbcd

  • These commands rewrite the Master Boot Record (MBR), write a new boot sector, and rebuild the Boot Configuration Data (BCD).

5. Hardware Configuration Changes

Changes to the VM’s hardware configuration, such as CPU count, memory allocation, or disk type, may cause instability or prevent the VM from booting.

Solution:
  • Verify that the VM’s hardware configuration in VergeOS matches the original configuration from when the snapshot was taken.
  • If you made any changes, such as increasing memory or changing the number of CPUs, try reverting to the original configuration to see if the VM boots properly.
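
Comparing the two configurations by eye is error-prone, so a small diff can help. This is a sketch only: the function name and the setting names in the example (cpu_cores, ram_mb, disk_interface) are hypothetical placeholders, and it assumes you have recorded the original settings yourself, since this article does not describe how VergeOS exposes them.

```python
def config_differences(original, current):
    """Return {setting: (original_value, current_value)} for every mismatch.

    A setting present on only one side is reported with None for the other.
    """
    keys = sorted(set(original) | set(current))
    return {
        key: (original.get(key), current.get(key))
        for key in keys
        if original.get(key) != current.get(key)
    }


# Hypothetical values noted at snapshot time versus the restored VM.
at_snapshot = {"cpu_cores": 4, "ram_mb": 8192, "disk_interface": "virtio-scsi"}
restored = {"cpu_cores": 8, "ram_mb": 8192, "disk_interface": "virtio-scsi"}
changed = config_differences(at_snapshot, restored)
```

Any setting listed in the result is a candidate to revert before the next boot attempt.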

Best Practice: Manage Windows Updates in Guest VMs

Unexpected restarts of a guest VM running Windows are often caused by the Windows Update service being configured to apply updates automatically, many of which require a restart.

Recommendations:
  • Schedule snapshot creation during maintenance windows when Windows updates are not being applied.
  • Configure Windows Update settings to avoid automatic installations or reboots, especially on critical VMs. Instead, use a manual update process during scheduled maintenance periods.
  • Regularly review the Windows Update logs in Event Viewer to detect potential issues related to updates that could affect the stability of the VM.


Document Information

  • Last Updated: 2024-09-03
  • VergeOS Version: 4.12.6

Workloads Failing to Migrate

Reasons That Workloads May Fail to Migrate

A workload is any process that is running on a node. Common workloads include Virtual Machines (VMs), NAS Services, Networking, and Tenant Nodes.

The main reasons a workload fails to migrate from one node to another in the system are:

  • Insufficient available resources: There may not be enough resources (such as RAM) on the target node to run the workload you're trying to migrate. Check the amount of RAM consumed by the workload (VM or Tenant node), then review the resources available on the target node.

  • Pinned VM configuration: A VM may be pinned to a specific node. Review the VM’s settings and check the CPU Type setting. If the CPU Type is set to Host Processor, the VM will be unable to migrate. In this case, the VM must be powered off before it can be migrated successfully.

  • Tenant node migration issues: Tenant nodes may also face migration issues for the same reasons as listed above. Log into the Tenant User Interface, and check the following:

      • Verify that each Tenant node has sufficient available resources to host the migrating tenant workloads.
      • Verify that each Tenant VM is not configured with the CPU Type set to Host Processor.
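
The checks above can be expressed as a simple pre-migration test. This is an illustrative sketch under stated assumptions: the Workload and Node classes, their field names, and the figures you would feed them are hypothetical, and the values would have to be read manually from the VM settings and node dashboards, as this article does not describe an API for them.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    ram_mb: int
    cpu_type: str = "default"  # hypothetical label; "Host Processor" pins the VM


@dataclass
class Node:
    name: str
    free_ram_mb: int


def migration_blockers(workload, target):
    """List the reasons from this article that would stop a migration."""
    blockers = []
    if workload.ram_mb > target.free_ram_mb:
        blockers.append(
            f"insufficient RAM on {target.name}: "
            f"needs {workload.ram_mb} MB, {target.free_ram_mb} MB free"
        )
    if workload.cpu_type == "Host Processor":
        blockers.append("CPU Type is Host Processor: power off the VM first")
    return blockers
```

An empty list means neither of the two documented causes applies. It does not guarantee that the migration will succeed.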

Document Information

  • Last Updated: 2024-08-29
  • VergeOS Version: 4.12.6

Proper Power Sequence

Proper Shutdown Sequence for a VergeOS Environment

To power off a cluster (a collection of two or more nodes), follow these steps:

  1. Check any running workloads on each node of the cluster. Navigate to the node dashboard for each node and review the Running Machines section.
  2. If there are tenants running on any of the nodes, log into those tenant environments and gracefully shut down all running workloads.
  3. Power off all running workloads on each node, including VMs, tenant nodes, VMware backup services, and NAS services (if applicable).

vNet Containers

There is no need to manually stop any running vNet containers; they will be gracefully stopped automatically in the subsequent steps.

  4. After stopping all running workloads, navigate to the Cluster dashboard for the cluster you wish to power off.
  5. Select Power Off from the left-hand menu to begin shutting down each node in the cluster.
  6. Finally, navigate to System -> Clusters and select Power Off in the left menu to power off the entire cluster.

IMPORTANT

If an environment contains multiple clusters, ALWAYS shut down the cluster containing the controller nodes (Node1 & Node2) LAST.
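
For runbooks that cover several clusters, the ordering rule above can be made explicit. A minimal sketch, assuming clusters are identified by plain name strings; the function name and the example cluster names are hypothetical.

```python
def shutdown_order(clusters, controller_cluster):
    """Order clusters for power-off with the controller cluster last."""
    if controller_cluster not in clusters:
        raise ValueError(f"unknown controller cluster: {controller_cluster}")
    others = [name for name in clusters if name != controller_cluster]
    return others + [controller_cluster]


# Hypothetical environment with three clusters; "core" holds Node1 and Node2.
order = shutdown_order(["core", "compute-a", "storage-b"], "core")
```

The non-controller clusters keep their original relative order, since the article only constrains which cluster goes last.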


Proper Power On Sequence for a VergeOS Environment

To properly power on a VergeOS environment, perform the following steps:

  1. Power on Node1.
  2. Once Node1 is online, power on Node2.
  3. Power on all other nodes, waiting approximately 1 minute between power actions.
  4. On the main dashboard, verify that the environment is Green and Online.
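
The power-on steps above can be sketched as a small driver. Everything environment-specific is an assumption here: the power_on and wait_online callbacks are hypothetical hooks you would supply (for example, out-of-band management calls), the node names are placeholders, and the sketch interprets "between power actions" as also pausing between Node2 and the first remaining node.

```python
import time

CONTROLLERS = ["node1", "node2"]  # placeholder names for Node1 and Node2


def power_on_order(nodes):
    """Node1, then Node2, then every other node in the order given."""
    missing = [name for name in CONTROLLERS if name not in nodes]
    if missing:
        raise ValueError(f"controller nodes missing: {missing}")
    return CONTROLLERS + [name for name in nodes if name not in CONTROLLERS]


def power_on_all(nodes, power_on, wait_online, sleep=time.sleep, pause=60):
    """Power on nodes in order, pausing about a minute between later nodes."""
    order = power_on_order(nodes)
    power_on(order[0])
    wait_online(order[0])  # Node2 starts only once Node1 is online
    power_on(order[1])
    for name in order[2:]:
        sleep(pause)
        power_on(name)
```

Step 4, confirming on the main dashboard that the environment is Green and Online, remains a manual check after the driver finishes.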



Document Information

  • Last Updated: 2024-08-29
  • VergeOS Version: 4.12.6
