Storage

vSAN Tier Status (Journal Walks)

Overview

This page is designed to help you understand VergeFS status metrics provided on the vSAN Tier Dashboard. These metrics provide insight related to Journal Walks, the processes that continually monitor and support vSAN data integrity.

Monitoring vSAN tier status information covered on this page is typically unnecessary during normal operation (general vSAN health and activity can be monitored on the Main Dashboard). The following details are intended for troubleshooting or for users interested in viewing Journal Walk activity specifics. This dashboard is most useful when investigating an issue or tracking the progress of a Journal Walk, such as during an update process.

Journal Walks

VergeFS employs a process called Journal Walks (also referred to as "Walks") to continually verify storage fidelity and safeguard against risks like hardware failures, silent bitrot, power disruptions, and misleading device write confirmations. Walks are triggered automatically and scan each node to verify that it holds its expected data blocks. When data blocks are missing, which may result from device issues, planned node reboots, or environmental disruptions, VergeFS proactively performs repairs to restore consistency.

Journal Walks operate as a background process; system operations proceed normally while a Journal Walk is in progress.

The system executes three types of Journal Walks:

  • Partial (differential) Walk - targets only data changed since the last walk transaction, for quicker validation
  • Full Walk - scans all data across all nodes
  • Mixed Walk - occurs when a non-controller node reboots; only that node is fully scanned, while other nodes are differentially scanned.
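
The three walk types differ only in which nodes receive a full scan. The following is a minimal illustrative sketch of that distinction, not VergeFS code; the names `WalkType` and `scan_plan` and the node names are hypothetical.

```python
from enum import Enum


class WalkType(Enum):
    PARTIAL = "partial"  # differential: only data changed since the last walk transaction
    FULL = "full"        # all data across all nodes
    MIXED = "mixed"      # the rebooted node is fully scanned, the rest differentially


def scan_plan(walk_type, nodes, rebooted_node=None):
    """Map each node name to 'full' or 'differential' for the given walk type."""
    if walk_type is WalkType.FULL:
        return {node: "full" for node in nodes}
    if walk_type is WalkType.PARTIAL:
        return {node: "differential" for node in nodes}
    # Mixed walk: only the rebooted (non-controller) node gets the full scan.
    if rebooted_node not in nodes:
        raise ValueError("a mixed walk needs the rebooted node to be one of the nodes")
    return {node: "full" if node == rebooted_node else "differential" for node in nodes}
```

For example, `scan_plan(WalkType.MIXED, ["node1", "node2", "node3"], "node3")` marks only `node3` for a full scan.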

Accessing vSAN Tier Status Information

Navigate to: Main Dashboard > vSAN Tiers > double-click the desired tier. This displays the dashboard for the selected vSAN tier. Refer to the Status tile on this page.

Status Data

  • Redundant: (checkbox) Reflects whether the vSAN tier is currently verified as redundant. If unchecked, maintenance mode will be disabled to prevent disruption. The box may appear unchecked during a full Journal Walk until redundancy is confirmed. It also remains unchecked if redundancy cannot be verified, such as when a node is offline after the Journal Walk completes.

  • Encrypted: (checkbox) Shows whether data in the vSAN tier is encrypted. Encryption status is set during installation and remains fixed; this setting cannot be modified after deployment.

  • Working: (checkbox) Indicates that a Journal Walk is actively running for this tier. If no snapshots or data changes are occurring, walks may complete too quickly to register as “working” in the UI.

  • Full Walk: (checkbox) Flags whether a full Journal Walk is in progress. Full walks are triggered by events such as controller startup or topology changes (e.g., node offline or added, drive failure, etc.).

When a node other than the active controller reboots, a Mixed Walk is triggered instead.

  • Walk Progress: Displays the current Journal Walk’s progress as a percentage, or shows “Idle” if no walk is active.

  • Last Walk Time (ms): Duration in milliseconds of the most recent Journal Walk.

  • Last Full Walk Time (ms): Duration in milliseconds of the most recent Full Journal Walk.

  • Current Transaction: A unique ID representing the latest transaction. This value increments with each Journal Walk, whether full, mixed, or differential.

  • Transaction Start Time: Timestamp indicating when the current or most recent Journal Walk began. Useful for diagnosing prolonged or stalled operations (see Journal Walk Duration below).

  • Repairs: Displays the current count of missing data blocks detected on the tier. It’s normal to see a non-zero value after events such as node failures, maintenance operations, or updates. VergeFS Journal Walks automatically identify and work to correct these detected blocks using redundant data stored on other nodes. If redundancy fails (e.g. double node failure), the system will try to retrieve blocks from a configured repair server. Persistent repair counts (i.e. after several transaction increments) may indicate manual resolution is needed, and contacting VergeIO Support is recommended in such cases.

If missing data blocks have already been detected and a repair server isn’t yet configured, it’s not too late. Setting up a repair server now allows VergeFS to automatically attempt recovery of those blocks during subsequent Journal Walks.

  • Bad Drives: Indicates the number of drives missing since the current Journal Walk began. It’s common to see a non-zero value here after node reboots, maintenance, or updates; this doesn’t automatically signal a drive failure. Missing drives are typically related to offline nodes or detection delays at walk start. If no nodes are offline and this field shows a count, review drive and node status via the Main Dashboard for further insight.
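
The guidance on persistent repair counts can be expressed as a simple check. This sketch assumes you have noted the Current Transaction and Repairs values from the Status tile at several points in time. The function name and the threshold of three increments are assumptions (the text above says only "several"), and the sketch does not query VergeOS itself.

```python
def repairs_persist(samples, min_increments=3):
    """Return True if Repairs stayed non-zero while Current Transaction
    advanced by at least `min_increments`.

    samples: list of (current_transaction, repairs) tuples, oldest first.
    """
    first_nonzero_txn = None
    for txn, repairs in samples:
        if repairs == 0:
            first_nonzero_txn = None   # repairs cleared; start over
        elif first_nonzero_txn is None:
            first_nonzero_txn = txn    # first reading of this non-zero run
        elif txn - first_nonzero_txn >= min_increments:
            return True
    return False
```

A `True` result corresponds to the case where contacting VergeIO Support is recommended. A count that drops to zero between readings resets the check.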

Journal Walk Duration

Walk durations vary. Factors that affect them include:

  • Use of NVMe Tier 0 for metadata
  • Available memory on controller nodes
  • Quantity of data on the tier
  • Amount of data changes since the last transaction

Walk Time Considerations

  • Updates involve full and mixed walks, so the duration of these walks affects the length of the maintenance window required.
  • The time needed to complete large deletions and tier migrations (moving data from one tier to another) depends on differential walk times.
  • Systems that follow published sizing and design recommendations should experience acceptable walk durations. For example, walks triggered during update operations generally fit within standard maintenance windows.
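
To relate the millisecond values on the Status tile to a maintenance window, a rough estimate like the one below can help. It is a sketch under stated assumptions: the function names and the `headroom` safety factor are hypothetical, and an update may involve more than one walk, so treat the result as a starting point rather than a guarantee.

```python
from datetime import timedelta


def walk_duration(ms):
    """Convert a walk time in milliseconds (as shown on the Status tile) to a timedelta."""
    return timedelta(milliseconds=ms)


def fits_window(last_full_walk_ms, window, headroom=2.0):
    """True if the last full walk, multiplied by a safety factor, fits in the window."""
    return walk_duration(last_full_walk_ms) * headroom <= window
```

For example, a Last Full Walk Time of 5400000 ms is 1 hour 30 minutes. With the assumed headroom of 2.0, `fits_window(5_400_000, timedelta(hours=4))` returns `True`, while a 2-hour window returns `False`.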

Walk Time Optimization

Walk times depend on the tier size and rate of data change. Adequate resources and proper network design significantly impact walk performance.

Tips to Optimize Journal Walk Times

  • Follow recommended Node Sizing Requirements (e.g. dedicated tier 0 using NVMe drives, right-sizing controller memory for your environment).
  • Implement Network Design recommendations (e.g. internode bandwidth of at least 10Gb; isolated, dedicated core networks).
  • Avoid overprovisioning workload RAM on compute-and-storage (HCI) nodes.
  • When possible, perform operations that trigger Full or Mixed Walks during scheduled maintenance windows, and avoid concurrent heavy I/O.

If you have questions or concerns about the timeframe of walk transactions, please contact our support team for assistance.

Adding Tier 0 to an Existing System

Overview

Key Points

  • Tier 0 is normally configured during initial installation
  • This procedure is for special cases requiring post-installation configuration
  • Requires careful attention to device paths and hardware compatibility

This guide outlines the process for adding Tier 0 storage to an existing VergeOS system. While Tier 0 is typically configured during installation, these steps provide a method for adding it to production systems that cannot be reinstalled.

Critical Warning

  • This procedure should only be performed by qualified VergeOS engineers or under direct support guidance
  • Selected devices will be formatted and all existing data will be destroyed
  • Incorrect device path selection can seriously damage your system

Prerequisites

Before beginning this procedure, ensure:

  • Storage devices are physically installed in the system
  • Tier 0 devices are consistent across controller nodes
  • Hardware meets specifications from the Node Sizing Guide

Steps

1. Identify Device Paths

  1. Navigate to System > vSAN Diagnostics from the Main Dashboard
  2. Select Get Node Device List from the Query dropdown
  3. Click Send
  4. Identify unused devices (marked as "vsan = false")
  5. Note the device paths (/dev/sd*) for each controller node

Tip

Verify current vSAN drive assignments by checking vSAN Tiers > [select tier] > Drives to avoid selecting drives already in use.
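
If you copy the device list into a script for review, unused devices can be filtered as sketched below. This guide does not show the actual output format of Get Node Device List, so the record layout here (dictionaries with `path` and `vsan` keys) is purely an assumption for illustration. Adapt it to what your system returns.

```python
def unused_device_paths(devices):
    """Return the sorted paths of devices not currently assigned to the vSAN.

    devices: list of dicts with hypothetical keys 'path' and 'vsan'.
    """
    return sorted(d["path"] for d in devices if d.get("vsan") is False)


# Hypothetical records for one controller node.
node0_devices = [
    {"path": "/dev/sda", "vsan": True},
    {"path": "/dev/sdc", "vsan": False},
    {"path": "/dev/sdb", "vsan": False},
]
candidates = unused_device_paths(node0_devices)  # ['/dev/sdb', '/dev/sdc']
```

Running the same filter for each controller node makes it easier to confirm that the candidate Tier 0 devices are consistent across nodes.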

2. Add Drives to Tier 0

For each drive:

  1. In vSAN Diagnostics:
    • Set Query to Add Drive to vSAN
    • Select the appropriate Node (node0 or node1)
    • Enter the correct Path for the device
    • Set Tier to Tier 0
    • Configure Swap setting

Swap Configuration

  • Enable swap on only ONE storage tier
  • If swap is enabled on another tier, disable it for Tier 0
  • Contact VergeOS Support for guidance on swap configuration if needed

  2. Enter the verification phrase: Yes I know what I'm doing
  3. Click Send to execute
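
Because this operation is destructive, it can help to sanity-check the values before typing them into vSAN Diagnostics. The helper below is a hypothetical pre-flight check, not part of VergeOS. It encodes only the rules stated above (swap on one tier only, exact verification phrase) plus an assumed check that the path begins with `/dev/`.

```python
VERIFICATION_PHRASE = "Yes I know what I'm doing"


def check_add_drive_request(path, tier, swap, phrase, tiers_with_swap):
    """Return a list of problems found; an empty list means the inputs look consistent.

    tiers_with_swap: tier numbers that already have swap enabled.
    """
    problems = []
    if not path.startswith("/dev/"):
        problems.append(f"{path!r} does not look like a device path")
    if swap and any(t != tier for t in tiers_with_swap):
        problems.append("swap is already enabled on another tier; disable it for this drive")
    if phrase != VERIFICATION_PHRASE:
        problems.append("verification phrase does not match exactly")
    return problems
```

An empty result does not confirm that the path points at the intended device. That still has to be verified against the device list from step 1.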

3. Verify Configuration

  1. Monitor the system dashboard for tier status:
    • Status will show "online-no redundancy" during meta migration
  2. Refresh node information:
    • Navigate to each controller node's dashboard
    • Select Refresh > Drives & NICs

Post-Configuration

Monitor the vSAN tier status in the system dashboard. The tier should transition from "online-no redundancy" to "online" once meta migration completes.
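
If you prefer to script the wait rather than watch the dashboard, the transition can be polled as sketched here. The `get_status` callable is a placeholder you would supply (this guide does not describe an API for reading tier status); only the two status strings come from the text above. The timeout and interval defaults are arbitrary assumptions.

```python
import time


def wait_for_online(get_status, timeout_s=3600, interval_s=30, sleep=time.sleep):
    """Poll get_status() until it returns 'online'.

    Returns True on success and False if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "online":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` parameter exists so the loop can be exercised in a test without real delays.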


Document Information

  • Last Updated: 2024-11-25
  • VergeOS Version: 4.13

Setting Up Storware on VergeOS

This guide outlines the steps for configuring Storware on VergeOS to protect your virtual machines.

For more comprehensive information on Storware's capabilities and additional backup configuration options, visit the Storware Backup and Recovery Documentation.

Prerequisites

  • VergeOS on version 4.13 or higher.
  • Access to a Storware Backup and Recovery instance on version 7 or higher.
  • Credentials for an account with the appropriate permissions to configure both VergeOS and Storware.

Set Up a Dedicated Verge NAS Service for Storware

  1. Deploy the NAS Service:
  2. Configure NFS Settings:
    • Before powering on the NAS service, click on Edit NFS Settings.
    • Enable NFSv4 by selecting the checkbox for this option.
    • Click Submit to save the changes.
    • Power on the NAS service.

Depending on the size of your environment, you may want to increase the CPU and RAM allocated to the NAS Service. Storware recommends 8 cores and 12 GB of RAM as a good starting point.


Adding Your VergeOS System to Storware

  1. Log in to Storware:
    • Access the Storware Backup and Recovery management console.
  2. Add VergeOS as a Virtual Environment:
    • Navigate to Virtual Environments > Virtualization Providers and click Create.
    • Select VergeOS as the Virtualization Provider.
  3. Configure the Connection Details:
    • General Tab:
      • URL: Enter the VergeOS URL in the format https://<VERGE_IP>.
      • Username: Provide the username for VergeOS.
      • Password: Enter the password for the specified user.
    • Verge Settings Tab:
      • Enter the name of the NAS service created in the previous step.
  4. Test the Connection:
    • Select the newly added Verge system from the list.
    • Click Test Connectivity to verify that Storware can successfully communicate with the VergeOS environment.
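
A malformed URL is an easy cause of a failed connectivity test. The small helper below builds the value for the URL field in the https://<VERGE_IP> format. It is an illustrative sketch with a hypothetical function name. It accepts only IP addresses, which is an assumption based on the format shown above, and the bracketed IPv6 form follows general URL syntax rather than anything stated in this guide.

```python
import ipaddress


def verge_url(host_ip):
    """Build the https://<VERGE_IP> URL for the General tab.

    Raises ValueError if host_ip is not a valid IP address.
    """
    ip = ipaddress.ip_address(host_ip.strip())
    host = f"[{ip}]" if ip.version == 6 else str(ip)  # IPv6 literals need brackets in URLs
    return f"https://{host}"
```

For example, `verge_url("192.0.2.10")` returns `https://192.0.2.10`, while passing a value that already includes `https://` raises `ValueError`.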

Important Notes

NFS Version Selection

Enabling NFSv4 on VergeOS ensures compatibility with modern backup solutions like Storware, providing improved security and performance.

Snapshot Optimization

Using Storware's snapshot management in conjunction with VergeOS’s built-in vSAN capabilities allows for efficient incremental backups, reducing the time and storage required for VM protection.


Need Help?

If you have any questions or encounter issues while setting up Storware on VergeOS, please reach out to our support team for assistance.


Document Information

  • Last Updated: 2024-11-07
  • VergeOS Version: 4.13