VergeOS vSAN Block-Level Architecture and Data Distribution#
Overview#
VergeOS vSAN is built on a block-level architecture: VM disks are divided into blocks, each block is identified by a cryptographic hash, and blocks are distributed across nodes together with redundant copies. This design underpins data distribution, high availability, and performance across the storage infrastructure.
Related Documentation#
- Scale Out Guide - Detailed instructions for adding nodes to expand capacity
- Scaling Up a vSAN - Guide for increasing resources on existing nodes
Block-Level Operations#
Data Block Management#
- Block Creation:
- VM disks are divided into multiple blocks
- Each block is assigned a unique cryptographic hash
- Block size is optimized for performance and efficiency
- Metadata tracks block relationships and locations
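The block creation steps above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and a 64 KiB block size; both are placeholders chosen for illustration, since this page does not state the actual hash function or block size used by the vSAN. The function name `split_into_blocks` is hypothetical.

```python
import hashlib

# Hypothetical block size; the real vSAN block size is not stated on this page.
BLOCK_SIZE = 64 * 1024


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[tuple[str, bytes]]:
    """Split a byte string into fixed-size blocks and hash each one.

    SHA-256 is an assumption standing in for the vSAN's cryptographic hash.
    """
    blocks = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        blocks.append((hashlib.sha256(chunk).hexdigest(), chunk))
    return blocks


# A toy "disk" with three blocks, two of which have identical content.
disk_image = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
blocks = split_into_blocks(disk_image)
print(len(blocks))                    # 3
print(blocks[0][0] == blocks[2][0])   # True: identical content, identical hash
```

Because identical content always produces the same hash, the hash can double as the key for deduplication and location lookup.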
Hash-Based Distribution#
- Block Identification:
- Each data block receives a cryptographic hash value
- Hash serves as a unique identifier for the block
- Used for both location mapping and deduplication
- Distribution Algorithm:
- Blocks are distributed based on hash values
- Ensures even distribution across available nodes
- Prevents hot spots in the storage system
- Facilitates efficient data retrieval
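This page does not specify the distribution algorithm. As one way to picture hash-based placement, the sketch below uses rendezvous (highest-random-weight) hashing, a well-known technique swapped in here for illustration only. The node names and the `pick_node` helper are hypothetical.

```python
import hashlib
from collections import Counter


def pick_node(block_hash: str, nodes: list[str]) -> str:
    """Pick the node with the highest score for this block (rendezvous hashing).

    This is an illustrative stand-in, not the documented vSAN algorithm.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{block_hash}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=score)


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical node names
hashes = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(10_000)]
counts = Counter(pick_node(h, nodes) for h in hashes)
print(counts)  # each node should hold roughly a quarter of the blocks
```

Since placement depends only on the hash and the node list, any node can compute where a block lives without consulting a central coordinator, and the uniform spread of hash values keeps any one node from becoming a hot spot.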
Data Distribution Architecture#
Primary Storage#
- Block Placement:
- Primary copy of each block stored on optimal node
- Placement determined by hash-based algorithm
- Considers storage tier requirements
- Optimizes for performance and capacity
- Access Patterns:
- Reads prioritize single-copy access for efficiency
- System defaults to reading from primary copy
- Automatically reads from the redundant copy if the primary is slow or unresponsive
- Reads from the redundant copy instead when that copy is on the same node as the requester
- Write operations always update both primary and redundant copies
- Automatic redistribution as needed
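The read rules above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name, the node names, and the `responsive` set (standing in for whatever health signal the vSAN uses) are all hypothetical.

```python
def choose_read_source(requesting_node: str, primary_node: str,
                       redundant_node: str, responsive: set[str]) -> str:
    """Return the node to read a block from.

    `responsive` is a hypothetical set of nodes currently answering requests.
    """
    # A redundant copy on the requesting node avoids a network hop.
    if redundant_node == requesting_node and redundant_node in responsive:
        return redundant_node
    # Default: read from the primary copy.
    if primary_node in responsive:
        return primary_node
    # Primary is slow or unresponsive: fall back to the redundant copy.
    if redundant_node in responsive:
        return redundant_node
    raise RuntimeError("no responsive copy of the block")


all_up = {"n1", "n2", "n3"}
print(choose_read_source("n3", "n1", "n2", all_up))        # n1 (primary)
print(choose_read_source("n2", "n1", "n2", all_up))        # n2 (local redundant copy)
print(choose_read_source("n3", "n1", "n2", {"n2", "n3"}))  # n2 (failover)
```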
Data Access#
- Read Operations:
- Quick block location lookup via hash
- Intelligent source selection:
- Prioritizes primary copy
- Uses local redundant copy when on same node
- Fails over to redundant copy if primary is unresponsive
- Optimized for minimal network traffic
- Performance optimization through locality awareness
- Write Operations:
- New block hash generation
- Simultaneous update of primary and redundant copies
- Guaranteed write consistency across copies
- Metadata updates
- Consistency maintenance
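The write steps can be sketched as follows. The `Node` class is a hypothetical in-memory stand-in for a storage node, SHA-256 is an assumed hash, and the two copies are written one after the other for simplicity, whereas the text above describes them as updated simultaneously.

```python
import hashlib


class Node:
    """Hypothetical in-memory stand-in for a storage node."""

    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[str, bytes] = {}

    def store(self, block_hash: str, data: bytes) -> None:
        self.blocks[block_hash] = data


def write_block(data: bytes, primary: Node, redundant: Node,
                metadata: dict[str, tuple[str, str]]) -> str:
    """Hash the block, store both copies, then record where they live."""
    block_hash = hashlib.sha256(data).hexdigest()  # assumed hash function
    primary.store(block_hash, data)
    redundant.store(block_hash, data)
    # Metadata is updated only after both copies are in place, so a lookup
    # never points at a copy that was not written.
    metadata[block_hash] = (primary.name, redundant.name)
    return block_hash


n1, n2, meta = Node("n1"), Node("n2"), {}
h = write_block(b"example data", n1, n2, meta)
print(n1.blocks[h] == n2.blocks[h])  # True: both copies hold the same data
print(meta[h])                       # ('n1', 'n2')
```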
Redundant Storage#
- Redundancy Management:
- Secondary copies maintained for data protection
- Distribution across different nodes
- Automatic synchronization of copies
- Configurable redundancy levels
- Failover Handling:
- Automatic failover to redundant copies
- Transparent to applications and VMs
- Immediate availability during node failures
- Self-healing capabilities
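To make the self-healing idea concrete, the sketch below plans which blocks need a new copy after a node fails. It is an illustration only: the function name, the location map layout, and the rule for picking a target node (first candidate in sorted order) are assumptions, not the vSAN's documented behavior.

```python
def plan_repairs(locations: dict[str, set[str]], failed: str,
                 healthy: list[str], copies: int = 2) -> dict[str, str]:
    """Return {block_hash: target_node} for blocks that lost a copy.

    `locations` maps each block hash to the nodes holding a copy.
    `healthy` lists the nodes that are still available.
    """
    plan = {}
    for block_hash, nodes in locations.items():
        surviving = nodes - {failed}
        if surviving and len(surviving) < copies:
            # A new copy must land on a node that does not already hold one.
            candidates = sorted(n for n in healthy if n not in surviving)
            if candidates:
                plan[block_hash] = candidates[0]
    return plan


locations = {"hash-a": {"n1", "n2"}, "hash-b": {"n2", "n3"}}
print(plan_repairs(locations, failed="n1", healthy=["n2", "n3"]))
# {'hash-a': 'n3'}
```

Reads continue from the surviving copy while the repair runs, which is why the failure stays transparent to VMs.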
Hash Map Functionality#
Core Components#
- Hash Map Structure:
- Maps block hashes to physical locations
- Maintains block metadata
- Tracks redundant copies
- Handles version control
- Location Tracking:
- Real-time block location updates
- Efficient lookup mechanisms
- Optimized for large-scale systems
- Supports dynamic redistribution
Cross-Node Distribution#
Distribution Mechanics#
- Node Management:
- Dynamic node addition and removal
- Automatic rebalancing
- Workload distribution
- Resource optimization
- Data Flow:
- Inter-node communication protocols
- Efficient data transfer
- Bandwidth optimization
- Latency management
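The sketch below illustrates why hash-based placement keeps rebalancing cheap when a node is added. It assumes rendezvous hashing as a stand-in for the undocumented placement algorithm; `pick_node`, `plan_rebalance`, and the node names are hypothetical.

```python
import hashlib


def pick_node(block_hash: str, nodes: list[str]) -> str:
    """Rendezvous hashing: the node with the highest score owns the block."""
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{block_hash}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=score)


def plan_rebalance(block_hashes: list[str], old_nodes: list[str],
                   new_nodes: list[str]) -> dict[str, tuple[str, str]]:
    """Return {block_hash: (old_node, new_node)} for blocks whose placement changes."""
    moves = {}
    for h in block_hashes:
        before, after = pick_node(h, old_nodes), pick_node(h, new_nodes)
        if before != after:
            moves[h] = (before, after)
    return moves


old_nodes = ["node1", "node2", "node3", "node4"]
new_nodes = old_nodes + ["node5"]
hashes = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(10_000)]
moves = plan_rebalance(hashes, old_nodes, new_nodes)
print(f"{len(moves) / len(hashes):.0%} of blocks move")  # expected to be near 20%
```

With this scheme only the blocks that the new node wins are transferred, and all of them move to the new node, so existing nodes do not shuffle data among themselves.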
Performance Optimization#
Data Access Optimization#
- Caching:
- Block-level cache management
- Frequently accessed data optimization
- Cache coherency maintenance
- Performance acceleration
- I/O Path:
- Optimized read/write paths
- Minimal hop routing
- Direct block access
- Reduced latency
Efficiency Features#
- Deduplication:
- Block-level deduplication
- Hash-based identification
- Space efficiency
- Performance impact management
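Hash-based deduplication can be sketched as a store keyed by block hash with a reference count per block. The `DedupStore` class is hypothetical and SHA-256 is an assumed hash function.

```python
import hashlib


class DedupStore:
    """Stores each unique block once and counts how many times it is referenced."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}
        self.refcounts: dict[str, int] = {}

    def put(self, data: bytes) -> str:
        block_hash = hashlib.sha256(data).hexdigest()  # assumed hash function
        if block_hash not in self.blocks:
            self.blocks[block_hash] = data  # first time this content is seen
        self.refcounts[block_hash] = self.refcounts.get(block_hash, 0) + 1
        return block_hash


store = DedupStore()
for payload in (b"os image block", b"os image block", b"user data block"):
    store.put(payload)
print(len(store.blocks))  # 2 unique blocks stored for 3 writes
```

The hash is computed on the write path anyway for placement, so detecting a duplicate costs one extra lookup rather than a byte-by-byte comparison.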
Compression
VergeOS vSAN does not perform inline compression on stored data. Compression is only applied when syncing data between sites over the network to optimize bandwidth usage.
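The distinction can be sketched as follows: blocks are kept uncompressed at rest, and compression wraps only the payload sent to another site. `zlib` is used purely as a stand-in, since this page does not name the compression codec, and the function names are hypothetical.

```python
import zlib


def prepare_for_site_sync(block: bytes) -> bytes:
    """Compress a block for transfer to a remote site (zlib is an assumed codec)."""
    return zlib.compress(block)


def receive_from_site_sync(payload: bytes) -> bytes:
    """Decompress on arrival; the block is then stored uncompressed."""
    return zlib.decompress(payload)


block = b"A" * 65536  # stored as-is on the local vSAN
payload = prepare_for_site_sync(block)
print(len(payload) < len(block))                 # True: less data on the wire
print(receive_from_site_sync(payload) == block)  # True: content is unchanged
```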
System Resilience#
Fault Tolerance#
- Node Failures:
- Automatic failure detection
- Immediate failover
- Data accessibility maintenance
- Recovery initiation
- Network Issues:
- Path redundancy
- Alternative route selection
- Communication reliability
- Performance maintenance
Data Integrity#
- Block Validation:
- Continuous integrity checking
- Hash validation
- Corruption detection
- Automatic repair initiation
- Consistency Maintenance:
- Transaction consistency
- Data coherency
- Version control
- Synchronization management
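Hash validation and repair can be sketched like this: recompute each copy's hash, compare it with the block's identifier, and overwrite a corrupted copy from the good one. The function name is hypothetical and SHA-256 is an assumed hash function.

```python
import hashlib


def verify_and_repair(block_hash: str, primary_copy: bytes,
                      redundant_copy: bytes) -> tuple[bytes, bytes]:
    """Return (primary, redundant) after repairing a copy that fails validation."""
    def is_valid(data: bytes) -> bool:
        return hashlib.sha256(data).hexdigest() == block_hash

    primary_ok, redundant_ok = is_valid(primary_copy), is_valid(redundant_copy)
    if primary_ok and redundant_ok:
        return primary_copy, redundant_copy
    if primary_ok:
        return primary_copy, primary_copy      # repair the redundant copy
    if redundant_ok:
        return redundant_copy, redundant_copy  # repair the primary copy
    raise ValueError("both copies failed hash validation")


good = b"block contents"
expected_hash = hashlib.sha256(good).hexdigest()
primary, redundant = verify_and_repair(expected_hash, b"bit-rotted data", good)
print(primary == good and redundant == good)  # True: primary restored from redundant
```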
Scaling Considerations#
Horizontal Scaling (Scaling Out)#
- Node Addition:
- Seamless integration of new nodes
- Requires a minimum of two nodes per cluster for redundancy
- New nodes must match existing cluster configuration:
- Processor type
- Memory configuration
- Physical disk drive configuration
- Maintains N+1 redundancy for high availability
- Automatic data redistribution
- Performance optimization
- Capacity expansion
- Cluster Expansion:
- Linear scalability
- Option to create a new cluster if matching nodes are unavailable
- Each new cluster requires a minimum of two matching nodes
- Resource optimization
- Performance maintenance
- Balanced distribution
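The node-matching requirement can be pictured as a pre-flight check before scale-out. This is a sketch only: `NodeSpec`, `can_join`, and the example hardware values are hypothetical, and strict equality is an assumption; the Scale Out Guide is the authority on what counts as a matching configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    processor: str          # processor type
    memory_gb: int          # memory configuration
    disks: tuple[str, ...]  # physical disk drive configuration


def can_join(cluster: list[NodeSpec], candidate: NodeSpec) -> bool:
    """A candidate may join only if it matches every existing node (assumed rule)."""
    return all(candidate == member for member in cluster)


# Hypothetical hardware values, for illustration only.
existing = NodeSpec("example-cpu", 256, ("nvme-1.92tb",) * 4)
cluster = [existing, existing]
matching = NodeSpec("example-cpu", 256, ("nvme-1.92tb",) * 4)
different = NodeSpec("example-cpu", 512, ("nvme-1.92tb",) * 4)

print(can_join(cluster, matching))   # True
print(can_join(cluster, different))  # False: place it in a new cluster instead
```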
Vertical Scaling (Scaling Up)#
- Resource Enhancement:
- Storage capacity increase:
- Requires equal drive additions across all cluster nodes
- Maintains balanced storage distribution
- Memory expansion:
- Requires maintenance mode before power off
- Ensures graceful workload migration
- Performance improvement
- Capability expansion
- Efficiency optimization
Important
Consult with our support team to determine the optimal expansion strategy for your specific environment.