Node Diagnostics Guide#
Overview#
The Node Diagnostics tool provides comprehensive hardware and system-level troubleshooting capabilities for individual VergeOS nodes. These diagnostic commands enable system administrators to perform detailed hardware analysis, monitor system performance, and troubleshoot node-specific issues directly from the VergeOS interface.
What You'll Learn
- How to access and use node diagnostic tools
- Understanding of each diagnostic command and its hardware focus
- Best practices for node-level troubleshooting
- When to use specific diagnostic commands for hardware issues
Prerequisites#
- Access to VergeOS interface with node management privileges
- Basic understanding of server hardware components
- Knowledge of networking and storage concepts
Node Access
Node diagnostics provide deep hardware-level access and should be used by experienced system administrators.
Accessing Node Diagnostics#
-
Navigate to Node Diagnostics: - From the Main Dashboard, click Nodes in the left menu - Select the desired node from the list - Click Diagnostics in the left menu
-
Using Diagnostic Commands: - Select desired command from the Query dropdown menu - Configure available parameters on the right side - Click Send → to execute the command
Command Visibility
Enable "Show Command" to view the exact command being executed, useful for SSH execution or script automation.
Diagnostic Commands Reference#
ARP Scan#
Purpose: Discovers active devices on the node's network interfaces using ARP packets.
When to Use:
- Network connectivity verification from node perspective
- Identifying devices on node's local network segments
- Troubleshooting node network configuration
Parameters:
- Interface: Network interface to scan from
CLI Syntax:
ARP Table#
Purpose: Displays the node's ARP cache showing IP-to-MAC address mappings.
When to Use:
- Troubleshooting node network connectivity
- Verifying network neighbor discovery
- Checking for ARP conflicts affecting the node
CLI Syntax:
Bridge Addresses#
Purpose: Shows MAC address tables for network bridges on the node.
When to Use:
- Troubleshooting network bridge configuration
- Verifying MAC learning on virtual switches
- Network topology analysis
CLI Syntax:
Clear Persistent Storage#
Purpose: Clears persistent storage caches and temporary data on the node.
When to Use:
- Resolving storage-related performance issues
- Clearing corrupted cache data
- Preparing for storage reconfiguration
Data Impact Warning
This command may affect system performance and should only be used under support guidance.
CLI Syntax:
DMI Table#
Purpose: Displays Desktop Management Interface (DMI) information including hardware details.
When to Use:
- Hardware inventory and identification
- Verifying system specifications
- Troubleshooting hardware compatibility
CLI Syntax:
DNS Lookup#
Purpose: Tests DNS resolution from the node's perspective.
When to Use:
- Troubleshooting node DNS configuration
- Verifying external connectivity from node
- Testing name resolution services
Parameters:
- Hostname: Target hostname to resolve
- DNS Server: Specific DNS server to query
CLI Syntax:
Ethernet Tool#
Purpose: Provides detailed ethernet interface information and configuration.
When to Use:
- Diagnosing network interface issues
- Checking link speed and duplex settings
- Troubleshooting physical network connectivity
Parameters:
- Interface: Network interface to examine
CLI Syntax:
Fabric Configuration#
Purpose: Displays VergeOS fabric network configuration for the node.
When to Use:
- Troubleshooting inter-node communication
- Verifying fabric network setup
- Diagnosing cluster connectivity issues
CLI Syntax:
IP#
Purpose: Advanced IP command access for interface and routing diagnostics.
When to Use:
- Detailed network interface analysis
- Routing table examination
- Advanced network troubleshooting
Parameters:
- Command: IP command options (route, addr, link, etc.)
CLI Syntax:
IPMI BMC Info#
Purpose: Displays Baseboard Management Controller information.
When to Use:
- Hardware management troubleshooting
- Verifying IPMI configuration
- Remote management diagnostics
CLI Syntax:
IPMI Chassis Status#
Purpose: Shows physical chassis status and power information.
When to Use:
- Power management troubleshooting
- Hardware status verification
- Physical system diagnostics
CLI Syntax:
IPMI FRU Info#
Purpose: Displays Field Replaceable Unit information from IPMI.
When to Use:
- Hardware inventory management
- Component identification
- Warranty and support information
CLI Syntax:
IPMI LAN Info#
Purpose: Shows IPMI network configuration details.
When to Use:
- IPMI network troubleshooting
- Remote management connectivity issues
- BMC network configuration verification
CLI Syntax:
IPMI MC Reset#
Purpose: Resets the Management Controller (BMC).
When to Use:
- Resolving IPMI communication issues
- BMC troubleshooting
- Management interface recovery
Management Impact
This will temporarily disrupt IPMI/BMC functionality during reset.
CLI Syntax:
IPMI Sensor Data Repository#
Purpose: Displays comprehensive sensor data repository information.
When to Use:
- Detailed hardware monitoring
- Sensor configuration verification
- Hardware diagnostics and analysis
CLI Syntax:
IPMI Sensors#
Purpose: Shows current sensor readings (temperature, voltage, fans, etc.).
When to Use:
- Hardware health monitoring
- Temperature and power diagnostics
- Environmental troubleshooting
CLI Syntax:
IPMI System Event Logs#
Purpose: Displays system event logs from IPMI.
When to Use:
- Hardware error analysis
- System event troubleshooting
- Historical hardware issue investigation
CLI Syntax:
LED Control (Drive)#
Purpose: Controls LED indicators on storage drives for physical identification.
When to Use:
- Physical drive identification
- Drive replacement procedures
- Hardware maintenance tasks
Parameters:
- Drive: Target drive identifier
- Action: LED on/off/blink
CLI Syntax:
Logs#
Purpose: Displays node system logs and journal entries.
When to Use:
- System troubleshooting
- Error analysis
- Performance issue investigation
CLI Syntax:
Network Bonding#
Purpose: Shows network bonding configuration and status.
When to Use:
- Network redundancy troubleshooting
- Bond interface diagnostics
- Link aggregation verification
CLI Syntax:
OpenSSL Speed#
Purpose: Tests cryptographic performance of the node's CPU.
When to Use:
- Performance benchmarking
- Cryptographic capability testing
- Hardware acceleration verification
CLI Syntax:
Ping#
Purpose: Tests network connectivity from the node.
When to Use:
- Basic connectivity testing
- Network path verification
- Latency measurement
Parameters:
- Destination: Target IP or hostname
- Count: Number of ping packets
CLI Syntax:
RAS Query#
Purpose: Queries Reliability, Availability, and Serviceability information.
When to Use:
- Hardware reliability assessment
- Error correction status
- Memory and processor diagnostics
CLI Syntax:
S.M.A.R.T. Diagnostic Test#
Purpose: Initiates SMART diagnostic tests on storage drives.
When to Use:
- Proactive drive health testing
- Storage troubleshooting
- Preventive maintenance
Parameters:
- Drive: Target drive for testing
- Test Type: Short, long, or conveyance test
CLI Syntax:
Test Duration
SMART tests can take significant time to complete, especially long tests.
S.M.A.R.T. Information#
Purpose: Displays SMART attributes and health information for drives.
When to Use:
- Drive health assessment
- Predictive failure analysis
- Storage performance monitoring
Parameters:
- Drive: Target drive identifier
CLI Syntax:
Show Block Devices#
Purpose: Lists all block devices visible to the node.
When to Use:
- Storage device inventory
- Drive recognition troubleshooting
- Storage configuration verification
CLI Syntax:
TCP Connection Test#
Purpose: Tests TCP connectivity to remote hosts and ports.
When to Use:
- Service connectivity verification
- Firewall rule testing
- Network troubleshooting
Parameters:
- Host: Target hostname or IP
- Port: TCP port number
CLI Syntax:
TCP Dump#
Purpose: Captures network packets for detailed traffic analysis.
When to Use:
- Network troubleshooting
- Security analysis
- Protocol debugging
Parameters:
- Interface: Network interface to monitor
- Filter: Packet filter expression
CLI Syntax:
Performance Impact
Packet capture can affect node performance. Use carefully in production.
Top CPU Usage#
Purpose: Shows processes consuming the most CPU resources.
When to Use:
- Performance troubleshooting
- Resource utilization analysis
- Process monitoring
CLI Syntax:
Top Network Usage#
Purpose: Displays processes with highest network utilization.
When to Use:
- Network performance analysis
- Bandwidth utilization troubleshooting
- Network-intensive process identification
CLI Syntax:
Trace Route#
Purpose: Traces network path from the node to a destination.
When to Use:
- Network routing troubleshooting
- Path analysis
- Connectivity issue diagnosis
Parameters:
- Destination: Target IP or hostname
CLI Syntax:
What's My IP#
Purpose: Shows the node's external IP address.
When to Use:
- External connectivity verification
- NAT configuration troubleshooting
- Network configuration validation
CLI Syntax:
Best Practices#
Hardware Diagnostics Workflow#
- System Overview: Start with DMI Table and IPMI Chassis Status
- Health Check: Review IPMI Sensors and System Event Logs
- Storage Analysis: Use S.M.A.R.T. Information and tests
- Network Verification: Check network bonding and interface status
- Performance Assessment: Monitor CPU and network usage
Preventive Maintenance#
- Regular Health Checks: Monitor IPMI sensors and SMART data
- Log Review: Regularly check system event logs
- Performance Monitoring: Track CPU and network utilization trends
- Drive Health: Schedule periodic SMART diagnostic tests
Safety Considerations#
- Impact Assessment: Consider the impact of diagnostic commands on production systems
- Support Coordination: Involve VergeOS support for complex hardware issues
- Documentation: Keep records of diagnostic results for trend analysis
- Change Control: Follow change management procedures for hardware modifications
Troubleshooting Common Node Issues#
Hardware Problems#
- Check IPMI Sensors for temperature, voltage, and fan issues
- Review IPMI System Event Logs for hardware errors
- Use S.M.A.R.T. Information to assess drive health
Network Issues#
- Verify Network Bonding status and configuration
- Use Ethernet Tool to check interface settings
- Test connectivity with Ping and Trace Route
Performance Issues#
- Monitor Top CPU Usage for resource constraints
- Check Top Network Usage for bandwidth utilization
- Review Logs for performance-related errors
Storage Issues#
- Use Show Block Devices to verify drive recognition
- Run S.M.A.R.T. Diagnostic Tests for comprehensive drive testing
- Check RAS Query for memory and storage error information
Next Steps#
After mastering node diagnostics, consider exploring:
- Advanced hardware monitoring and alerting
- Automated diagnostic scripting
- Integration with external monitoring systems
- Preventive maintenance scheduling
For complex hardware issues or unusual diagnostic results, contact VergeOS Support with detailed diagnostic output and system information.