Hot-Add CPU and Hot-Add Memory are VMware vSphere features that let you add virtual CPUs or memory to a running virtual machine (VM) without a reboot. This flexibility comes with drawbacks that require careful consideration. Because of the potential risk, Expedient does not enable Hot-Add on VMs by default, but can enable it at a client's request.
✅ Pros of Enabling Hot-Add
Enabling Hot-Add capabilities provides several clear advantages for operational flexibility and VM uptime:
Zero Downtime Scaling: The primary benefit is the ability to increase vCPU or RAM resources on a live, running VM. This is critical for environments where services cannot afford a reboot to scale.
Performance Troubleshooting: It allows you to quickly add resources during a peak load event or performance investigation to see if resource exhaustion is the bottleneck, without interrupting the running application.
Simplified Resource Management: It reduces the need to schedule maintenance windows just to adjust a VM's compute resources.
❌ Cons and Risks of Enabling Hot-Add
While convenient, enabling Hot-Add CPU and Memory carries a few important caveats:
| Feature | Consequence of Enabling |
|---|---|
| Hot-Add CPU | Disables vNUMA Topology: Enabling Hot-Add CPU causes the VM to lose its optimal Non-Uniform Memory Access (NUMA) locality awareness. This means the guest OS may no longer be able to efficiently allocate memory close to the CPUs it uses, potentially leading to performance degradation in CPU-intensive or highly-threaded workloads. |
| Hot-Add Memory | Memory Reservation: To enable Hot-Add Memory, all of the VM's configured memory must be reserved. This can impact overall resource availability on the host and may lead to admission control failures when powering on other VMs. |
| Both | One-Way Street: Resources can only be added, not removed, while the VM is running. The VM must be powered off to remove hot-added resources or to disable the Hot-Add feature. |
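
To see which of the consequences above apply to a given VM, it helps to know whether Hot-Add is currently enabled. The following is a minimal sketch, assuming you have a copy of the VM's `.vmx` configuration text and that it uses the `vcpu.hotadd` and `mem.hotadd` keys. The function name `hot_add_flags` and the sample configuration are illustrative and do not come from any VMware tool.

```python
def hot_add_flags(vmx_text):
    """Report which Hot-Add features a .vmx configuration enables.

    Assumes the simple key = "value" layout of a .vmx file and the
    vcpu.hotadd / mem.hotadd keys; a missing key is treated as disabled.
    """
    settings = {}
    for line in vmx_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"').upper()
    return {
        "cpu_hot_add": settings.get("vcpu.hotadd") == "TRUE",
        "memory_hot_add": settings.get("mem.hotadd") == "TRUE",
    }


# Illustrative sample, not taken from a real VM.
sample = '''
numvcpus = "4"
memSize = "8192"
vcpu.hotadd = "TRUE"
mem.hotadd = "FALSE"
'''
flags = hot_add_flags(sample)
print(flags)  # {'cpu_hot_add': True, 'memory_hot_add': False}
```

A result of `cpu_hot_add: True` means the vNUMA consequence in the table applies to that VM.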
⚠️ Caveats and Compatibility Issues
Beyond the performance impact, administrators must be aware of potential conflicts and guest OS limitations related to Hot-Add:
vNUMA Performance Impact: Disabling vNUMA by enabling Hot-Add CPU can lead to serious performance issues, particularly on large VMs running NUMA-aware applications such as SQL Server or Oracle. VMware documents this behavior.
Reference: Search the Broadcom/VMware knowledge base for "vNUMA is disabled if vCPU hotplug is enabled" or related performance white papers.
Virtualization Based Security (VBS) Conflict: On modern Windows Server and Windows 10/11 guests, enabling Virtualization Based Security (VBS), which powers features like Credential Guard, makes Hot-Add CPU, Hot-Add Memory, and PCI Passthrough unavailable for that VM.
Reference: Consult the VMware Guest OS Compatibility Matrix or search for known conflicts between VBS and Hot-Add.
Guest OS Driver Issues: Some specific operating system versions may have issues correctly recognizing hot-added CPUs or require manual intervention:
Windows Server 2016: May not display hot-added processors in Device Manager and can show a non-functioning "HID Button over Interrupt Driver." This typically requires a manual driver update or a move to Windows Server 2019 or later.
Older Linux Kernels: While modern kernels support Hot-Add, older Linux distributions may not recognize the resources until a reboot. Always verify support for your specific kernel version.
USB Passthrough Device Disconnection: Hot-adding vCPUs to a running VM will momentarily disconnect and reconnect any USB passthrough devices connected to that VM. This can interrupt services relying on that hardware.
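For the Linux guest case above, one way to check whether hot-added vCPUs were recognized but left offline is to read the kernel's CPU entries in sysfs. This is a minimal sketch, assuming a guest that exposes `/sys/devices/system/cpu/cpuN/online`. The helper name `offline_cpus` is hypothetical, and the root path is a parameter only so the function can be pointed at a test directory.

```python
import re
from pathlib import Path


def offline_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return the indexes of CPUs whose sysfs 'online' flag reads 0."""
    offline = []
    for entry in Path(sysfs_root).glob("cpu[0-9]*"):
        match = re.fullmatch(r"cpu(\d+)", entry.name)
        flag = entry / "online"
        # cpu0 often has no 'online' file because it cannot be offlined.
        if match and flag.is_file() and flag.read_text().strip() == "0":
            offline.append(int(match.group(1)))
    return sorted(offline)


root = Path("/sys/devices/system/cpu")
if root.is_dir():
    print("offline CPUs:", offline_cpus(root))
else:
    print("no CPU sysfs tree found; run this inside a Linux guest")
```

If the list is not empty after a hot-add, the CPUs are visible to the kernel but not yet in use. Writing `1` to the corresponding `online` file as root brings a CPU online. Some distributions do this automatically through udev rules.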
🧐 Recommendation and Best Practices
The decision to enable Hot-Add should be based on a clear understanding of the VM's purpose and performance requirements:
High-Performance/Tier 1 Workloads (Databases, VDI): Keep Hot-Add CPU DISABLED. The potential performance degradation from losing optimal vNUMA is too great. Size these VMs generously and schedule downtime for resource changes.
General Purpose/Tier 3 Workloads (Web Servers, File Servers): If the application isn't highly sensitive to CPU/memory latency and the ability to scale without downtime is critical, Hot-Add may be enabled.
Hot-Add Memory Consideration: If you enable Hot-Add Memory, understand that you must fully reserve the VM's configured memory, which directly reduces the host's memory available for overcommitment.
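The guidance above can be sketched as a small decision helper. This is an illustration under stated assumptions, not an Expedient or VMware tool: `VmProfile` and `recommend_hot_add` are hypothetical names, and leaving memory Hot-Add off for NUMA-sensitive VMs is an assumption, since the recommendations only address CPU explicitly for that tier.

```python
from dataclasses import dataclass


@dataclass
class VmProfile:
    """Hypothetical description of a VM; not a vSphere object."""
    name: str
    numa_sensitive: bool       # e.g. a large SQL Server or Oracle VM
    needs_live_scaling: bool   # resizing cannot wait for a maintenance window
    uses_vbs: bool = False     # Virtualization Based Security enabled in the guest


def recommend_hot_add(vm: VmProfile) -> dict:
    """Return a suggested Hot-Add configuration with the reasoning behind it."""
    if vm.uses_vbs:
        return {"cpu": False, "memory": False,
                "reason": "VBS makes Hot-Add unavailable"}
    if vm.numa_sensitive:
        # Assumption: memory Hot-Add is also left off for these VMs.
        return {"cpu": False, "memory": False,
                "reason": "keep vNUMA intact; resize during scheduled downtime"}
    if vm.needs_live_scaling:
        return {"cpu": True, "memory": True,
                "reason": "latency-tolerant workload that must scale without downtime"}
    return {"cpu": False, "memory": False,
            "reason": "no need for live scaling; keep the default"}


# Illustrative VM names only.
for vm in (
    VmProfile("sql-prod-01", numa_sensitive=True, needs_live_scaling=True),
    VmProfile("web-03", numa_sensitive=False, needs_live_scaling=True),
):
    print(vm.name, recommend_hot_add(vm))
```

The checks run from the hardest constraint to the softest: a VBS conflict rules Hot-Add out entirely, NUMA sensitivity outweighs convenience, and only then does the need for live scaling decide the outcome.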