NAME
gyptazy.ch

OPTIONS

CONTENT
ProxLB - (Re)Balance VM Workloads Across Nodes in Proxmox Clusters. (2024-07-06):
ProxLB (PLB) is an open-source Proxmox load balancer, but a different kind. It is an application that optimizes the distribution of virtual machines (VMs) across Proxmox cluster nodes to improve efficiency and performance. Using the Proxmox API, ProxLB gathers and analyzes resource metrics from both the cluster nodes and the running VMs, including CPU usage, memory consumption, and local disk utilization.

A key feature of ProxLB is its intelligent rebalancing capability, which redistributes VMs based on their memory, disk, or CPU usage. In these cases, the VM's actual consumption (for example, the memory really in use) is taken rather than its potential maximum. This ensures that no single node is overburdened while others remain underutilized, which improves cluster performance and reliability. By evenly distributing resources, ProxLB helps prevent performance bottlenecks and improves the overall stability of the cluster. Efficient rebalancing leads to better utilization of available resources, potentially reducing the need for additional hardware investments and lowering operational costs. Moreover, automated rebalancing reduces the need for manual intervention, allowing operators to focus on other critical tasks.
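
To make the idea concrete, here is a minimal sketch of a greedy rebalancing pass by memory. It is not ProxLB's actual algorithm; the data layout, the function name `rebalance_by_memory`, and the `max_spread` and `max_moves` parameters are assumptions made purely for illustration.

```python
def rebalance_by_memory(nodes, max_spread=10, max_moves=100):
    """Greedy sketch: move VMs from the fullest node to the emptiest one.

    nodes: {node_name: {"total": int, "vms": {vm_name: used_memory}}}
    (hypothetical layout). Mutates `nodes` and returns the planned moves
    as a list of (vm, source_node, target_node) tuples.
    """
    def usage(name):
        node = nodes[name]
        return 100 * sum(node["vms"].values()) / node["total"]

    moves = []
    for _ in range(max_moves):  # hard cap so the loop always terminates
        busiest = max(nodes, key=usage)
        idlest = min(nodes, key=usage)
        if usage(busiest) - usage(idlest) <= max_spread:
            break  # balanced enough
        vms = nodes[busiest]["vms"]
        # Move the smallest VM so that each step is a small correction.
        vm = min(vms, key=vms.get)
        share_src = 100 * vms[vm] / nodes[busiest]["total"]
        share_dst = 100 * vms[vm] / nodes[idlest]["total"]
        if usage(idlest) + share_dst > usage(busiest) - share_src:
            break  # the move would overshoot and just swap the roles
        nodes[idlest]["vms"][vm] = vms.pop(vm)
        moves.append((vm, busiest, idlest))
    return moves
```

A real implementation would additionally have to respect filters, groups, and the limits of live migration; this sketch only shows the balancing loop itself.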

Features
* Rebalance the cluster by:
    * Memory
    * Disk (only local storage)
    * CPU
* Performing
    * Periodically
    * One-shot Solution
* Filter
    * Exclude nodes
    * Exclude virtual machines
* Balancing
    * Based on current resource usage (efficiency)
    * Based on assigned resources (avoid overprovisioning)
* Grouping
    * Include groups (VMs that are rebalanced to nodes together)
    * Exclude groups (VMs that must run on different nodes)
    * Ignore groups (VMs that should be untouched)
* Dry-run support
    * Human readable output in CLI
    * JSON output for further parsing
* Migrate VM workloads away (e.g. maintenance preparation)
* Fully based on Proxmox API
* Usage
    * One-Shot (one-shot)
    * Periodically (daemon)
    * Proxmox Web GUI Integration (optional)
* API Interface
    * Providing best/optimal new node (for automated new VM placing)
    * Current VM & node statistics
    * Rebalanced VM & node statistics
* Docker Support
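
The "best/optimal new node" item in the list above can be pictured with a small sketch. How ProxLB actually ranks nodes is not described here; the function below simply assumes that the best node is the one with the most free memory that still fits the new VM. The field names `mem` and `maxmem` and the function name are illustrative assumptions.

```python
def best_node_for_new_vm(nodes, required_mem):
    """Return the node with the most free memory that fits the VM, or None.

    nodes: {node_name: {"mem": used, "maxmem": total}} (assumed layout).
    """
    free = {
        name: stats["maxmem"] - stats["mem"]
        for name, stats in nodes.items()
        if stats["maxmem"] - stats["mem"] >= required_mem
    }
    # No candidate left means that no node can host the VM.
    return max(free, key=free.get) if free else None
```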

The cluster rebalancing process optimizes memory, local disk storage, and CPU usage, and can be executed either periodically or as a one-time solution. It includes filtering options to exclude specific nodes and virtual machines. Grouping mechanisms enable rebalancing groups of VMs together, ensuring certain groups run on different nodes, and keeping other groups untouched.
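
The semantics of include and exclude groups can be illustrated by checking a VM-to-node placement against them. This is only a sketch of the constraints as described above, not ProxLB's implementation; the function name and the data layout are assumptions.

```python
def check_groups(placement, include_groups=(), exclude_groups=()):
    """Return human-readable violations for a {vm_name: node_name} placement.

    include_groups: sets of VMs that should share a node.
    exclude_groups: sets of VMs that must run on different nodes.
    """
    problems = []
    for group in include_groups:
        used_nodes = {placement[vm] for vm in group if vm in placement}
        if len(used_nodes) > 1:
            problems.append(
                f"include group {sorted(group)} is split across {sorted(used_nodes)}"
            )
    for group in exclude_groups:
        first_on_node = {}
        for vm in sorted(group):
            node = placement.get(vm)
            if node is None:
                continue  # VM is not part of this placement
            if node in first_on_node:
                problems.append(f"{first_on_node[node]} and {vm} share node {node}")
            else:
                first_on_node[node] = vm
    return problems
```

VMs in an ignore group would simply be left out of any planned moves, so they need no check of this kind.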

Dry-run support provides human-readable output in the CLI and JSON output for further parsing. The system also facilitates migrating VM workloads away from nodes, useful for maintenance preparation. The entire process is fully based on the Proxmox API.
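
A dry run boils down to computing the planned migrations and printing them instead of executing them. The sketch below shows the two output styles mentioned above; the exact text and JSON structure that ProxLB emits are not shown in this post, so the format and the function name `render_plan` are assumptions.

```python
import json


def render_plan(moves, as_json=False):
    """Render planned (vm, source, target) moves without executing them."""
    if as_json:
        # Machine-readable form for further parsing.
        plan = [{"vm": vm, "from": src, "to": dst} for vm, src, dst in moves]
        return json.dumps(plan, indent=2)
    # Human-readable form for the CLI.
    lines = [f"{vm}: {src} -> {dst}" for vm, src, dst in moves]
    return "\n".join(lines) or "Nothing to rebalance."
```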

Rebalancing can be initiated either as a one-time action or periodically as a daemon, with optional integration into the Proxmox Web GUI. In this respect, ProxLB offers features similar to DRS (Distributed Resource Scheduler), which Broadcom's VMware vSphere provides for ESXi hosts.

[Figure: ProxLB integrated in the Proxmox Web UI]

ProxLB can also be used from the Proxmox Web UI, where rebalancing can be triggered. Frequent, recurring rebalancing of VMs can also be automated with the shipped systemd unit file, which keeps the VM workloads efficiently distributed across all nodes in the Proxmox cluster.

Installation
ProxLB supports various operating systems such as Debian, Ubuntu, RedHat, CentOS, and RockyLinux, as well as FreeBSD, NetBSD, and OpenBSD. It is written entirely in Python and only needs Python 3 and the proxmoxer library, so it can be used on any system that can reach the Proxmox API. Ready-to-use packages are listed in the Resources section. Here is a short look at how easy it is to get started with ProxLB:

  #> wget https://cdn.gyptazy.ch/files/amd64/debian/proxlb/proxlb_0.9.9_amd64.deb
  #> dpkg -i proxlb_0.9.9_amd64.deb
  #> # Adjust your config
  #> vi /etc/proxlb/proxlb.conf
  #> systemctl restart proxlb
  #> systemctl status proxlb

Integrating ProxLB into the Proxmox Web UI requires the additional package pve-proxlb-ui-plugin, which is only available for Debian-based systems and must be installed on all Proxmox nodes. Afterwards, the services pvedaemon and pveproxy should be restarted. A new menu item then becomes available in the web UI under Datacenter -> HA -> Rebalancing.

Usage
There are several ways to use ProxLB and several use cases for it. ProxLB can be used as a one-shot solution that rebalances your VM workloads across the cluster once. You can also use the shipped systemd unit proxlb to rebalance the VM workloads regularly, with the interval defined in the config file. This keeps the cluster balanced at all times, because dynamic changes in the VMs' resource usage are continuously rebalanced as well. For click-ops, an optional Proxmox web UI integration is available, so rebalancing can also be triggered from a graphical interface.
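
As a rough illustration of how a daemon could read its schedule from an INI-style config file, consider the sketch below. The section and option names (`service`, `daemon`, `interval_hours`, `balancing`, `method`) are invented for this example and are not the documented format of /etc/proxlb/proxlb.conf; consult the shipped config file for the real options.

```python
import configparser

# Hypothetical config content; the real option names may differ.
EXAMPLE_CONFIG = """
[balancing]
method = memory

[service]
daemon = yes
interval_hours = 24
"""


def load_schedule(text):
    """Return (run_as_daemon, interval_in_seconds) from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Fall back to a one-shot run with a daily interval if nothing is set.
    daemon = parser.getboolean("service", "daemon", fallback=False)
    hours = parser.getint("service", "interval_hours", fallback=24)
    return daemon, hours * 3600
```

With `daemon` disabled, the tool would run once and exit (one-shot); with it enabled, it would sleep for the returned interval between rebalancing runs.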

Besides this, there are many use cases for running ProxLB:

Rebalancing:
A key feature of ProxLB is its intelligent rebalancing capability, which redistributes VMs based on their memory, disk, or CPU usage. In these cases, the VM's actual consumption is taken rather than its potential maximum. This ensures that no single node is overburdened while others remain underutilized, which improves cluster performance and reliability. By default, balancing is done by memory, but other workloads may call for a different metric: databases or AI workloads, for example, may be CPU/GPU-heavy, and storage VMs may place a heavy load on the filesystem. Disk balancing, however, only makes sense with local storage and should not be used with shared storage.
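
Which node counts as "overburdened" depends on the chosen metric. The sketch below computes a node's utilization for each method and shows that the busiest node can differ between them. The field names (`cpu` as a fraction between 0 and 1, `mem`/`maxmem`, `disk`/`maxdisk`) follow what the Proxmox API typically reports for nodes, but treat them, and the function names, as assumptions of this example.

```python
def node_usage_percent(node, method="memory"):
    """Utilization of one node in percent for the given balancing method."""
    if method == "cpu":
        return 100 * node["cpu"]  # assumed to be a 0..1 load fraction
    fields = {"memory": ("mem", "maxmem"), "disk": ("disk", "maxdisk")}
    if method not in fields:
        raise ValueError(f"unknown method: {method}")
    used, total = fields[method]
    return 100 * node[used] / node[total]


def most_loaded(nodes, method="memory"):
    """Name of the node with the highest utilization for the method."""
    return max(nodes, key=lambda name: node_usage_percent(nodes[name], method))
```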

Balancing Modes:
With two different modes for rebalancing, ProxLB offers full flexibility in scheduling resources across your Proxmox cluster. Rebalancing can be based on the resources that the virtual machines currently use. This always uses the underlying node resources in the most efficient way and is especially interesting for overprovisioned clusters. The alternative is rebalancing by assigned resources. In this mode, ProxLB balances by the resources assigned to the virtual machines and tries to avoid any overprovisioning. Of course, this is only possible if the total assigned resources do not exceed the cluster's total resources. This option is useful for guaranteeing virtual machines their defined resources.
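
The difference between the two modes can be shown with a small calculation. In the sketch below, each VM carries both its currently used and its assigned memory; the key names `mem_used` and `mem_assigned` and both function names are assumptions for illustration only.

```python
def node_load(vms, total, mode="used"):
    """Node load in percent, by currently used or by assigned memory."""
    if mode not in ("used", "assigned"):
        raise ValueError(f"unknown mode: {mode}")
    key = "mem_used" if mode == "used" else "mem_assigned"
    return 100 * sum(vm[key] for vm in vms) / total


def fits_without_overprovisioning(cluster):
    """True if all assigned memory fits into the cluster's total memory.

    cluster: {node_name: {"total": int, "vms": [vm_dict, ...]}}
    """
    assigned = sum(
        vm["mem_assigned"] for node in cluster.values() for vm in node["vms"]
    )
    capacity = sum(node["total"] for node in cluster.values())
    return assigned <= capacity
```

A node hosting two VMs that each have 8 GB assigned but use only 2 GB and 3 GB looks lightly loaded in "used" mode and completely full in "assigned" mode on a 16 GB host, which is why the two modes can produce different placements.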

Maintenance Preparation:
ProxLB also allows you to move all workloads away from a specific node to other nodes in the cluster by using the filter options. This comes in handy when preparing maintenance on one or more specific nodes.
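
The effect of such an evacuation can be sketched as a simple planning step: every VM on the node under maintenance is assigned to whichever remaining node currently has the most free memory. This is an illustrative sketch, not ProxLB's code; the function name `evacuate` and the data layout are assumptions.

```python
def evacuate(node_name, nodes):
    """Plan moves that empty `node_name` ahead of maintenance.

    nodes: {name: {"free": free_memory, "vms": {vm_name: memory}}}
    Returns (vm, source, target) tuples; the input is not modified.
    """
    free = {n: d["free"] for n, d in nodes.items() if n != node_name}
    if not free:
        raise RuntimeError("no other node available")
    plan = []
    # Place the largest VMs first, since they are the hardest to fit.
    by_size = sorted(nodes[node_name]["vms"].items(), key=lambda kv: -kv[1])
    for vm, mem in by_size:
        target = max(free, key=free.get)
        if free[target] < mem:
            raise RuntimeError(f"no node has room for {vm}")
        free[target] -= mem
        plan.append((vm, node_name, target))
    return plan
```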

Filters:
Filters allow you to exclude one or more host nodes or one or more VMs from being relocated. This might make sense when there are dedicated special nodes present (e.g. with a bigger CPU for database workloads, different storage setups like NVMe-oF etc.) or when VMs should be prevented from being relocated (e.g. data policies, CPU pinning, etc.).
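
Conceptually, a filter just removes entries from the list of migration candidates before any balancing decision is made. A minimal sketch, assuming each VM is a dict with `name` and `node` keys and that filters are plain lists of exact names (both assumptions for this example):

```python
def apply_filters(vms, ignore_nodes=(), ignore_vms=()):
    """Return only the VMs that may be relocated.

    VMs on an ignored node and VMs that are ignored by name stay put.
    """
    ignore_nodes, ignore_vms = set(ignore_nodes), set(ignore_vms)
    return [
        vm for vm in vms
        if vm["node"] not in ignore_nodes and vm["name"] not in ignore_vms
    ]
```

An ignored node would also have to be removed from the list of possible migration targets, which this sketch leaves out.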

Example:
In this example, all VMs are initially placed on the virt01 node. After running ProxLB, the VMs are rebalanced across all nodes in the cluster so that all nodes have roughly equal resource usage.


Motivation
As a developer managing a cluster of virtual machines for my projects (e.g. BoxyBSD.com), I often faced the challenge of resource imbalance. Nodes within the cluster would become unevenly loaded, leading to inefficiencies, performance bottlenecks, and increased operational costs. Frustrated by the lack of an adequate solution, I developed ProxLB (PLB) to ensure better resource distribution across my clusters. My primary motivation for creating PLB stemmed from my work on the BoxyBSD project, where maintaining balanced nodes while running various VM workloads was a constant struggle. The absence of an efficient rebalancing mechanism made it challenging to achieve optimal performance and stability. Recognizing the necessity for a tool that could gather and analyze resource metrics from both the cluster nodes and the running VMs, I embarked on developing ProxLB.

As an advocate of the open-source philosophy, I believe in the power of community and collaboration. By sharing solutions like ProxLB, I aim to contribute to the collective knowledge and tools available to developers facing similar challenges. Open source fosters innovation, transparency, and mutual support, enabling developers to build on each other's work and create better solutions together.

Resources:
* ProxLB Source GitHub
* Docker Image: cr.gyptazy.ch/proxlb/proxlb:latest
* Package: ProxLB (Debian)
* Package: ProxLB (Ubuntu)
* Package: ProxLB (RedHat/RockyLinux)