MVM Clusters

Important

MVM clusters are currently in a closed beta with a selected subset of the customer base. It is not for production use and is not available to appliances that are not part of the closed beta program. Expect that MVM Cluster capabilities, as well as this documentation, will change significantly over the coming months.

MVM virtualization solution is a hypervisor clustering technology utilizing KVM. Beginning with just a few basic Ubuntu boxes, Morpheus can create a cluster of hypervisor hosts complete with monitoring, failover, easy migration of workloads across the cluster, and zero-downtime maintenance access to hypervisor host nodes. All of this is backed by Morpheus Tenant capabilities, a highly-granular RBAC and policy engine, and Instance Type library with automation workflows.

Features

Host Features

  • Automated MVM cluster provisioning

  • Ceph storage configuration for multi-node clusters

  • Cluster and individual host monitoring

  • Add hosts to existing clusters

  • Console support for cluster hosts

  • Add, edit and remove networks and data stores from clusters

  • Gracefully take hosts out of service with maintenance mode

  • Migration of workloads across hosts

  • Configurable automatic failover of running workloads when a host is lost

  • Integration with Morpheus costing

  • Governance through Morpheus RBAC, Tenancy, and Policies

VM Features

  • Workload provisioning and monitoring

  • Console support for running workloads

  • Migration of VMs across hosts

  • Configure automatic failover for individual VMs in the event a host is lost

  • Reconfigure running workloads to resize plan, add/remove disks, and add/remove network interfaces

  • Backup and restore MVM workloads

  • Take snapshots and revert to snapshots

  • Morpheus library and automation support

  • Integration with Morpheus costing features

Base Cluster Details

An MVM cluster using the hyperconvered (HCI) Layout consists of at least three hosts. Physical hosts are recommended to experience full performance of the MVM solution. In smaller environments, it is possible to create an MVM cluster with three nested virtual machines, a single physical host (non-HCI only), or a single nested virtual machine (non-HCI only) though performance may be reduced. With just one host it won’t be possible to migrate workloads between hosts or take advantage of automatic failover. Currently, a host must be a pre-existing Ubuntu 22.04 box with environment and host system requirements contained in this section. Morpheus handles cluster configuration by providing the IP address(es) for your host(s) and a few other details. Details on adding the cluster to Morpheus are contained in the next section.

Hardware Requirements

  • Operating System: Ubuntu 22.04

  • CPU: One or more 64-bit x86 CPUs, 1.5 GHz minimum with Intel VT or AMD-V enabled

  • Memory: 4 GB minimum. For non-converged Layouts, configure MVM hosts to use shared external storage, such as an NFS share or iSCSI target. Converged Layouts utilize Ceph for clustered storage and require a 4 GB minimum memory per Ceph disk

  • Disk Space: For converged storage, a data disk of at least 500 GB is required for testing. More storage will be needed for production clusters. An operating system disk of 15 GB is also required. Clusters utilizing non-converged Layouts can configure external storage (NFS, etc.) while Morpheus will configure Ceph for multi-node clusters

  • Network Connectivity: MVM hosts must be assigned static IP addresses. They also need DNS resolution of the Morpheus appliance and Internet access in order to download and install system packages for MVM dependencies, such as KVM, Open vSwitch (OVS), and more

Note

Ubuntu 22.04 uses netplan for networking. To configure a static IP address, change into the directory holding the config files (cd /etc/netplan) and edit the existing configuration file (/etc/netplan/50-cloud-init.yaml or /etc/netplan/00-installer-config.yaml or /etc/netplan/01-netcfg.yaml). If desired, backup the existing configuration prior to editing it (cp /etc/netplan/<file-name>.yaml /etc/netplan/<file-name>.yaml.bak). For additional information on configuration file formatting, refer to netplan documentation. Once the configuration is updated, validate and apply it (netplan try). The try command will validate the configuration and apply it if it’s valid. If invalid, it will automatically be rolled back.

Note

Clustered storage needs as much network bandwidth as possible. Network interfaces of at least 10 Gbps with jumbo frames enabled are required for clustered storage and for situations when all traffic is running through the management interface (when no compute or storage interface is configured). It’s highly likely that performance will be unacceptable with any lower configurations.

MVM Network Communication Ports

Description

Source

Destination

Port

Protocol

Morpheus Agent communication with the Morpheus appliance

MVM Host

Morpheus appliance server

443

TCP

MVM host configuration and management

Morpheus appliance server

MVM Host

22

TCP

MVM interhost communication for clustered deployments

MVM Host

MVM Host

22

TCP

Morpheus server SSH access for deployed virtual machines

Morpheus appliance server

MVM-hosted virtual machines

22

TCP

Morpheus server WinRM (HTTP) access for deployed virtual machines

Morpheus appliance server

MVM-hosted virtual machines

5985

TCP

Morpheus server WinRM (HTTPS) access for deployed virtual machines

Morpheus appliance server

MVM-hosted virtual machines

5986

TCP

Ceph Storage

MVM Host

MVM Host

3300

TCP

Ceph Storage

MVM Host

MVM Host

6789

TCP

Ceph MDS/ODS

MVM Host

MVM Host

6800-7300

TCP

Example Cluster Deployment

In this example cluster, each host box consists of:

  • 4 vCPU

  • 16 GB memory

  • 20 GB OS boot disk

  • 250 GB data disk (deployed to /dev/dsb)

  • 3 network interfaces for management, storage, and compute traffic (set to eth0, eth1, and eth2, respectively)

Note

250 GB data disks used in this example are simply for demonstration purposes. A typical test cluster should consist of at least 500 GB storage and more will be required for production. Do not raid disks on physical servers. Multiple disks may be used and they will be added to the total Ceph storage in one large volume. In the DATA DEVICE configuration during cluster setup, give a comma-separated list of disk devices if required.

MVM clusters must also live in Morpheus-type Clouds (See Infrastructure > Clouds). A pre-existing Morpheus Cloud may be used or a new Cloud could be created to handle MVM management.

Provisioning the Cluster

As mentioned in the previous section, this example is starting with three provisioned Ubuntu 22.04 boxes. I also have a Morpheus-type Cloud to house the cluster. Begin the cluster creation process from the Clusters list page (Infrastructure > Clusters). Click + ADD CLUSTER and select “MVM Cluster”.

../../_images/createCluster.png

Morpheus gives the option to select a hyperconverged infrastructure (HCI) LAYOUT or non-HCI. In this example, the HCI Layout is used (requires a three-node minimum). Next, configure the names and IP addresses for the host boxes (SSH HOST). The SSH HOST name configuration is simply a display name in Morpheus, it does not need to be a hostname. By default, configuration space is given for three hosts which is what this example cluster will have. You must at least configure one and it’s possible to add more by clicking the (+) button. The SSH PORT is pre-configured for port 22, change this value if applicable in your environment. Next, set a pre-existing user on the host boxes (SSH USERNAME and SSH PASSWORD) and SSH KEY. Use a regular user with sudo access.

../../_images/createClusterTop.png

In the next part of the modal, you’ll configure the storage devices and network interfaces. When Ceph initializes, it needs to be pointed to an initial data device (or devices). Configure this in the DATA DEVICE field. Multiple devices may be given in a comma-separated list and will be added to the total Ceph storage as one large volume. Find your disk names, if needed, with the lsblk command. In my case, the target device is located at /dev/sdb.

Though not strictly required, it’s recommended to have separate network interfaces to handle cluster management, storage traffic, and compute. In this example case, eth0 is configured as the MANAGEMENT NET INTERFACE which handles communication between the cluster hosts. eth1 is configured as the STORAGE NET INTERFACE and eth2 is configured as the COMPUTE NET INTERFACE. The COMPUTE VLANS field can take a single value (ex. 1) or a range of values (ex. 22-25). This will create OVS port group(s) selectable as networks when provisioning workloads to the cluster. If needed, you can find your network interface names with the ip a command.

Finally, only one CPU TYPE is currently supported (x86_64) though this may change in the future. For CPU MODEL configuration, we surface the entire database of model configurations from libvirt. If unsure or if you don’t know of a specific reason to choose one or the other, select host-model which is the default option.

../../_images/createClusterBottom.png

At this point we’ve kicked off the process for configuring the cluster nodes. Drill into the Cluster detail page and click on the History tab. Here we can monitor the progress of configuring the cluster. Morpheus will run scripts to install KVM, install Ceph, install OVS, and to prepare the cluster. In just a short time, the cluster provisioning should complete and the cluster will be ready to deploy workloads.

Provisioning a Workload

At this point, the cluster is ready for workloads to be provisioned to it. The system default Ubuntu Instance Type contains a compatible Layout for MVM deployment. Add an Instance from the Instances list page (Provisioning > Instances). After selecting the Instance Type, choose a Group that allows for selection of the Morpheus-type Cloud containing the MVM cluster.

../../_images/groupCloud.png

After moving to the next tab, select a Plan based on resource needs. From the RESOURCE POOL field, select the desired MVM cluster. When configuring VOLUMES for the new workload, note that space can be claimed from the Ceph volume. Within NETWORKS, we can add the new workload to one of the VLANS set up as part of cluster creation. Finally, note that we can choose the HOST the workload should run on.

../../_images/configureTab.png

Review and complete the provisioning wizard. After a short time, the workload should be up and running. With a workload now running on the cluster, we can take a look at some of the monitoring, migration, failover, and other actions we can take for workloads running on MVM clusters.

Monitoring the Cluster

With the server provisioned and a workload running, take a look at the monitoring and actions capabilities on the cluster detail page (Infrastructure > Clusters, then click on the new MVM cluster). View cluster performance and resource usage (Summary and Monitoring tabs), drill into individual hosts (Hosts tab), see individual workloads (VMs tab), and more.

Moving Workloads Between Hosts

To manually move workloads between hosts, drill into the detail page for the VM (from the VMs tab of the cluster detail page). Click ACTIONS and select “Manage Placement”. Choose a different host and select from the following placement strategies:

  • Auto: Manages VM placement based on load

  • Failover: Moves VMs only when failover is necessary

  • Pinned: Will not move this workload from the selected host

../../_images/managePlacement.png

Within a short time, the workload is moved to the new host.

Adding hosts

The process of adding hosts to a pre-existing cluster is very similar to the process of provisioning the cluster initially. The requirements for the new worker node will be identical to the nodes initially added when the cluster was first provisioned. See the earlier sections in this guide for additional details on configuring the worker nodes.

Too add the host, begin from the MVM Cluster detail page (selected from the list at Infrastructure > Clusters). From the Cluster detail page, click ACTIONS and select “Add Worker”. Configurations required are the same as those given when the cluster was first created. Refer to the section above on “Provisioning the Cluster” for a detailed description of each configuration.

Once Morpheus has completed its configuration scripts and joined the new worker node to the cluster, it will appear in a ready state within the Hosts tab of the Cluster detail page. When provisioning workloads to this Cluster in the future, the new node will be selectable as a target host for new Instances. It will also be an available target for managing placement of existing VMs running on the cluster.

Note

It’s useful to confirm all scripts related to creating the new host and joining the new host to the cluster completed successfully. To confirm, navigate to the detail page for the new host (Infrastructure > Clusters > Selected Cluster > Hosts Tab > Selected Host) and click on the History tab. Confirm all scripts, even those run on the pre-existing hosts, completed successfully as it’s possible the new host was added successfully (green status) but failed in joining the cluster. When such a situation occurs it may appear adding the new host was successful though it will not be possible to provision workloads onto it due to not joining the cluster successfully.

../../_images/addHost.png

Maintenance Mode

MVM cluster hosts can be easily taken out of service for maintenance when needed. From the host detail page, click ACTIONS and then click “Enter Maintenance.” When entering maintenance mode, the host will be removed from the pool. Live VMs that can be migrated will be moved to new hosts. VMs that are powered off will also be moved when possible. When a live VM cannot be moved (such as if it’s “pinned” to the host), the host will not go into maintenance mode until that situation is cleared. You could manually move a VM to a new host or you could power it down if it’s non-essential. After taking that action, attempt to put the host into maintenance mode once again. Morpheus UI provides a helpful dialog which shows you which VMs live on the host are to be moved as the host goes into maintenance mode. When maintenance has finished, go back to the ACTIONS menu and select “Leave Maintenance.”

../../_images/enterMaintenance.png

Failover

MVM supports automatic failover of running workloads in the event of the loss of a host. Administrators can control the failover behavior through the “Manage Placement” action on any running VM. From the VM detail page, click ACTIONS and select “Manage Placement”. Any VM with a placement strategy of “Auto” or “Failover” will be eligible for an automatic move in the event its host is lost. When the loss of a host does occur, the workload will be up and running from a different cluster host within just a short time if it’s configured to be moved during an automatic failover event. Any VMs pinned to a lost host will not be moved and will not be accessible if the host is lost. When the host is restored, those VMs will be in a stopped state and may be restarted if needed.

This three-node cluster has three VMs running on the first host:

../../_images/threeRunningWorkloads.png

Each of these VMs is configured for a different failover strategy. When the host is lost, we should expect to see the first two VMs moved to an available host (since they have the “Auto” and “Failover” placement strategies, respectively). We should not see the third VM moved.

../../_images/auto.png ../../_images/failover.png ../../_images/pinned.png

After loss of the host these three VMs were running on, we can see the lost host still has one associated VM in a stopped state. The other two VMs are running on a second host which is still available.

../../_images/lostNode.png ../../_images/liveNode.png

When the lost host returns, the moved VMs will come back to their original host. The third VM is associated with this host as well and is in a stopped state until it is manually restarted.

Decommissioning a CEPH-backed Host

Morpheus MVM clusters utilize global pools and for that reason, we need to remove the object storage daemon (OSD) from each host manually prior to decommissioning the host and removing it from the cluster.

First, put the host into maintenance mode and allow time for any running VMs to be migrated to other hosts. See the section above, if needed, for additional details on maintenance mode.

Outing the OSDs

Begin by checking the cluster health. If the cluster is not in a healthy state, an OSD should not be removed:

ceph -s

You should see something similar to the following:

$ ceph -s
  cluster:
    id:       bxxxx-bxxxxx-4xxx...
    health:   HEALTH_OK

Important

Do not remove an OSD if the cluster health does not return HEALTH_OK.

Get the OSD IDs. The following command will return a map of OSDs and their ID values:

ceph osd df tree

We’re now ready to out the OSD, do so with the following command:

ceph osd out osd.<osd-id>

Wait for the cluster to rebalance. Do not remove any additional OSDs until the cluster has rebalanced. As above, you can use ceph -s to check cluster status. Wait until something like this:

data:
  volumes: 1/1 healthy
  pools:   5 pools, 593 pgs
  objects: 6.69k objects, 19 GiB
  usage:   48 GiB used, 2.9 TiB / 2.9 TiB avail
  pgs:     677/20079 objects degraded (3.372%)
           1115/20079 objects misplaced (5.553%)
           567 active+clean
           13  active+recovery_wait+degraded
           6   active+remapped+backfill_wait
           6   active+recovery_wait+undersized+degraded+remapped
           1   active+recovering+undersized+degraded+remapped

…becomes something like this:

data:
  volumes: 1/1 healthy
  pools:   5 pools, 593 pgs
  objects: 6.69k objects, 19 GiB
  usage:   53 GiB used, 2.9 TiB / 2.9 TiB avail
  pgs:     593 active+clean

This process must be completed for each OSD that is to be removed. Once again, wait for the cluster to rebalance between each OSD removal.

Stopping OSD service

We can now stop and remove the OSD service for each OSD that should be removed. Stop the OSD service:

systemctl stop ceph-osd@<osd-id>.service

Remove the OSD service:

systemctl disable ceph-osd@<osd-id>.service

Removing OSDs from the CRUSH map

Remove the OSDs from the CRUSH map:

ceph osd crush remove ods.<osd-id>

This must be repeated for each OSD that should be removed. Next, validate the removal:

ceph osd crush tree

At this point once again, wait for the cluster rebalance to complete. Run ceph -s and look for a healthy state similar to the following:

data:
  volumes: 1/1 healthy
  pools:   5 pools, 593 pgs
  objects: 6.69k objects, 19 GiB
  usage:   53 GiB used, 2.9 TiB / 2.9 TiB avail
  pgs:     593 active+clean

Remove the Ceph Monitor (ceph-mon) service

First find the service:

systemctl --type=service --state=running | grep ceph-mon

The service should look something like: ceph-mon@<hostname provided at cluster provision time>.service

Stop the service:

systemctl stop ceph-mon@<hostname>.service

Remove the monitor by its ID. The ID is the part between “ceph-mon@” and “.service”. Generally, this is the hostname.

ceph mon remove <hostname>

Remove the hostname from CRUSH

ceph osd crush rm <hostname>

Check the cluster health once again to confirm the cluster is in a healthy state:

ceph -s

Final Steps

Cleanup the OSD auth. Repeat this step for each OSD that must be removed.

ceph auth del osd.<osd-id>

Validate the removal:

ceph auth list

Remove the last of the data and repeat this step for each OSD that should be removed:

ceph osd rm <osd-id>

Important

Note that the above command does not prepend “osd.” before the OSD ID.

At this point you can now delete the host cluster from Morpheus.