LTS

    Innovation Version

      VM Live Migration

      Introduction

      Overview

      When a VM is running on a physical machine, the physical machine may be overloaded or underloaded due to uneven resource allocation. In addition, operations such as hardware replacement, software upgrade, networking adjustment, and troubleshooting are performed on the physical machine. Therefore, it is important to complete these operations without interrupting services. The VM live migration technology implements load balancing or the preceding operations on the premise of service continuity, improving user experience and work efficiency. VM live migration is to save the running status of the entire VM and quickly restore the VM to the original or even different hardware platforms. After the VM is restored, it can still run smoothly without any difference to users. Because the VM data can be stored on the current host or a shared remote storage device, openEuler supports shared and non-shared storage live migration.

      Application Scenarios

      Shared and non-shared storage live migration applies to the following scenarios:

      • When a physical machine is faulty or overloaded, you can migrate the running VM to another physical machine to prevent service interruption and ensure normal service running.
      • When most physical machines are underloaded, migrate and integrate VMs to reduce the number of physical machines and improve resource utilization.
      • When the hardware of a physical server becomes a bottleneck, such as the CPU, memory, and hard disk, replace the hardware with better performance or add devices. However, you cannot stop the VM or stop services.
      • Server software upgrade, such as virtualization platform upgrade, allows the VM to be live migrated from the old virtualization platform to the new one.

      Non-shared storage live migration can also be used in the following scenarios:

      • If a physical machine is faulty and the storage space is insufficient, migrate the running VM to another physical machine to prevent service interruption and ensure normal service running.
      • When the storage device of the physical machine is aged, the performance cannot support the current service data processing and becomes the bottleneck of the system performance. In this case, a storage device with higher performance needs to be used, but the VM cannot be shut down or stopped. The running VM needs to be migrated to a physical machine with better performance.

      Precautions and Restrictions

      • During the live migration, ensure that the network is in good condition. If the network is interrupted, live migration is suspended until the network is recovered. If the network connection times out, live migration fails.

      • During the migration, do not perform operations such as VM life cycle management and VM hardware device management.

      • During VM migration, ensure that the source and destination servers are not powered off or restarted unexpectedly. Otherwise, the live migration fails or the VM may be powered off.

      • Do not shut down, restart, or restore the VM during the migration. Otherwise, the live migration may fail. If you perform live migration when the VM is rebooted in ACPI mode, the VM will be shut down.

      • Only homogeneous live migration is supported. That is, the CPU models of the source and destination must be the same.

      • A VM can be successfully migrated across service network segments. However, network exceptions may occur after the VM is migrated to the destination. To prevent this problem, ensure that the service network segments to be migrated are the same.

      • If the number of vCPUs on the source VM is greater than that on the destination physical machine, the VM performance will be affected after the migration. Ensure that the number of vCPUs on the destination physical machine is greater than or equal to that on the source VM.

      Precautions for live migration of non-shared storage:

      • The source and destination cannot be the same disk image file. You need to perform special processing on such migration to prevent image damage caused by data overwriting.
      • Shared disks cannot be migrated. You need to perform foolproof operations on such migration.
      • The destination image supports only files and does not support raw devices. You need to perform foolproof processing on the migration of raw devices.
      • The size and number of disk images created on the destination must be the same as those on the source. Otherwise, the migration fails.
      • In hybrid migration scenarios, the disks to be migrated must not include shared and read-only disks.

      Live Migration Operations

      Prerequisites

      • Before live migration, ensure that the source and destination hosts can communicate with each other and have the same resource permissions. That is, the source and destination hosts can access the same storage and network resources.

      • Before VM live migration, perform a health check on the VM and ensure that the destination host has sufficient CPU, memory, and storage resources.

      (Optional) Setting Live Migration Parameters

      Before live migration, run the virsh migrate-setmaxdowntime command to specify the maximum tolerable downtime during VM live migration. This is an optional configuration item.

      For example, to set the maximum downtime of the VM named openEulerVM to 500 ms, run the following command:

      # virsh migrate-setmaxdowntime openEulerVM 500
      

      In addition, you can run the virsh migrate-setspeed command to limit the bandwidth occupied by VM live migration. This prevents VM live migration from occupying too much bandwidth and affecting other VMs or services on the host. This operation is also optional for live migration.

      For example, to set the live migration bandwidth of the VM named openEulerVM to 500 Mbit/s, run the following command:

      # virsh migrate-setspeed openEulerVM --bandwidth 500
      

      You can run the migrate-getspeed command to query the maximum bandwidth during VM live migration.

      # virsh migrate-getspeed openEulerVM
      500
      

      Live Migration Operations (Shared Storage Scenario)

      1. Check whether the storage device is shared.

        # virsh domblklist <VMInstanse>
         Target   Source
        --------------------------------------------
        sda      /dev/mapper/open_euleros_disk
        sdb      /mnt/nfs/images/openeuler-test.qcow2
        

        Run the virsh domblklist command to query the storage device information of the VM. For example, the preceding query result shows that the VM is configured with two storage devices: sda and sdb. Then, check whether the backend storage of the two devices is local storage or remote storage, if all storage devices are on the remote shared storage, the VM is a shared storage VM. Otherwise, the VM is a non-shared storage VM.

      2. Run the following command for VM live migration:

        For example, run the virsh migrate command to migrate VM openEulerVM to the destination host.

        # virsh migrate --live --unsafe openEulerVM qemu+ssh://<destination-host-ip>/system
        

        <destination-host-ip> indicates the IP address of the destination host. Before live migration, SSH authentication must be performed to obtain the source host management permission.

        In addition, the virsh migrate command has the --auto-converge and --timeout sub-options to ensure successful migration.

        Related sub-options:

        The --unsafe command forcibly performs live migration and skips the security check step.

        The --auto-converge command reduces the CPU frequency to ensure that the live migration process can be converged.

        The --timeout command specifies the live migration timeout period. If the live migration exceeds the specified period, the VM is forcibly suspended to reduce the live migration.

      3. After the live migration is complete, the VM is running properly on the destination host.

      Live Migration Operations (Non-Shared Storage Scenario)

      1. Query the VM storage device list to ensure that the VM uses non-shared storage.

        For example, the virsh domblklist command output shows that the VM to be migrated has a disk sda in qcow2 format. The XML configuration of sda is as follows:

            <disk type='file' device='disk'>
              <driver name='qemu' type='qcow2' cache='none' io='native'/>
              <source file='/mnt/sdb/openeuler/openEulerVM.qcow2'/>
              <target dev='sda' bus='scsi'/>
              <address type='drive' controller='0' bus='0' target='0' unit='0'/>
            </disk>
        

        Before live migration, create a virtual disk file in the same disk directory on the destination host. Ensure that the disk format and size are the same.

        # qemu-img create -f qcow2 /mnt/sdb/openeuler/openEulerVM.qcow2 20G
        
      2. Run the virsh migrate command on the source to perform live migration. During the migration, the storage is also migrated to the destination.

        # virsh migrate --live  --unsafe --copy-storage-all --migrate-disks sda \
        openEulerVM qemu+ssh://<dest-host-ip>/system
        
      3. After the live migration is complete, the command output indicates that the VM is running properly on the destination host and the storage device is migrated to the destination host.

      Bug Catching

      Buggy Content

      Bug Description

      Submit As Issue

      It's a little complicated....

      I'd like to ask someone.

      PR

      Just a small problem.

      I can fix it online!

      Bug Type
      Specifications and Common Mistakes

      ● Misspellings or punctuation mistakes;

      ● Incorrect links, empty cells, or wrong formats;

      ● Chinese characters in English context;

      ● Minor inconsistencies between the UI and descriptions;

      ● Low writing fluency that does not affect understanding;

      ● Incorrect version numbers, including software package names and version numbers on the UI.

      Usability

      ● Incorrect or missing key steps;

      ● Missing prerequisites or precautions;

      ● Ambiguous figures, tables, or texts;

      ● Unclear logic, such as missing classifications, items, and steps.

      Correctness

      ● Technical principles, function descriptions, or specifications inconsistent with those of the software;

      ● Incorrect schematic or architecture diagrams;

      ● Incorrect commands or command parameters;

      ● Incorrect code;

      ● Commands inconsistent with the functions;

      ● Wrong screenshots.

      Risk Warnings

      ● Lack of risk warnings for operations that may damage the system or important data.

      Content Compliance

      ● Contents that may violate applicable laws and regulations or geo-cultural context-sensitive words and expressions;

      ● Copyright infringement.

      How satisfied are you with this document

      Not satisfied at all
      Very satisfied
      Submit
      Click to create an issue. An issue template will be automatically generated based on your feedback.
      Bug Catching
      编组 3备份