This topic describes container migration and the external persistent storage pool
used when migrating containers between hosts in HPE Ezmeral Container Platform
deployments that implement EPIC.
In deployments of HPE Ezmeral Container Platform that implement EPIC, you may
specify an external persistent storage pool for tenants and/or AI/ML projects using the
Application Persistent Storage tab of the System
Settings screen (see Application Persistent Storage
Tab). Persistent storage exists either on hyper-converged resources or on a
remote storage resource that is referenced, but not managed, by HPE Ezmeral Container Platform. The persistent storage pool can then be used when
migrating virtual containers among hosts.
Persistent storage can be implemented across some or all hosts, as shown here:
You can create, expand, and shrink storage capacity just as you would any other
resource. This feature allows you to migrate containers between hosts
by preserving the following critical container folders (by default) for ongoing
use:
- /usr
- /opt
- /var
- /etc
- /home
The contents of other folders can be preserved during container migration by specifying
their names in the metadata JSON file when creating a new application image generated
using the App Workbench.
Use Cases
Big Data applications such as Hadoop and Spark offer robust high availability
capabilities; however, some enterprise customers have operational requirements that
call for moving virtual nodes/containers from one host to another. These
requirements include:
- A host crashes, and the containers that were running on that host must be
  redeployed on other working hosts with minimal downtime and no additional
  configuration required.
- One or more hosts need to be replaced for maintenance and/or as part of a
  server refresh cycle. In this scenario, the containers running on those hosts
  must be seamlessly moved to other hosts that are not being replaced, with
  minimal downtime to the applications running in the containers.
- An application running in a containerized cluster (e.g. Spark or Hadoop)
  cannot meet its SLA due to poor CPU, network, or storage performance. This
  resource contention (bottleneck) condition requires rebalancing virtual
  nodes/containers onto hosts with more available resources.
Enabling Container Migration
Once you have enabled persistent storage, the next step is to create one or more
flavors that include at least 20 GB of persistent storage, as described in Creating a New Flavor and Editing an Existing Flavor.
Containers created using a flavor with persistent storage enabled will be preserved
as described above. You may also assign a persistent storage quota to tenants, as
described in Tenant and Project Quotas.
Note:
Containers created using a flavor that does not have
persistent storage enabled will not benefit from this feature,
even if you later edit the flavor to enable persistent
storage.
Note: You may create a flavor that specifies persistent storage even if no
persistent storage has been defined in the Application Persistent
Storage tab; however, an error will be returned if you attempt to
use this flavor before enabling persistent storage.
Note: The persistent storage resource must have enough free capacity to
accommodate the sum of all tenant persistent storage quotas or to accommodate the
amount of persistent storage specified in all applicable flavors times the number of
containers that use the flavors, whichever is greater. Further, if you specify a
per-tenant persistent storage quota, then that quota must be large enough to
accommodate the flavor-defined persistent storage times the number of containers
using the applicable flavors.
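This sizing rule can be expressed as a quick calculation. The following sketch (illustrative only; the function and parameter names are not part of the product) computes the minimum free capacity the persistent storage resource must provide:

```python
# Illustrative sketch (not a product API): computes the minimum free
# capacity required of the persistent storage resource, per the rule above.

def required_capacity(tenant_quotas_gb, flavor_usage):
    """tenant_quotas_gb: per-tenant persistent storage quotas, in GB.
    flavor_usage: (per_container_storage_gb, container_count) pairs,
    one per flavor that specifies persistent storage."""
    quota_total = sum(tenant_quotas_gb)
    flavor_total = sum(gb * count for gb, count in flavor_usage)
    # The storage pool must cover whichever total is greater.
    return max(quota_total, flavor_total)

# Example: two tenants with 100 GB quotas each; a 20 GB flavor used by
# 12 containers and a 40 GB flavor used by 3 containers.
print(required_capacity([100, 100], [(20, 12), (40, 3)]))  # → 360
```

Here the flavor-driven demand (240 + 120 = 360 GB) exceeds the summed tenant quotas (200 GB), so the pool needs at least 360 GB free; note that in this situation the per-tenant quotas themselves would also need to be raised to cover the flavor-defined usage.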
There are two ways to use persistent storage to migrate a container:
- Worker Vacate (EPIC hosts only): If an EPIC Worker host
goes down and some or all of the containers on that host use persistent storage,
then you can click the Worker Vacate button (moving
dolly) for the affected host in the EPIC Hosts Installation
screen (see The
EPIC Hosts Installation Screen).
All jobs running on the affected containers will end, but the containers
themselves will be recovered as follows:
- No new containers will be placed on the affected host.
- The protected containers are removed from the affected host.
- Containers automatically migrate to one or more new hosts, provided that
there are sufficient available resources, including any applicable
placement constraints, as described in About Tags and Tenant/Project Tags
and Constraints.
- Node Migration: This use case applies to a scenario where
the hosts are functioning properly but are overburdened. In this case, the
Tenant Administrator or Platform Administrator can add new hosts as described in
EPIC Worker Installation
Overview. Containers can then be migrated to the new hosts on a
container-by-container basis, as described in Viewing and Migrating
Virtual Nodes. Placement constraints apply to this type of container
migration as well.
The following storage systems are supported for persistent storage:
- CEPH RBD
- NFS
- ScaleIO
- Local MapR
Migrating a container/virtual node has the following effects:
- The cluster to which the container belongs will be impacted because the
container will not be executing jobs during the migration process.
- Any jobs or ActionScripts running on a cluster with one or more migrating
containers will be lost and must be run again after the migration
completes.
- Any data residing in non-persistent storage directories of a container being
migrated will be lost.
- Any external hosts that have access to HPE Ezmeral Container Platform
will be unable to access the affected containers until the migration process
completes.
- Migrated containers maintain their configuration and IP addresses.