The Controllers tab of the Controllers/Upgrades screen (see
The Controllers/Upgrades Screen) allows
the Platform Administrator to enable or disable platform High Availability
protection, which protects against the failure of a single host (Primary Controller,
Shadow Controller, or Arbiter).
This topic describes how to enable and disable HA protection.
Enabling HA Protection
Note:
HPE recommends enabling platform High Availability before
adding a large number of Kubernetes or EPIC hosts.
To enable platform High Availability:
- Ensure that all of the requirements listed in High Availability Requirements are
met.
- If you have any existing virtual clusters, then those clusters must be deleted
and then recreated after enabling High Availability.
- Enter site lockdown as described in Lockdown Mode.
- Check the Enable HA check box.
- Proceed as follows:
- If the Controller and Shadow Controller hosts are
in different subnets, then you must leave the Cluster IP field
blank. By leaving both the Cluster IP and Cluster Name
fields blank, you may access the web interface by navigating to
http://<gateway_ip> or https://<gateway_ip>, as appropriate, where <gateway_ip> is the IP address of a Gateway
host. See Gateway Hosts.
- If the Controller and Shadow Controller hosts are
on the same subnet, then you can enter an available IP address to use as
the cluster IP address in the Cluster IP field. This IP address
must be in the same subnet as the Controller host and cannot be in use
by any other resource. If you do not supply a cluster IP address, then
you may access the web interface by navigating to
http://<gateway_ip> or https://<gateway_ip>, as appropriate, if you have defined
a cluster name in the Cluster Name field. This cluster name must
be mapped to the cluster IP address via a user-accessible DNS
server.
Note: For the same-subnet scenario, the
external switch connecting the hosts to the network must
support gratuitous ARP in order for the cluster IP
address to function correctly.
Note: After enabling HA, you may use
either the cluster IP address or cluster name to log
into the web interface, because this will automatically
connect you to the Controller host (during normal
operation) or the Shadow Controller host (when HA
protection has been triggered by Controller host
failure). If the Controller host fails, then you will
not be able to access the web interface using the IP
address of that
host.
- Select the hosts to use as the Shadow Controller and Arbiter using the
Shadow Controller and Arbiter Node pull-down menus. If the
deployment has three hosts, then the host remaining after assigning the Shadow
Controller host will be the Arbiter, and vice versa. If there are more than
three hosts, then you may select one of the remaining Worker hosts to be the
Arbiter after selecting the Shadow Controller, and vice versa. You cannot remove
or modify the Shadow Controller or Arbiter host after enabling High Availability
protection.
Note: Hosts that did not have internal storage defined when they were
added, as described in Step 4: Select Hard Drives, cannot be used as a
Shadow Controller or Arbiter.
- Click Submit. The Controllers tab displays the message
HA Setup in progress. This process may take up to 30 minutes to
complete depending on a number of factors. If desired, you may click the
Details button to open the HA Setup Details popup, which
provides detailed information about the HA setup process.
- A message appears in the upper right corner of the web interface once the
process completes, informing you that HPE Ezmeral Container Platform is now
running in High Availability mode and reminding you to begin using the cluster
IP address or name that you entered in Step 4 to log in to the web
interface going forward. Clicking the Click here to migrate to Cluster
Name link in this message logs you out of the web interface and returns
you to the Login screen using the cluster IP address.
- Log in to the web interface using your normal username and password.
- Exit site lockdown as described in Lockdown Mode.
- Create any needed virtual cluster(s).
If enabling High Availability fails, then the fields in the
Controllers tab will reappear, and the deployment will continue
running in its previous non-High Availability state. Please contact HPE Technical
Support for assistance.
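The Cluster IP and Cluster Name rules in the procedure above can be checked before you open the Controllers tab. The following is a minimal sketch, not part of the product: the function names, the example addresses, and the idea of passing in a list of addresses known to be in use are all assumptions made for illustration. It relies only on the Python standard library and does not probe the network, so it cannot prove that an address is actually free.

```python
import ipaddress
import socket


def same_subnet(controller_cidr, shadow_cidr):
    """True when both interfaces (e.g. "10.0.1.10/24") are on one subnet."""
    return (ipaddress.ip_interface(controller_cidr).network
            == ipaddress.ip_interface(shadow_cidr).network)


def cluster_ip_problems(cluster_ip, controller_cidr, in_use=()):
    """Return reasons why cluster_ip is unsuitable; an empty list means OK.

    in_use is a hypothetical list of addresses you already know are taken.
    The Controller's own address is always treated as taken.
    """
    controller = ipaddress.ip_interface(controller_cidr)
    candidate = ipaddress.ip_address(cluster_ip)
    taken = {ipaddress.ip_address(a) for a in in_use} | {controller.ip}
    problems = []
    if candidate not in controller.network:
        problems.append(
            f"{candidate} is outside the Controller subnet {controller.network}")
    if candidate in taken:
        problems.append(f"{candidate} is already in use")
    return problems


def name_maps_to(cluster_name, cluster_ip, resolve=socket.gethostbyname):
    """True when DNS resolves cluster_name to cluster_ip (IPv4 only).

    resolve is injectable so the check can be exercised without real DNS.
    """
    try:
        return resolve(cluster_name) == cluster_ip
    except OSError:
        return False
```

If `same_subnet()` returns False for the Controller and Shadow Controller interfaces, the Cluster IP field must be left blank. Otherwise, an empty list from `cluster_ip_problems()` suggests the candidate address satisfies the subnet rule, and `name_maps_to()` can confirm that the cluster name resolves to that address from a user workstation.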
Note:
If you installed the Network Manager service while
installing the base operating system on the hosts, then
this service will stop because it conflicts with the High
Availability monitoring services.
Note:
If you are using ISSI, then be sure to use a mechanism
(e.g. Puppet, Chef, etc.) that will replicate the ISSI
configuration to the Shadow Controller host. See
Implementation-Specific Script Interface.
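The host-selection rules for the Shadow Controller and Arbiter roles can be modeled as shown below. This is an illustrative sketch only: the `resolve_roles` function, the host names, and the dictionary layout are hypothetical and do not correspond to any product API. It assumes the Primary Controller is excluded from the input and that each remaining host is flagged according to whether internal storage was defined when the host was added.

```python
def resolve_roles(hosts, shadow=None, arbiter=None):
    """Work out the Shadow Controller and Arbiter assignment.

    hosts maps each non-Controller host name to True when internal
    storage was defined for it (only those hosts are eligible).
    """
    eligible = {name for name, has_storage in hosts.items() if has_storage}
    if len(eligible) < 2:
        raise ValueError("two eligible hosts are needed besides the Controller")
    for role, host in (("Shadow Controller", shadow), ("Arbiter", arbiter)):
        if host is not None and host not in eligible:
            raise ValueError(f"{host} cannot be used as the {role}")
    if shadow is not None and shadow == arbiter:
        raise ValueError("the two roles must be on different hosts")

    if len(hosts) == 2:
        # Three-host deployment: picking one role decides the other.
        if shadow is None and arbiter is None:
            raise ValueError("select at least one role")
        if shadow is None:
            (shadow,) = eligible - {arbiter}
        elif arbiter is None:
            (arbiter,) = eligible - {shadow}

    if shadow is None or arbiter is None:
        # More than three hosts: both roles must be chosen explicitly.
        raise ValueError("select both the Shadow Controller and the Arbiter")
    return {"shadow": shadow, "arbiter": arbiter}
```

For example, `resolve_roles({"host-b": True, "host-c": True}, shadow="host-b")` assigns `host-c` as the Arbiter, mirroring the three-host behavior of the pull-down menus. Because the roles cannot be changed after HA is enabled, it is worth settling this assignment before clicking Submit.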
Disabling HA Protection
If platform HA is enabled, then you may disable it as follows:
- Enter site lockdown as described in Lockdown Mode.
- Clear the Enable HA check box. A confirmation popup
appears. Click OK to confirm.
- Click Submit to begin disabling HA protection. You will be automatically
logged out and then migrated to the Controller IP address.
- Log in to the web interface via the Controller IP address using your
normal username and password. The HA tab will display the message HA
disable in progress. Disabling HA may require up to 30 minutes to
complete. If desired, you may click the Details button to open the HPE
HA Disable Details popup, which provides additional information about
the HA disable process.
- After the HA disable process finishes, exit site lockdown as described in Lockdown Mode.
Disabling platform HA protection has the following effects:
- The Shadow Controller and Arbiter hosts become Worker hosts.
- The cluster IP address is disabled; you must log in using the Controller IP
address.
- HPE Container Platform is no longer protected against Controller
host failure.
Note:
You may change the hosts used for the Shadow Controller and
Arbiter roles by disabling HA protection and then re-enabling it
and using the updated IP addresses/hostnames.
Note:
Please contact HPE Technical Support if disabling HA
fails.
The outcome of a request to disable platform HA depends on whether all hosts are online:
All Hosts Online
If all hosts are online, then the disable process begins as soon as
the Platform Administrator submits this task, and the deployment will be placed into
a non-HA state.
Some Hosts Offline
If one or more host(s) are offline, then the disable HA task will be rejected,
and the Platform Administrator will be notified that one or more host(s) are
offline. This can happen for one of several reasons:
- Worker offline: If a Worker host is offline, then that host will retain
its HA configuration when it comes back up, which will not be consistent with
the platform status. You can simply delete the Worker host as described in Decommissioning/Deleting an EPIC
Worker and then repeat the HA disable process.
- Shadow Controller failure: Normally, disabling HA protection will delete
the database replicas on the Shadow Controller and Arbiter hosts. If the Shadow
Controller host is offline due to a hardware failure, then database deletion
will fail. When the faulty hardware is replaced and HA protection is re-enabled,
the original Arbiter host should be re-designated as the Arbiter, and the new
Shadow Controller host must use the same IP address as the previous Shadow
Controller host.
- Arbiter failure: Normally, disabling HA protection will delete the
database replicas on the Shadow Controller and Arbiter hosts. If the Arbiter
host is offline due to a hardware failure, then database deletion will fail.
When the faulty hardware is replaced and HA protection is re-enabled, the
original Shadow Controller host should be re-designated as the Shadow
Controller, and the new Arbiter host must use the same IP address as the
previous Arbiter host.
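Taken together, the two hardware-failure cases above mean that both roles end up on the same IP addresses as before: the surviving host keeps its role, and the replacement hardware reuses the failed host's address. A minimal sketch of that check follows. The function name, role keys, and addresses are assumptions made for illustration and are not product identifiers.

```python
ROLES = ("shadow", "arbiter")


def reenable_violations(previous, proposed, failed_role):
    """Compare a proposed re-enable plan against the previous HA layout.

    previous and proposed map "shadow" and "arbiter" to IP address strings.
    failed_role names the role whose hardware was replaced.
    Returns a list of problems; an empty list means the plan is consistent.
    """
    if failed_role not in ROLES:
        raise ValueError(f"failed_role must be one of {ROLES}")
    problems = []
    for role in ROLES:
        if proposed[role] == previous[role]:
            continue
        if role == failed_role:
            problems.append(
                f"replacement {role} host must reuse {previous[role]}")
        else:
            problems.append(
                f"original {role} host ({previous[role]}) should keep its role")
    return problems
```

An empty result indicates that the plan matches the guidance in the bullets above. Any entry in the returned list names the role that deviates from it.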
Please contact HPE Technical Support for assistance if you need to disable HA
protection while the Shadow Controller or Arbiter host is down. The Support team
will perform some manual operations to allow the management service to ignore the
offline host and allow you to proceed with disabling HA Protection. When you replace
the hardware, be sure to follow the instructions listed in the previous bullets, as
applicable.
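The accept-or-reject behavior described in this section can be summarized as a small decision function. This is a sketch under stated assumptions: the role labels, the `evaluate_disable_request` name, and the input format are hypothetical, and the guidance strings paraphrase this topic rather than reproduce any message shown by the web interface.

```python
# Follow-up guidance per offline role, paraphrased from this topic.
GUIDANCE = {
    "worker": "delete the Worker host, then repeat the HA disable process",
    "shadow": "contact HPE Technical Support before proceeding",
    "arbiter": "contact HPE Technical Support before proceeding",
}


def evaluate_disable_request(hosts):
    """Decide whether a disable-HA request would be accepted.

    hosts maps a host name to a (role, online) pair, where role is
    "worker", "shadow", or "arbiter" and online is a boolean.
    """
    offline = {name: role
               for name, (role, online) in hosts.items()
               if not online}
    if not offline:
        # All hosts online: the task is accepted and HA is disabled.
        return {"accepted": True, "actions": {}}
    # One or more hosts offline: the task is rejected.
    actions = {name: GUIDANCE.get(role, "not covered by this sketch")
               for name, role in offline.items()}
    return {"accepted": False, "actions": actions}
```

For instance, a deployment whose only offline host is a Worker yields `accepted: False` together with the advice to delete that Worker and retry, whereas an offline Shadow Controller or Arbiter points you to HPE Technical Support.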