Before work can begin, all cluster functions must be checked. If any part of the cluster, for example the DRBD replication on acd-store, is not functioning properly, a failover will not work and a system-wide shutdown will be the result.
Information can be found here: System Health Check
Introduction
A controlled fail-over might be required when one or more VM-Hosts in the cluster must be taken down for maintenance.
Prerequisites
Your VM-Cluster must have at least two nodes across which the jtel cluster is built. Before shutting down, choose which VM-Host, and therefore which jtel VMs on it, will remain active. All activity on the jtel ACD is moved to the active VM-Host, which enables you to shut down the inactive VM-Host containing only standby machines from your redundant jtel ACD cluster.
A redundant jtel Cluster within your VM-Hosts may look like the example architecture from Shutdown/Startup Procedure - Large V3 - Redundant Databases + Load Balancing + Storage
Example
Explanation
To begin shutdown procedures, the active side is chosen. In the table below the active side is VM-Host 1. We will move it to VM-Host 2 to illustrate a controlled fail-over.
For the purposes of illustrating the procedure on this page, acd-telN and acd-jbN are replaced with multiple machines. jtel acd-chat and acd-api servers are also active, as well as a presence aggregator and IMAP and Exchange mail connectors.
Example System Parameters/Expected Normal Operation Status
This table describes the system's expected status under fully redundant operation.
VM-Host | machine | Active |
---|---|---|
1 | acd-tel1 | Yes |
2 | acd-tel2 | Yes |
1 | acd-jb1 | Yes |
1 | acd-jb2 | Yes |
1 | acd-jb3 | Yes |
2 | acd-jb4 | Yes |
2 | acd-jb5 | Yes |
2 | acd-jb6 | Yes |
1 | acd-chat1 | Yes |
2 | acd-chat2 | Yes |
1 | acd-api1 | Yes |
2 | acd-api2 | Yes |
1 | acd-dbs1/dbr1 | Yes |
2 | acd-dbs2/dbr2 | Yes |
1 | acd-dbm1 | Yes |
2 | acd-dbm2 | Standby |
1 | acd-lb1 | Yes |
2 | acd-lb2 | Standby |
1 | acd-store1 | Yes |
2 | acd-store2 | Standby |
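The later steps repeatedly need "all machines on VM-Host 1" or "all standby machines". As a minimal sketch, the table above can be kept as a small text inventory and queried with awk. The function names `jtel_inventory`, `machines_on_host` and `standby_machines` are made up for this illustration, and the data reflects normal operation only, not the state after a fail-over.

```sh
# Inventory copied from the table above: "<vm-host> <machine> <active>".
jtel_inventory() {
  cat <<'EOF'
1 acd-tel1 Yes
2 acd-tel2 Yes
1 acd-jb1 Yes
1 acd-jb2 Yes
1 acd-jb3 Yes
2 acd-jb4 Yes
2 acd-jb5 Yes
2 acd-jb6 Yes
1 acd-chat1 Yes
2 acd-chat2 Yes
1 acd-api1 Yes
2 acd-api2 Yes
1 acd-dbs1/dbr1 Yes
2 acd-dbs2/dbr2 Yes
1 acd-dbm1 Yes
2 acd-dbm2 Standby
1 acd-lb1 Yes
2 acd-lb2 Standby
1 acd-store1 Yes
2 acd-store2 Standby
EOF
}

# Print the machines that run on the given VM-Host number.
machines_on_host() {
  jtel_inventory | awk -v h="$1" '$1 == h { print $2 }'
}

# Print the machines that are in standby during normal operation.
standby_machines() {
  jtel_inventory | awk '$3 == "Standby" { print $2 }'
}
```

For example, `machines_on_host 1` lists the ten machines that are shut down later in this procedure.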
Shutdown
Various steps are required before the virtual machines can be shut down.
Step 1 - Backups
As a minimum, a backup of the database on acd-dbm1 is required. If your VM-Hosts have enough capacity, snapshots of the critical machines are also recommended, but not essential. The critical machines are all acd-dbm, acd-lb and acd-store machines.
Step 2 - Deactivate Monitoring
If monitoring is installed on your system, schedule a downtime for 2 hours for the machines on VM-Host 2. The downtime for the machines on VM-Host 1 should be set to however long they will be inactive.
Step 3 - Shutting down all software
Machine(s) | Stop what | It is installed if you are using | How to stop |
---|---|---|---|
acd-telN | 8-Server | ACD / IVR | X the cmd file starter window. For service installations, stop the robot5 service. |
 | Platform UDP Listener | ACD / IVR | X the cmd file starter window. For service installations, stop the jtel Platform UDP Listener service. |
 | Presence Aggregator | A PBX or presence connector which uses the presence aggregator | X the cmd file starter window. For service installations, stop the jtel Presence Aggregator service. |
 | Telephony Connector | A PBX which uses a custom connector | X the cmd file starter window. For service installations, stop the service, for example the jtel TAPI service or jtel Innovaphone Service. |
 | Exchange Connector | E-Mail with an Exchange or Office 365 Server | Stop the jtelEWSMailService service. |
 | IMAP Connector | E-Mail with an IMAP(S) Server | Stop the jtelIMAPMailService service. |
acd-jbN | Wildfly | Anything | `sudo systemctl stop wildfly` For installations not using systemctl: `sudo service wildfly stop` |
acd-chatN | Chat Server | CHAT | `sudo systemctl stop jtel-clientmessenger` For installations not using systemctl: `sudo service jtel-clientmessenger stop` |
acd-apiN | REST API | REST | `sudo systemctl stop jtelrest` For installations not using systemctl: `sudo service jtelrest stop` |
acd-dbmN | Platform UDP Listener | SOAP | `sudo systemctl stop jtel-listener` For installations not using systemctl: `sudo service jtel-listener stop` |
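For the Linux machines, the service names in the table can be captured in a small lookup so that the same mapping serves both stopping and starting. The following is only a sketch: the function names `jtel_service_for` and `jtel_service_command` are hypothetical, it covers the systemctl form only, and it prints the command instead of running it so that it can be reviewed first.

```sh
# Map a Linux machine name to the service listed for it in the table above.
# Returns non-zero for machines that are not covered (for example acd-telN).
jtel_service_for() {
  case "$1" in
    acd-jb*)   echo wildfly ;;
    acd-chat*) echo jtel-clientmessenger ;;
    acd-api*)  echo jtelrest ;;
    acd-dbm*)  echo jtel-listener ;;
    *)         return 1 ;;
  esac
}

# Print (do not run) the systemctl command for ACTION (stop|start) on MACHINE.
jtel_service_command() {
  local svc
  svc=$(jtel_service_for "$2") || return 1
  echo "sudo systemctl $1 $svc"
}
```

For example, `jtel_service_command stop acd-jb1` prints `sudo systemctl stop wildfly`.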
Step 4 - Check for active sessions
Checking for active database-sessions on the database master is a precautionary but necessary step to ensure that all services are stopped and no activity is present on the entire system.
The check can be done either on the HaProxy admin page on acd-lb1, by looking at the current session activity for acd-dbm1, or in the MySQL terminal on acd-dbm1, by typing the following command:

    # The expected output contains only replication status events
    SHOW PROCESSLIST \G

Only continue when no sessions are active.
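On a busy list it can help to count the remaining client sessions instead of reading every row. The sketch below assumes the process list is exported as tab-separated rows, for example with `mysql -N -B -e "SHOW PROCESSLIST"`, with the column order Id, User, Host, db, Command, Time, State, Info. The function name `count_client_sessions` is hypothetical, and the filter rules are an assumption about what counts as a replication or internal thread on your installation.

```sh
# Count sessions that are neither replication threads nor the check itself.
# Reads tab-separated SHOW PROCESSLIST rows on standard input.
count_client_sessions() {
  awk -F'\t' '
    $2 == "system user"      { next }  # replication applier and internal threads
    $2 == "event_scheduler"  { next }  # internal scheduler thread
    $5 ~ /^Binlog Dump/      { next }  # replicas reading the binary log
    $8 == "SHOW PROCESSLIST" { next }  # the session running this check
    { n++ }
    END { print n + 0 }
  '
}
```

A result of `0` corresponds to the condition above that no sessions are active. Any other value means some service is still connected.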
Step 5 - Manual Failover to acd-store2
To execute a manual failover, the pcs cluster node acd-store1 is temporarily set into standby, which will cause acd-store2 to become the primary node. After acd-store2 is primary, an unstandby command is executed. After this, acd-store2 will be the primary node and acd-store1 will be secondary.
Execute the following commands on acd-store2:

    # Set acd-store1 to standby
    pcs cluster standby acd-store1
    # Check if acd-store2 was switched to primary
    pcs status
    # Set acd-store1 back to unstandby
    pcs cluster unstandby acd-store1
    # Check if acd-store2 is primary and acd-store1 is secondary
    pcs status
Step 6 - Manual Failover to acd-lb2
To execute a manual failover, the pcs cluster node acd-lb1 is temporarily set into standby, which will cause acd-lb2 to become the primary node. After acd-lb2 is primary, an unstandby command is executed. After this, acd-lb2 will be the primary node and acd-lb1 will be secondary.
Execute the following commands on acd-lb1:

    # Set acd-lb1 to standby
    pcs cluster standby acd-lb1
    # Check if acd-lb2 was switched to primary
    pcs status
    # Set acd-lb1 back to unstandby
    pcs cluster unstandby acd-lb1
    # Check if acd-lb2 is primary and acd-lb1 is secondary
    pcs status
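Steps 5 and 6 repeat the same pattern with different node names: change the standby state of a node, then inspect the cluster status. A small wrapper that takes the node name as a parameter may reduce the risk of mixing up node names between the two steps. This is a sketch only: `pcs_step` and the `PCS_CMD` override are hypothetical names introduced here, and the override exists so that the sequence can be dry-run with `echo`.

```sh
# Run one half of a manual failover: change the standby state of NODE,
# then show the cluster status so the operator can verify the result.
# Usage: pcs_step standby|unstandby NODE
pcs_step() {
  case "$1" in
    standby|unstandby) ;;
    *) echo "usage: pcs_step standby|unstandby NODE" >&2; return 2 ;;
  esac
  # Unquoted on purpose so that PCS_CMD="echo pcs" splits into two words.
  ${PCS_CMD:-pcs} cluster "$1" "$2" && ${PCS_CMD:-pcs} status
}
```

The two halves are deliberately separate calls: run `pcs_step standby acd-lb1`, confirm in the status output that acd-lb2 has become primary, and only then run `pcs_step unstandby acd-lb1`.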
Step 7 - Configure HaProxy
Access the HaProxy admin page for both acd-lb1 and acd-lb2. The most important machine to configure is acd-lb2, but configuring both ensures that the settings are identical if a fail-over happens at any point during the procedure.
Set the status "MAINT" for all machines on VM-Host 1.
When the machine is booted and the HaProxy on acd-lb1 starts again, all "MAINT" statuses will be reset to default. No machine will be in the status "MAINT".
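Where the HAProxy runtime API is available, the same state can also be set from a shell instead of the admin page. This goes beyond the procedure described above and rests on several assumptions: a stats socket with admin level must be configured in haproxy.cfg, and the socket path, the backend name `jtel_web` and the helper name `haproxy_maint_commands` are placeholders, not values taken from a jtel installation. The sketch only generates the `set server ... state maint` commands.

```sh
# Print one HAProxy runtime API command per server in BACKEND.
# Usage: haproxy_maint_commands BACKEND SERVER...
haproxy_maint_commands() {
  local backend=$1
  shift
  local server
  for server in "$@"; do
    echo "set server $backend/$server state maint"
  done
}

# Possible use, sending one command per connection to an assumed socket path:
#   haproxy_maint_commands jtel_web acd-jb1 acd-jb2 acd-jb3 |
#     while read -r cmd; do echo "$cmd" | socat stdio /var/run/haproxy.sock; done
```

As with the admin page, a state set this way is a runtime setting and is not expected to survive a restart of HaProxy.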
Step 8 - Check the AcdGroupDistribute Daemon
If the Daemon AcdGroupDistribute.r5 is running on acd-tel1, it must be started on acd-tel2 to ensure that calls to acd-groups will still be routed by the routing-algorithm.
Step 9 - Shutdown the virtual machines on VM-Host 1
The correct shutdown order must still be maintained. The table below lists the machines in that order.
First shut down acd-tel1, acd-jb1-3, acd-chat1 and acd-api1 in no particular order. You do not have to wait until these machines are down before shutting down acd-dbs1/dbr1. Wait until acd-dbs1/dbr1 is down before shutting down acd-dbm1. Wait until acd-dbm1 is down before shutting down acd-lb1. Wait until acd-lb1 is down before shutting down acd-store1.
VM-Host | machine | Active |
---|---|---|
1 | acd-tel1 | Yes |
1 | acd-jb1 | Yes |
1 | acd-jb2 | Yes |
1 | acd-jb3 | Yes |
1 | acd-chat1 | Yes |
1 | acd-api1 | Yes |
1 | acd-dbs1/dbr1 | Yes |
1 | acd-dbm1 | Yes |
1 | acd-lb1 | Yes |
1 | acd-store1 | Yes |
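The "wait until ... is down" conditions can be scripted with a simple polling loop. The sketch below assumes that the machines answer ICMP ping while they are running, which is not guaranteed (a firewall may block it), so the hypervisor console remains the authoritative source. `wait_until_down` and the `PROBE_CMD` override are hypothetical names introduced for this example.

```sh
# Poll HOST until the probe fails, which is taken to mean the machine is down.
# Usage: wait_until_down HOST [TRIES] [INTERVAL_SECONDS]
# Returns 0 once the probe fails, 1 if HOST still answers after TRIES probes.
wait_until_down() {
  local host=$1
  local tries=${2:-60}
  local interval=${3:-5}
  while [ "$tries" -gt 0 ]; do
    # Default probe: one ping with a 2 second timeout (Linux iputils syntax).
    ${PROBE_CMD:-ping -c 1 -W 2} "$host" >/dev/null 2>&1 || return 0
    tries=$((tries - 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `wait_until_down acd-dbm1 && echo "acd-lb1 can be shut down now"` mirrors the rule that acd-lb1 is only shut down after acd-dbm1 is down.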
Step 10 - Start Software
Machine(s) | Start what | It is installed if you are using | How to start |
---|---|---|---|
acd-tel2 | 8-Server | ACD / IVR | In Explorer, go to shell:startup and start the link to startup_launcher.cmd. For service installations, start the robot5 service. |
 | Platform UDP Listener | ACD / IVR | In Explorer, go to shell:startup and start the link to startListener.bat. For service installations, start the jtel Platform UDP Listener service. |
 | Presence Aggregator | A PBX or presence connector which uses the presence aggregator | In Explorer, go to shell:startup and start the link to start-presence-aggregator.cmd. For service installations, start the jtel Presence Aggregator service. |
 | Telephony Connector | A PBX which uses a custom connector | In Explorer, go to shell:startup and start the link to JTELInnovaphonePBXService.exe or jtelTAPIMonitorService.exe. For service installations, start the service, for example the jtel TAPI service or jtel Innovaphone Service. |
 | Exchange Connector | E-Mail with an Exchange or Office 365 Server | Start the jtelEWSMailService service. |
 | IMAP Connector | E-Mail with an IMAP(S) Server | Start the jtelIMAPMailService service. |
acd-jb4-6 | Wildfly | Anything | `sudo systemctl start wildfly` For installations not using systemctl: `sudo service wildfly start` |
acd-chat2 | Chat Server | CHAT | `sudo systemctl start jtel-clientmessenger` For installations not using systemctl: `sudo service jtel-clientmessenger start` |
acd-api2 | REST API | REST | `sudo systemctl start jtelrest` For installations not using systemctl: `sudo service jtelrest start` |
acd-dbm2 | Platform UDP Listener | SOAP | `sudo systemctl start jtel-listener` For installations not using systemctl: `sudo service jtel-listener start` |
Step 11 - Ensure system functionality
Information can be found here: System Health Check
If all tests are successful, the system is now fully operational and running only on VM-Host 2.
Startup
Table
VM-Host 1
Step | Shutdown | Startup |
---|---|---|
1 | acd-telN | acd-store2 |
2 | acd-jbN | acd-store1 |
3 | acd-dbs1/dbr1 | acd-lb2 |
4 | acd-dbs2/dbr2 | acd-lb1 |
5 | acd-dbm1 | acd-dbm2 |
6 | acd-dbm2 | acd-dbm1 |
7 | acd-lb1 | acd-dbs2/dbr2 |
8 | acd-lb2 | acd-dbs1/dbr1 |
9 | acd-store1 | acd-jbN |
10 | acd-store2 | acd-telN |
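The startup column of the table is the shutdown column read from bottom to top. If the shutdown order is kept as a list, the startup order can therefore be derived from it instead of being maintained separately. A minimal sketch, in which `reverse_lines` is a hypothetical helper name:

```sh
# Print the lines read from standard input in reverse order.
reverse_lines() {
  awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }'
}

# Shutdown order from the table above, one machine per line.
shutdown_order() {
  printf '%s\n' acd-telN acd-jbN acd-dbs1/dbr1 acd-dbs2/dbr2 \
    acd-dbm1 acd-dbm2 acd-lb1 acd-lb2 acd-store1 acd-store2
}
```

Running `shutdown_order | reverse_lines` prints the list starting with acd-store2 and ending with acd-telN, which matches the startup column.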