High Availability Configuration¶
Overview¶
Morpheus supports a wide range of deployment architectures. It can start as a simple single-machine instance where all services run on the same host, or it can be split into individual services per machine and configured for high availability, either within one region or across regions. High availability grows more complicated depending on the configuration you choose; this article covers the basic concepts of the Morpheus HA architecture, which can be applied to a wide range of configurations.
There are four primary tiers of services within the Morpheus appliance: the App Tier, Transactional Database Tier, Non-Transactional Database Tier, and Message Tier. Each tier has its own recommendations for high availability deployments, which are covered below.

Important
This is a sample configuration only. Customer configurations and requirements will vary.
Transactional Database Tier¶
The Transactional Database tier usually consists of a MySQL-compatible database. A lockable clustered configuration is recommended (currently Percona XtraDB Cluster in Permissive Mode is the most recommended option). Many documents online cover configuring and setting up an XtraDB Cluster, but at its simplest it can be laid out in a multi-master configuration. Some nodes can be set up with replication delay and others with none. It is common practice to have no replication delay within the same region and to allow some replication delay across regions. This does increase the risk of job runs overlapping between the two regions; however, the concurrent operations typically self-correct, so this is usually a non-issue.
Non-Transactional Database Tier¶
The Non-Transactional tier consists of an Elasticsearch (version 5.6.10) cluster. Elasticsearch is used for log aggregation data and temporal aggregation data (essentially stats, metrics, and logs), which allows high write throughput at scale. Elasticsearch is a clustered database, meaning all nodes, regardless of region, need to be connected to each other over what Elasticsearch calls the “Transport” protocol. It is fairly simple to set up because all nodes are identical. It is also a Java-based system and requires a sizable amount of memory for larger data sets: 8 GB is recommended, and the cluster can be scaled horizontally by adding nodes or vertically by adding resources to each node.
Messaging Tier¶
The Messaging tier is an AMQP-based tier that also uses the STOMP protocol (for agent communication). The recommended model is to use RabbitMQ for queue services. RabbitMQ is also a cluster-based queuing system and needs at least 3 instances for HA configurations, because of the elections RabbitMQ performs in failover scenarios. For a cross-region HA RabbitMQ cluster, at least 3 RabbitMQ queue clusters per region are recommended. To handle HA, a load balancer is typically placed between the front-end application servers and the RabbitMQ cluster to handle cross-host connections. The ports that must be forwarded for a RabbitMQ cluster are 5672 (AMQP) and 61613 (STOMP). A RabbitMQ cluster can run on smaller-memory machines depending on how frequently large request bursts occur; 4–8 GB of memory is recommended to start.
Application Tier¶
The application tier is easily installed with the same Debian or yum repository package that Morpheus is normally distributed with. Advanced configuration allows the additional tiers to be skipped, leaving only the “stateless” services that need to run. These stateless services include Nginx, Tomcat, and Redis (to be phased out at a later date). These machines should also have at least 8 GB of memory. They can be configured across all regions and placed behind a central load balancer or a geo-based load balancer. They typically connect to all other tiers, as none of the other tiers talk to each other except through the central application tier. One final piece of setting up the Application tier is shared storage, which is necessary for maintaining things like deployment archives, virtual image catalogs, backups, etc. These can be externalized to an object storage service such as Amazon S3 or OpenStack Swift. If those options are not used, a simple NFS cluster can also handle the shared storage structure.

Database Tier¶
Installation and configuration of Percona XtraDB Cluster on CentOS/RHEL 7
Important
This is a sample configuration only. Customer configurations and requirements will vary.
Requirements¶
Percona requires the following ports for the cluster nodes. Please create the appropriate firewall rules on your Percona nodes.
- 3306
- 4444
- 4567
- 4568
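As a minimal sketch of the firewall rules for the ports above, the following assumes the nodes use firewalld (the CentOS/RHEL 7 default). The function name `pxc_firewall_commands` is hypothetical, and it only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Ports listed above for the Percona XtraDB Cluster nodes.
PXC_PORTS="3306 4444 4567 4568"

# Print (do not run) the firewalld commands that would open each port.
# Assumption: firewalld is the active firewall on the node.
pxc_firewall_commands() {
  local port
  for port in $PXC_PORTS; do
    echo "sudo firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "sudo firewall-cmd --reload"
}

pxc_firewall_commands
```

Printing rather than executing keeps the sketch safe to try on any machine; paste the output into a root shell on each Percona node once it matches your environment.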
Percona also recommends setting the SELinux policy to permissive. You can temporarily set the mode to permissive by running
sudo setenforce 0
To make the change permanent, edit the SELinux configuration file, which can be found at /etc/selinux/config
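The permanent change can be sketched as a one-line edit of the `SELINUX=` entry. The helper name `set_selinux_permissive` is hypothetical, and the example runs against a scratch copy rather than the real /etc/selinux/config, which requires root.

```shell
#!/usr/bin/env bash
# Rewrite the SELINUX= line of an SELinux config file to "permissive".
# The file path is a parameter so the edit can be tried on a copy first.
set_selinux_permissive() {
  local config_file="$1"
  sed -i 's/^SELINUX=.*/SELINUX=permissive/' "$config_file"
}

# Demonstration on a scratch file; the SELINUXTYPE line is left untouched.
demo_file="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_file"
set_selinux_permissive "$demo_file"
cat "$demo_file"
rm -f "$demo_file"
```

On a real node the call would be `sudo sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config`.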
Add Percona Repo¶
Add the Percona repo to your Linux distribution.
sudo yum install http://www.percona.com/downloads/percona-release/redhat/0.1-4/percona-release-0.1-4.noarch.rpm
Check the repo by running the below command.
sudo yum list | grep percona
The below commands will clean the repos and update the server.
sudo yum clean all
sudo yum update -y
Installing Percona XtraDB Cluster¶
The below command will install the Percona XtraDB Cluster software and its dependencies.
sudo yum install Percona-XtraDB-Cluster-57
Note
During the installation you will receive the below message. Accept the Percona PGP key to install the software.
retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Percona
Importing GPG key 0xCD2EFD2A:
 Userid     : "Percona MySQL Development Team <mysql-dev@percona.com>"
 Fingerprint: 430b df5c 56e7 c94e 848e e60c 1c4c bdcd cd2e fd2a
 Package    : percona-release-0.1-4.noarch (installed)
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-Percona
Is this ok [y/N]: y
Next we need to enable the mysql service so that it starts at boot.
sudo systemctl enable mysql
Next we need to start mysql
sudo systemctl start mysql
Next we will log in to the MySQL server and set a new password. To get the temporary root MySQL password, run the command below. It prints the password to the screen. Copy the password.
sudo grep 'temporary password' /var/log/mysqld.log
Log in to MySQL
mysql -u root -p
password: `enter password copied above`
Change the root user password for the MySQL server
ALTER USER 'root'@'localhost' IDENTIFIED BY 'MySuperSecurePasswordhere';
Create the sstuser user and grant the permissions.
mysql> CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'M0rpheus17';
Note
The sstuser and password will be used in the /etc/my.cnf configuration.
mysql> GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
mysql> FLUSH PRIVILEGES;
Exit mysql then stop the mysql services:
mysql> exit
Bye
$ sudo systemctl stop mysql.service
Now install the Percona software on to the other nodes using the same steps.
Once the service is stopped on all nodes move onto the next step.
Add [mysqld] to my.cnf in /etc/¶
Copy the contents below to /etc/my.cnf. The wsrep_node_name and wsrep_node_address values need to be unique on each of the nodes. The first node does not require the gcomm value to be set.
$ sudo vi /etc/my.cnf
[mysqld]
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=popeye
# Leave gcomm:// blank on the master (first) node. The other nodes require this field:
# enter the IP address of the primary node first, then the remaining nodes,
# separated by commas, like this: 10.30.20.196,10.30.20.197,10.30.20.198
wsrep_cluster_address=gcomm://
wsrep_node_name=morpheus-node01
wsrep_node_address=10.30.20.57
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:M0rpheus17
pxc_strict_mode=PERMISSIVE
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
Save /etc/my.cnf
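Because the wsrep_cluster_address value differs between the first node and the others, a small helper can build it from a list of node IPs. This is only a sketch: the function name `gcomm_address` is hypothetical, and the IP addresses are the examples used in the configuration comment.

```shell
#!/usr/bin/env bash
# Build a wsrep_cluster_address value from a list of node IPs.
# With no arguments it returns the empty form used on the first node.
gcomm_address() {
  local IFS=','
  echo "gcomm://$*"
}

gcomm_address                                         # first (bootstrap) node
gcomm_address 10.30.20.196 10.30.20.197 10.30.20.198  # remaining nodes
```

Listing the primary node's address first matches the ordering described in the configuration comment.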
Bootstrapping the first Node in the cluster¶
Important
Ensure mysql.service is stopped prior to bootstrap.
To bootstrap the first node in the cluster run the below command.
systemctl start mysql@bootstrap.service
Note
The mysql service will start during the bootstrap.
To verify the bootstrap, on the master node log in to mysql and run show status like 'wsrep%';
# mysql -u root -p
mysql> show status like 'wsrep%';
+----------------------------------+--------------------------------------+
| Variable_name                    | Value                                |
+----------------------------------+--------------------------------------+
| wsrep_local_state_uuid           | 591179cb-a98e-11e7-b9aa-07df8a228fe9 |
| wsrep_protocol_version           | 7                                    |
| wsrep_last_committed             | 1                                    |
| wsrep_replicated                 | 0                                    |
| wsrep_replicated_bytes           | 0                                    |
| wsrep_repl_keys                  | 0                                    |
| wsrep_repl_keys_bytes            | 0                                    |
| wsrep_repl_data_bytes            | 0                                    |
| wsrep_repl_other_bytes           | 0                                    |
| wsrep_received                   | 2                                    |
| wsrep_received_bytes             | 141                                  |
| wsrep_local_commits              | 0                                    |
| wsrep_local_cert_failures        | 0                                    |
| wsrep_local_replays              | 0                                    |
| wsrep_local_send_queue           | 0                                    |
| wsrep_local_send_queue_max       | 1                                    |
| wsrep_local_send_queue_min       | 0                                    |
| wsrep_local_send_queue_avg       | 0.000000                             |
| wsrep_local_recv_queue           | 0                                    |
| wsrep_local_recv_queue_max       | 2                                    |
| wsrep_local_recv_queue_min       | 0                                    |
| wsrep_local_recv_queue_avg       | 0.500000                             |
| wsrep_local_cached_downto        | 0                                    |
| wsrep_flow_control_paused_ns     | 0                                    |
| wsrep_flow_control_paused        | 0.000000                             |
| wsrep_flow_control_sent          | 0                                    |
| wsrep_flow_control_recv          | 0                                    |
| wsrep_flow_control_interval      | [ 100, 100 ]                         |
| wsrep_flow_control_interval_low  | 100                                  |
| wsrep_flow_control_interval_high | 100                                  |
| wsrep_flow_control_status        | OFF                                  |
| wsrep_cert_deps_distance         | 0.000000                             |
| wsrep_apply_oooe                 | 0.000000                             |
| wsrep_apply_oool                 | 0.000000                             |
| wsrep_apply_window               | 0.000000                             |
| wsrep_commit_oooe                | 0.000000                             |
| wsrep_commit_oool                | 0.000000                             |
| wsrep_commit_window              | 0.000000                             |
| wsrep_local_state                | 4                                    |
| wsrep_local_state_comment        | Synced                               |
| wsrep_cert_index_size            | 0                                    |
| wsrep_cert_bucket_count          | 22                                   |
| wsrep_gcache_pool_size           | 1320                                 |
| wsrep_causal_reads               | 0                                    |
| wsrep_cert_interval              | 0.000000                             |
| wsrep_ist_receive_status         |                                      |
| wsrep_ist_receive_seqno_start    | 0                                    |
| wsrep_ist_receive_seqno_current  | 0                                    |
| wsrep_ist_receive_seqno_end      | 0                                    |
| wsrep_incoming_addresses         | 10.30.20.196:3306                    |
| wsrep_desync_count               | 0                                    |
| wsrep_evs_delayed                |                                      |
| wsrep_evs_evict_list             |                                      |
| wsrep_evs_repl_latency           | 0/0/0/0/0                            |
| wsrep_evs_state                  | OPERATIONAL                          |
| wsrep_gcomm_uuid                 | 07c8c8fe-a998-11e7-883e-06949cfe5af3 |
| wsrep_cluster_conf_id            | 1                                    |
| wsrep_cluster_size               | 1                                    |
| wsrep_cluster_state_uuid         | 591179cb-a98e-11e7-b9aa-07df8a228fe9 |
| wsrep_cluster_status             | Primary                              |
| wsrep_connected                  | ON                                   |
| wsrep_local_bf_aborts            | 0                                    |
| wsrep_local_index                | 0                                    |
| wsrep_provider_name              | Galera                               |
| wsrep_provider_vendor            | Codership Oy <info@codership.com>    |
| wsrep_provider_version           | 3.22(r8678538)                       |
| wsrep_ready                      | ON                                   |
+----------------------------------+--------------------------------------+
67 rows in set (0.01 sec)
A table will appear with the status and rows.
Next create the database you will be using with Morpheus.
mysql> CREATE DATABASE morpheusdb;
mysql> show databases;
Next create your morpheus database user. The user needs to be either at the IP address of the morpheus application server or use @'%' within the user name to allow the user to log in from anywhere.
mysql> CREATE USER 'morpheusadmin'@'%' IDENTIFIED BY 'Cloudy2017';
Next grant your new morpheus user permissions to the database.
mysql> GRANT ALL PRIVILEGES ON *.* TO 'morpheusadmin'@'%' IDENTIFIED BY 'Cloudy2017' WITH GRANT OPTION;
mysql> FLUSH PRIVILEGES;
Check the permissions for your user.
SHOW GRANTS FOR 'morpheusadmin'@'%';
Bootstrap the Remaining Nodes¶
To bootstrap the remaining nodes into the cluster run the following command on each node:
sudo systemctl start mysql.service
The services will automatically connect to the cluster using the sstuser we created earlier.
Note
Bootstrap failures are commonly caused by misconfigured /etc/my.cnf files.
Verification¶
To verify the cluster, on the master log in to mysql and run show status like 'wsrep%';
$ mysql -u root -p
mysql> show status like 'wsrep%';
+----------------------------------+-------------------------------------------------------+
| Variable_name                    | Value                                                 |
+----------------------------------+-------------------------------------------------------+
| wsrep_local_state_uuid           | 591179cb-a98e-11e7-b9aa-07df8a228fe9                  |
| wsrep_protocol_version           | 7                                                     |
| wsrep_last_committed             | 4                                                     |
| wsrep_replicated                 | 3                                                     |
| wsrep_replicated_bytes           | 711                                                   |
| wsrep_repl_keys                  | 3                                                     |
| wsrep_repl_keys_bytes            | 93                                                    |
| wsrep_repl_data_bytes            | 426                                                   |
| wsrep_repl_other_bytes           | 0                                                     |
| wsrep_received                   | 10                                                    |
| wsrep_received_bytes             | 774                                                   |
| wsrep_local_commits              | 0                                                     |
| wsrep_local_cert_failures        | 0                                                     |
| wsrep_local_replays              | 0                                                     |
| wsrep_local_send_queue           | 0                                                     |
| wsrep_local_send_queue_max       | 1                                                     |
| wsrep_local_send_queue_min       | 0                                                     |
| wsrep_local_send_queue_avg       | 0.000000                                              |
| wsrep_local_recv_queue           | 0                                                     |
| wsrep_local_recv_queue_max       | 2                                                     |
| wsrep_local_recv_queue_min       | 0                                                     |
| wsrep_local_recv_queue_avg       | 0.100000                                              |
| wsrep_local_cached_downto        | 2                                                     |
| wsrep_flow_control_paused_ns     | 0                                                     |
| wsrep_flow_control_paused        | 0.000000                                              |
| wsrep_flow_control_sent          | 0                                                     |
| wsrep_flow_control_recv          | 0                                                     |
| wsrep_flow_control_interval      | [ 173, 173 ]                                          |
| wsrep_flow_control_interval_low  | 173                                                   |
| wsrep_flow_control_interval_high | 173                                                   |
| wsrep_flow_control_status        | OFF                                                   |
| wsrep_cert_deps_distance         | 1.000000                                              |
| wsrep_apply_oooe                 | 0.000000                                              |
| wsrep_apply_oool                 | 0.000000                                              |
| wsrep_apply_window               | 1.000000                                              |
| wsrep_commit_oooe                | 0.000000                                              |
| wsrep_commit_oool                | 0.000000                                              |
| wsrep_commit_window              | 1.000000                                              |
| wsrep_local_state                | 4                                                     |
| wsrep_local_state_comment        | Synced                                                |
| wsrep_cert_index_size            | 1                                                     |
| wsrep_cert_bucket_count          | 22                                                    |
| wsrep_gcache_pool_size           | 2413                                                  |
| wsrep_causal_reads               | 0                                                     |
| wsrep_cert_interval              | 0.000000                                              |
| wsrep_ist_receive_status         |                                                       |
| wsrep_ist_receive_seqno_start    | 0                                                     |
| wsrep_ist_receive_seqno_current  | 0                                                     |
| wsrep_ist_receive_seqno_end      | 0                                                     |
| wsrep_incoming_addresses         | 10.30.20.196:3306,10.30.20.197:3306,10.30.20.198:3306 |
| wsrep_desync_count               | 0                                                     |
| wsrep_evs_delayed                |                                                       |
| wsrep_evs_evict_list             |                                                       |
| wsrep_evs_repl_latency           | 0/0/0/0/0                                             |
| wsrep_evs_state                  | OPERATIONAL                                           |
| wsrep_gcomm_uuid                 | 07c8c8fe-a998-11e7-883e-06949cfe5af3                  |
| wsrep_cluster_conf_id            | 3                                                     |
| wsrep_cluster_size               | 3                                                     |
| wsrep_cluster_state_uuid         | 591179cb-a98e-11e7-b9aa-07df8a228fe9                  |
| wsrep_cluster_status             | Primary                                               |
| wsrep_connected                  | ON                                                    |
| wsrep_local_bf_aborts            | 0                                                     |
| wsrep_local_index                | 1                                                     |
| wsrep_provider_name              | Galera                                                |
| wsrep_provider_vendor            | Codership Oy <info@codership.com>                     |
| wsrep_provider_version           | 3.22(r8678538)                                        |
| wsrep_ready                      | ON                                                    |
+----------------------------------+-------------------------------------------------------+
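Of the rows in that status table, wsrep_cluster_size, wsrep_cluster_status, and wsrep_local_state_comment are the quickest indicators of cluster health. The following sketch reads the table from standard input and checks those three values; the function names `wsrep_value` and `wsrep_healthy` are hypothetical.

```shell
#!/usr/bin/env bash
# Extract one value from the table printed by: show status like 'wsrep%';
# Reads the table on stdin; $1 is the variable name.
wsrep_value() {
  awk -F'|' -v name="$1" '
    { key = $2; val = $3
      gsub(/^ +| +$/, "", key); gsub(/^ +| +$/, "", val)
      if (key == name) print val }'
}

# Succeeds only if the node sees the expected number of members,
# is in the Primary component, and reports Synced.
wsrep_healthy() {
  local expected_size="$1" table
  table="$(cat)"
  [ "$(echo "$table" | wsrep_value wsrep_cluster_size)" = "$expected_size" ] &&
  [ "$(echo "$table" | wsrep_value wsrep_cluster_status)" = "Primary" ] &&
  [ "$(echo "$table" | wsrep_value wsrep_local_state_comment)" = "Synced" ]
}
```

A possible invocation is `mysql -u root -p --table -e "show status like 'wsrep%';" | wsrep_healthy 3 && echo "cluster OK"`. The `--table` option is needed because the mysql client switches to tab-separated output when its output is piped.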
Verify that you can log in to the MySQL server by running the below command on the Morpheus Application server(s).
mysql -u morpheusadmin -p -h 192.168.10.100
Note
This command requires the mysql client to be installed. If you are on a Windows machine you can connect to the server using MySQL Workbench, which can be found here: https://www.mysql.com/products/workbench/
RabbitMQ Cluster¶
RabbitMQ Installation and Configuration¶
Important
This is a sample configuration only. Customer configurations and requirements will vary.
Prerequisites¶
yum install epel-release
yum install erlang
Install RabbitMQ on the 3 nodes¶
wget https://dl.bintray.com/rabbitmq/rabbitmq-server-rpm/rabbitmq-server-3.6.12-1.el7.noarch.rpm
rpm --import https://www.rabbitmq.com/rabbitmq-release-signing-key.asc
yum -y install rabbitmq-server-3.6.12-1.el7.noarch.rpm
chkconfig rabbitmq-server on
rabbitmq-server -detached
On Nodes 2 & 3:¶
Overwrite /var/lib/rabbitmq/.erlang.cookie with the value from the same file on node 1, then fix its ownership and permissions using the following commands.
chown -R rabbitmq:rabbitmq /var/lib/rabbitmq
chmod 400 /var/lib/rabbitmq/.erlang.cookie
Edit the /etc/hosts file to refer to the shortname of node 1. Example:
10.30.20.100 rabbit-1
Run the commands to join each node to the cluster
rabbitmqctl stop
rabbitmq-server -detached
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@<<node 1 shortname>>
rabbitmqctl start_app
On Node 1:¶
rabbitmqctl add_user <<admin username>> <<password>>
rabbitmqctl set_permissions -p / <<admin username>> ".*" ".*" ".*"
rabbitmqctl set_user_tags <<admin username>> administrator
On All Nodes:¶
rabbitmq-plugins enable rabbitmq_stomp
Elasticsearch¶
Install a 3-node Elasticsearch cluster on CentOS 7
Important
This is a sample configuration only. Customer configurations and requirements will vary.
Requirements¶
Three Existing CentOS 7+ nodes accessible to the Morpheus Appliance
Install Java on each node
You can install the latest OpenJDK with the command:
sudo yum install java-1.8.0-openjdk.x86_64
To verify your JRE is installed and can be used, run the command:
java -version
The result should look like this:
Output of java -version
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)
Installation¶
Download and Install Elasticsearch
Elasticsearch can be downloaded directly from elastic.co in zip, tar.gz, deb, or rpm packages. For CentOS, it’s best to use the native rpm package which will install everything you need to run Elasticsearch. Download it in a directory of your choosing with the command:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.10.rpm
Then install it in the usual CentOS way with the rpm command like this:
sudo rpm -ivh elasticsearch-5.6.10.rpm
This results in Elasticsearch being installed in /usr/share/elasticsearch/ with its configuration files placed in /etc/elasticsearch and its init script added in /etc/init.d/elasticsearch. To make sure Elasticsearch starts and stops automatically with the server, enable its service with the command:
sudo systemctl enable elasticsearch.service
Note
If you manage an Elasticsearch cluster externally from Morpheus, follow the steps on the Elasticsearch website to upgrade to the latest version compatible with Morpheus.
Configuring Elastic
Now that Elasticsearch and its Java dependencies have been installed, it is time to configure Elasticsearch.
The Elasticsearch configuration files are in the /etc/elasticsearch directory. The two files relevant here are:
- elasticsearch.yml
Configures the Elasticsearch server settings. All options except those for logging are stored here, which is why this is the file we are mostly interested in.
- log4j2.properties (logging.yml in Elasticsearch versions before 5.0)
Provides configuration for logging. In the beginning, you don’t have to edit this file and can leave all default logging options. You can find the resulting logs in /var/log/elasticsearch by default.
Open elasticsearch.yml for editing:
sudo vi /etc/elasticsearch/elasticsearch.yml
The first variables to customize on any Elasticsearch server are node.name and cluster.name in elasticsearch.yml. As their names suggest, node.name specifies the name of the server (node) and cluster.name specifies the cluster to which that node belongs.
Node 1
cluster.name: morpheusha1
node.name: "morpheuses1"
discovery.zen.ping.unicast.hosts: ["10.30.20.91","10.30.20.149","10.30.20.165"]
Node 2
cluster.name: morpheusha1
node.name: "morpheuses2"
discovery.zen.ping.unicast.hosts: ["10.30.20.91","10.30.20.149","10.30.20.165"]
Node 3
cluster.name: morpheusha1
node.name: "morpheuses3"
discovery.zen.ping.unicast.hosts: ["10.30.20.91","10.30.20.149","10.30.20.165"]
For the above changes to take effect, you will have to restart Elasticsearch with the command:
sudo service elasticsearch restart
Testing
By now, Elasticsearch should be running on port 9200. You can test it with curl, the command-line URL transfer tool, and a simple GET request like this:
[~]$ sudo curl -X GET 'http://10.30.20.149:9200'
{
  "status" : 200,
  "name" : "morpheuses1",
  "cluster_name" : "morpheusha1",
  "version" : {
    "number" : "1.7.3",
    "build_hash" : "05d4530971ef0ea46d0f4fa6ee64dbc8df659682",
    "build_timestamp" : "2015-10-15T09:14:17Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
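The GET request above only confirms that a single node answers. To check that all three nodes actually formed one cluster, the Elasticsearch `_cluster/health` endpoint reports a `number_of_nodes` field. The sketch below extracts that field with sed so that no extra tools are needed; the function name `es_node_count` is hypothetical and the sample JSON is illustrative only.

```shell
#!/usr/bin/env bash
# Pull the integer value of "number_of_nodes" out of the JSON returned by
# the Elasticsearch _cluster/health endpoint (read from stdin).
es_node_count() {
  sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Illustrative input; a live check would pipe curl output instead.
echo '{"cluster_name":"morpheusha1","status":"green","number_of_nodes":3}' | es_node_count
```

Against a live cluster this could be run as `curl -s 'http://10.30.20.149:9200/_cluster/health' | es_node_count`, which should print 3 once every node has joined.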
Application Tier¶
Morpheus configuration is controlled by a configuration file located at /etc/morpheus/morpheus.rb. This file is read when you run morpheus-ctl reconfigure after installing the appliance package. Each section is tied to a deployment tier: database is mysql, message queue is rabbitmq, search index is elasticsearch. There are no entries for the web and application tiers since those are part of the core application server where the configuration file resides.
- Download and install the Morpheus Appliance Package
- Next we must install the package onto the machine and configure the morpheus services:
sudo rpm -i morpheus-appliance-x.x.x-1.x86_64.rpm
- After installing and prior to reconfiguring, edit the morpheus.rb file
sudo vi /etc/morpheus/morpheus.rb
Change the values to match your configured services:
Note
The values below are examples. Update hosts, ports, usernames, and passwords with your specifications. Only include entries for services you wish to externalize.
mysql['enable'] = false
mysql['host'] = {'10.30.20.139' => 3306, '10.30.20.153' => 3306, '10.30.20.196' => 3306}
mysql['morpheus_db'] = 'morpheusdb'
mysql['morpheus_db_user'] = 'morpheusadmin'
mysql['morpheus_password'] = 'morpheus4admin!'
rabbitmq['enable'] = false
rabbitmq['vhost'] = 'morph'
rabbitmq['queue_user'] = 'lbuser'
rabbitmq['queue_user_password'] = 'morpheus4admin'
rabbitmq['host'] = 'morpheus-ha-mq-lb-1.den.morpheusdata.com'
rabbitmq['port'] = '5672'
rabbitmq['stomp_port'] = '61613'
rabbitmq['heartbeat'] = 50
elasticsearch['enable'] = false
elasticsearch['cluster'] = 'morpheusha1'
elasticsearch['es_hosts'] = {'10.30.20.91' => 9200, '10.30.20.149' => 9200, '10.30.20.165' => 9200}
- Reconfigure Morpheus
sudo morpheus-ctl reconfigure
3 Node with Externalized DB Configuration¶
Assumptions¶
This guide assumes the following:
- There is an externalized database running for Morpheus to access.
- The database service is a MySQL dialect (MySQL, MariaDB, Galera, etc…)
- A database has been created for Morpheus as well as a user and proper grants have been run for the user. Morpheus will create the schema.
- The baremetal nodes cannot access the public internet
- The base OS is RHEL 7.x
- Shortname versions of hostnames will be resolvable
- All nodes have access to a shared volume for /var/opt/morpheus/morpheus-ui. This can be done as a post startup step.
- This configuration will support the complete loss of a single node, but no more. Specifically, the Elasticsearch tier requires at least two nodes to always be clustered.
Steps¶
First begin by downloading the requisite Morpheus packages either to the nodes or to your workstation for transfer. These packages need to be made available on the nodes you wish to install Morpheus on.
[root@app-server-1 ~]# wget https://downloads.gomorpheus.com/example/path/morpheus-appliance-offline-3.1.5-1.noarch.rpm
[root@app-server-1 ~]# wget https://downloads.gomorpheus.com/example/path/morpheus-appliance-3.1.5-1.el7.x86_64.rpm
Once the packages are available on the nodes they can be installed. Make sure that no steps beyond the rpm install are run.
[root@app-server-1 ~]# rpm -i morpheus-appliance-3.1.5-1.el7.x86_64.rpm
[root@app-server-1 ~]# rpm -i morpheus-appliance-offline-3.1.5-1.noarch.rpm
Next you will need to edit the Morpheus configuration file /etc/morpheus/morpheus.rb on each node.
Node 1
appliance_url 'https://morpheus1.localdomain'
elasticsearch['es_hosts'] = {'10.100.10.121' => 9200, '10.100.10.122' => 9200, '10.100.10.123' => 9200}
elasticsearch['node_name'] = 'morpheus1'
elasticsearch['host'] = '0.0.0.0'
rabbitmq['host'] = '0.0.0.0'
rabbitmq['nodename'] = 'rabbit@node01'
mysql['enable'] = false
mysql['host'] = '10.100.10.111'
mysql['morpheus_db'] = 'morpheusdb'
mysql['morpheus_db_user'] = 'morpheus'
mysql['morpheus_password'] = 'password'
Node 2
appliance_url 'https://morpheus2.localdomain'
elasticsearch['es_hosts'] = {'10.100.10.121' => 9200, '10.100.10.122' => 9200, '10.100.10.123' => 9200}
elasticsearch['node_name'] = 'morpheus2'
elasticsearch['host'] = '0.0.0.0'
rabbitmq['host'] = '0.0.0.0'
rabbitmq['nodename'] = 'rabbit@node02'
mysql['enable'] = false
mysql['host'] = '10.100.10.112'
mysql['morpheus_db'] = 'morpheusdb'
mysql['morpheus_db_user'] = 'morpheus'
mysql['morpheus_password'] = 'password'
Node 3
appliance_url 'https://morpheus3.localdomain'
elasticsearch['es_hosts'] = {'10.100.10.121' => 9200, '10.100.10.122' => 9200, '10.100.10.123' => 9200}
elasticsearch['node_name'] = 'morpheus3'
elasticsearch['host'] = '0.0.0.0'
rabbitmq['host'] = '0.0.0.0'
rabbitmq['nodename'] = 'rabbit@node03'
mysql['enable'] = false
mysql['host'] = '10.100.10.113'
mysql['morpheus_db'] = 'morpheusdb'
mysql['morpheus_db_user'] = 'morpheus'
mysql['morpheus_password'] = 'password'
Note
If you are running MySQL in a Master/Master configuration, the mysql['host'] line in morpheus.rb needs to be altered slightly to account for both masters in a failover configuration. As an example:
mysql['host'] = '10.100.10.111:3306,10.100.10.112'
Morpheus will append the ‘3306’ port to the end of the final IP in the string, which is why we leave it off there but explicitly type it for the first IP in the string. The order of IPs matters in that it should be the same across all three Morpheus Application Servers. As mentioned, this is a failover configuration for MySQL: the application will only read/write from the second master if the first master becomes unavailable. This avoids commit lock issues that might arise from a load-balanced Master/Master setup.
Run the reconfigure on all nodes
[root@app-server-1 ~]# morpheus-ctl reconfigure
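For the Master/Master case described in the note above, the host string can be assembled by a small helper so that it is identical on every application server. This is a sketch only: the function name `mysql_failover_hosts` is hypothetical, and extending the two-address example to longer lists is an assumption that the source does not confirm.

```shell
#!/usr/bin/env bash
# Build the value for mysql['host'] in a failover setup: every address
# except the last carries an explicit port, because Morpheus appends the
# port to the final address itself.
# Usage: mysql_failover_hosts PORT HOST [HOST...]
mysql_failover_hosts() {
  local port="$1"; shift
  local result=""
  while [ "$#" -gt 1 ]; do
    result="${result}${1}:${port},"
    shift
  done
  echo "${result}${1}"
}

mysql_failover_hosts 3306 10.100.10.111 10.100.10.112
```

Generating the string once and pasting the same output into morpheus.rb on all three nodes keeps the IP order consistent, which the note calls out as a requirement.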
Morpheus will come up on all nodes and Elasticsearch will auto-cluster. The only item left is the manual clustering of RabbitMQ.
Select one of the nodes to be your Source Of Truth (SOT) for RabbitMQ clustering. We need to copy the secrets for RabbitMQ, copy the erlang cookie and join the other nodes to the SOT node.
Begin by copying secrets from the SOT node to the other nodes.
[root@app-server-1 ~]# cat /etc/morpheus/morpheus-secrets.json
"rabbitmq": {
  "morpheus_password": "***REDACTED***",
  "queue_user_password": "***REDACTED***",
  "cookie": "***REDACTED***"
},
Then copy the erlang.cookie from the SOT node to the other nodes
[root@app-server-1 ~]# cat /opt/morpheus/embedded/rabbitmq/.erlang.cookie
# 754363AD864649RD63D28
Once this is done run a reconfigure on the two nodes that are NOT the SOT nodes.
[root@app-server-2 ~]# morpheus-ctl reconfigure
Note
This step will fail. This is ok, and expected. If the reconfigure hangs then use Ctrl+C to quit the reconfigure run and force a failure.
Subsequently we need to stop and start Rabbit on the non-SOT nodes.
[root@app-server-2 ~]# morpheus-ctl stop rabbitmq
[root@app-server-2 ~]# morpheus-ctl start rabbitmq
[root@app-server-2 ~]# PATH=/opt/morpheus/sbin:/opt/morpheus/sbin:/opt/morpheus/embedded/sbin:/opt/morpheus/embedded/bin:$PATH
[root@app-server-2 ~]# rabbitmqctl stop_app
Stopping node 'rabbit@app-server-2' ...
[root@app-server-2 ~]# rabbitmqctl join_cluster rabbit@app-server-1
Clustering node 'rabbit@app-server-2' with 'rabbit@app-server-1' ...
[root@app-server-2 ~]# rabbitmqctl start_app
Starting node 'rabbit@app-server-2' ...
Now make sure to reconfigure
[root@app-server-2 ~]# morpheus-ctl reconfigure
Once the Rabbit services are up and clustered on all nodes they need to be set to HA/Mirrored Queues:
[root@app-server-2 ~]# rabbitmqctl set_policy -p morpheus --priority 1 --apply-to all ha ".*" '{"ha-mode": "all"}'
The last thing to do is restart the Morpheus UI on the two nodes that are NOT the SOT node.
[root@app-server-2 ~]# morpheus-ctl restart morpheus-ui
If this command times out then run:
[root@app-server-2 ~]# morpheus-ctl kill morpheus-ui
[root@app-server-2 ~]# morpheus-ctl start morpheus-ui
You will be able to verify that the UI services have restarted properly by inspecting the logfiles. A standard practice after running a restart is to tail the UI log file.
[root@app-server-2 ~]# morpheus-ctl tail morpheus-ui
Lastly, we need to ensure that Elasticsearch is configured in such a way as to support a quorum of 2. We need to do this step on EVERY NODE.
[root@app-server-2 ~]# echo "discovery.zen.minimum_master_nodes: 2" >> /opt/morpheus/embedded/elasticsearch/config/elasticsearch.yml
[root@app-server-2 ~]# morpheus-ctl restart elasticsearch
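The value 2 is the majority of three master-eligible nodes, floor(3/2) + 1. Appending with `>>` also adds a duplicate line each time the step is repeated. The sketch below computes the quorum and sets the entry idempotently; the function names `es_quorum` and `set_min_master_nodes` are hypothetical.

```shell
#!/usr/bin/env bash
# Majority quorum for a cluster of N master-eligible nodes: floor(N/2) + 1.
es_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Set discovery.zen.minimum_master_nodes in the given elasticsearch.yml,
# replacing an existing entry instead of appending a duplicate.
set_min_master_nodes() {
  local config_file="$1" quorum="$2"
  if grep -q '^discovery\.zen\.minimum_master_nodes:' "$config_file"; then
    sed -i "s/^discovery\.zen\.minimum_master_nodes:.*/discovery.zen.minimum_master_nodes: ${quorum}/" "$config_file"
  else
    echo "discovery.zen.minimum_master_nodes: ${quorum}" >> "$config_file"
  fi
}

es_quorum 3
```

On a node this might be called as `set_min_master_nodes /opt/morpheus/embedded/elasticsearch/config/elasticsearch.yml "$(es_quorum 3)"`, followed by the Elasticsearch restart shown above.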
Note
For moving /var/opt/morpheus/morpheus-ui files into a shared volume, make sure ALL Morpheus services on ALL three nodes are down before you begin.
[root@app-server-1 ~]# morpheus-ctl stop
Permissions are as important as content, so make sure to preserve ownership and permissions when copying the directory contents to the shared volume.
Subsequently you can start all Morpheus services on all three nodes and tail the Morpheus UI log file to inspect errors.
Database Migration¶
If your new installation is part of a migration then you need to move the data from your original Morpheus database to your new one. This is easily accomplished by using a stateful dump.
To begin this, stop the Morpheus UI on your original Morpheus server:
[root@app-server-old ~]# morpheus-ctl stop morpheus-ui
Once this is done you can safely export. To access the MySQL shell we will need the password for the Morpheus DB user. We can find this in the morpheus-secrets file:
[root@app-server-old ~]# cat /etc/morpheus/morpheus-secrets.json
{
  "mysql": {
    "root_password": "***REDACTED***",
    "morpheus_password": "***REDACTED***",
    "ops_password": "***REDACTED***"
  },
  "rabbitmq": {
    "morpheus_password": "***REDACTED***",
    "queue_user_password": "***REDACTED***",
    "cookie": "***REDACTED***"
  },
  "vm-images": {
    "s3": {
      "aws_access_id": "***REDACTED***",
      "aws_secret_key": "***REDACTED***"
    }
  }
}
Take note of this password as it will be used to invoke a dump. Morpheus provides embedded binaries for this task. Invoke it via the embedded path and specify the host. In this example we are using the Morpheus database on the MySQL listening on localhost. Enter the password copied from the previous step when prompted:
[root@app-server-old ~]# /opt/morpheus/embedded/mysql/bin/mysqldump -u morpheus -h 127.0.0.1 morpheus -p > /tmp/morpheus_backup.sql
Enter password:
This file needs to be pushed to the new Morpheus Installation’s backend. Depending on the GRANTS in the new MySQL backend, this will likely require moving this file to one of the new Morpheus frontend servers.
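Since the dump is copied between servers before being imported, it can be worth confirming that the file arrived intact. The following is a sketch under the assumption that `sha256sum` (part of GNU coreutils) is available on both hosts; the function names `file_checksum` and `verify_checksum` are hypothetical.

```shell
#!/usr/bin/env bash
# Print the SHA-256 checksum of a file (run on the old server after the dump).
file_checksum() {
  sha256sum "$1" | awk '{print $1}'
}

# Succeed only if the file matches the expected checksum
# (run on the new server after the transfer).
verify_checksum() {
  [ "$(file_checksum "$1")" = "$2" ]
}
```

For example, note the output of `file_checksum /tmp/morpheus_backup.sql` on the old server, then run `verify_checksum /tmp/morpheus_backup.sql <that value> && echo "dump intact"` on the destination before importing.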
Once the file is in place it can be imported into the backend. Begin by ensuring the Morpheus UI service is stopped on all of the application servers:
[root@app-server-1 ~]# morpheus-ctl stop morpheus-ui
[root@app-server-2 ~]# morpheus-ctl stop morpheus-ui
[root@app-server-3 ~]# morpheus-ctl stop morpheus-ui
Then you can import the MySQL dump into the target database using the embedded MySQL binaries, specifying the database host, and entering the password for the Morpheus user when prompted:
[root@app-server-1 ~]# /opt/morpheus/embedded/mysql/bin/mysql -u morpheus -h 10.130.2.38 morpheus -p < /tmp/morpheus_backup.sql
Enter password:
Recovery¶
If a node crashes, most of the time Morpheus will start when the server boots and the services will self-recover. However, there can be cases where RabbitMQ and Elasticsearch are unable to recover cleanly and require minor manual intervention. Regardless, it is considered best practice to perform some manual health checks when recovering from a restart.
[root@app-server-1 ~]# morpheus-ctl status
run: check-server: (pid 17808) 7714s; run: log: (pid 549) 8401s
run: elasticsearch: (pid 19207) 5326s; run: log: (pid 565) 8401s
run: guacd: (pid 601) 8401s; run: log: (pid 573) 8401s
run: morpheus-ui: (pid 17976) 7633s; run: log: (pid 555) 8401s
run: nginx: (pid 581) 8401s; run: log: (pid 544) 8401s
run: rabbitmq: (pid 17850) 7708s; run: log: (pid 542) 8401s
run: redis: (pid 572) 8401s; run: log: (pid 548) 8401s
However, the status can report false positives if, say, RabbitMQ is in a boot loop or Elasticsearch is up but unable to join the cluster. It is always advisable to tail the service logs to investigate their health:
[root@app-server-1 ~]# morpheus-ctl tail rabbitmq
[root@app-server-1 ~]# morpheus-ctl tail elasticsearch
To minimize disruption to the user interface, it is advisable to remedy Elasticsearch clustering first. Due to write locking in Elasticsearch, it may be necessary to restart other nodes in the cluster to allow the recovering node to join. Begin by determining which Elasticsearch node became the master during the outage. On one of the two other nodes (not the recovered node):
[root@app-server-2 ~]# curl localhost:9200/_cat/nodes
app-server-1 10.100.10.121 7 47 0.21 d * morpheus1
localhost 127.0.0.1 4 30 0.32 d m morpheus2
The master is determined by identifying the row with the ‘*’ in it.
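Picking out the master row can also be scripted. The sketch below assumes the default `_cat/nodes` column layout shown above, where the master column holds `*` and the node name is the last column; `es_master_node` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `_cat/nodes` output on stdin and prints the
# node name (last column) of the row whose master column is "*".
es_master_node() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "*") { print $NF; exit }
        }
    }'
}
```

It could be used as `curl -s localhost:9200/_cat/nodes | es_master_node`, which for the output above would print the name of the current master node.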
SSH to this node (if different) and restart Elasticsearch.
[root@app-server-1 ~]# morpheus-ctl restart elasticsearch
Go to the other of the two ‘up’ nodes and run the curl command again. If the output contains three nodes then Elasticsearch has been recovered and you can move on to re-clustering RabbitMQ. Otherwise you will see output that contains only the node itself:
[root@app-server-2 ~]# curl localhost:9200/_cat/nodes
localhost 127.0.0.1 4 30 0.32 d * morpheus2
If this is the case then restart Elasticsearch on this node as well:
[root@app-server-2 ~]# morpheus-ctl restart elasticsearch
After this you should be able to run the curl command and see all three nodes have rejoined the cluster:
[root@app-server-2 ~]# curl localhost:9200/_cat/nodes
app-server-1 10.100.10.121 9 53 0.31 d * morpheus1
localhost 127.0.0.1 7 32 0.22 d m morpheus2
app-server-3 10.100.10.123 3 28 0.02 d m morpheus3
The most frequent cause of restart errors for RabbitMQ is epmd failing to restart. Morpheus’s recommendation is to ensure the epmd process is running and daemonized by starting it:
[root@app-server-1 ~]# /opt/morpheus/embedded/lib/erlang/erts-5.10.4/bin/epmd -daemon
And then restarting RabbitMQ:
[root@app-server-1 ~]# morpheus-ctl restart rabbitmq
And then restarting the Morpheus UI service:
[root@app-server-1 ~]# morpheus-ctl restart morpheus-ui
Again, it is always advisable to monitor the startup to ensure the Morpheus Application is starting without error:
[root@app-server-1 ~]# morpheus-ctl tail morpheus-ui
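The RabbitMQ recovery steps above can be strung together so that epmd is only started when it is not already running. This is a rough sketch rather than an official procedure: `recover_rabbitmq`, `EPMD_BIN`, and `MORPHEUS_CTL` are hypothetical names, the epmd path is the one from this example (the erts version directory may differ on other installations), and `pgrep` is assumed to be available on the server.

```shell
# Assumption: path taken from the example above; adjust to the installed erts version.
EPMD_BIN="${EPMD_BIN:-/opt/morpheus/embedded/lib/erlang/erts-5.10.4/bin/epmd}"
MORPHEUS_CTL="${MORPHEUS_CTL:-morpheus-ctl}"

# Ensure epmd is running, then restart RabbitMQ followed by the Morpheus UI.
recover_rabbitmq() {
    if pgrep -x epmd >/dev/null 2>&1; then
        echo "epmd already running"
    else
        echo "starting epmd"
        "$EPMD_BIN" -daemon || return 1
    fi
    "$MORPHEUS_CTL" restart rabbitmq || return 1
    "$MORPHEUS_CTL" restart morpheus-ui
}
```

After running `recover_rabbitmq`, the startup should still be watched with `morpheus-ctl tail morpheus-ui` as described above.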
Recovery Thoughts/Further Discussion: If Morpheus UI cannot connect to RabbitMQ, Elasticsearch, or the database tier, it will fail to start. The Morpheus UI logs can indicate if this is the case.
Aside from RabbitMQ, there can be issues with false positives concerning Elasticsearch’s running status. The biggest challenge with Elasticsearch is that a restarted node can have trouble rejoining the ES cluster. This is tolerable in the case of ES, though, because the minimum_master_nodes setting will not allow the un-joined singleton node to be used until it joins. Morpheus will still start if it can reach the other two ES hosts, which are still clustered.
The challenge with RabbitMQ is that it is load balanced behind Morpheus for requests, but each Morpheus application server needs to bootstrap the RabbitMQ instance tied to it. Thus, if it cannot reach its own RabbitMQ, startup will fail.
Similarly, if a Morpheus UI service cannot reach the database, startup will fail. However, if the database is externalized and failover is configured for Master/Master, then there should be ample opportunity for Morpheus to connect to the database tier.
Because Morpheus can start even though the Elasticsearch node on the same host fails to join the cluster, it is advisable to investigate the health of ES on the restarted node after the services are up. This can be done by accessing the endpoint with curl and inspecting the output. The status should be “green” and number of nodes should be “3”:
[root@app-server-1 ~]# curl localhost:9200/_cluster/health?pretty=true
{
"cluster_name" : "morpheus",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 110,
"active_shards" : 220,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0
}
If this is not the case it is worth investigating the Elasticsearch logs to understand why the singleton node is having trouble joining the cluster. These can be found at:
/var/log/morpheus/elasticsearch/current
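The health check described above can be turned into a pass/fail test for use in a script. The following is a minimal sketch that parses the JSON with sed instead of a JSON tool, which is an assumption made to avoid extra dependencies; `es_health_ok` is a hypothetical helper name, and the expected node count defaults to the three nodes used in this example.

```shell
# Hypothetical helper: reads `_cluster/health` JSON on stdin and succeeds
# only if status is "green" and number_of_nodes equals $1 (default 3).
es_health_ok() {
    expected_nodes="${1:-3}"
    json="$(cat)"
    status="$(printf '%s\n' "$json" |
        sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1)"
    nodes="$(printf '%s\n' "$json" |
        sed -n 's/.*"number_of_nodes"[[:space:]]*:[[:space:]]*\([0-9]*\).*/\1/p' | head -n 1)"
    [ "$status" = "green" ] && [ "$nodes" = "$expected_nodes" ]
}
```

For example, `curl -s 'localhost:9200/_cluster/health?pretty=true' | es_health_ok 3 || echo "cluster needs attention"` would flag a node that has not rejoined.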
Outside of these stateful tiers, the “morpheus-ctl status” command will not output a “run” status unless the service is successfully running. If a stateless service reports a failure to run, the logs should be investigated and/or sent to Morpheus for additional support. Logs for all Morpheus embedded services are found in /var/log/morpheus.
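To spot services that are not in the “run” state at a glance, the status output can be filtered. This is a sketch only: `not_running` is a hypothetical helper name, and it assumes each status line begins with the state followed by the service name, as in the output shown earlier.

```shell
# Hypothetical helper: reads `morpheus-ctl status` output on stdin and
# prints "service (state)" for every line that does not start with "run:".
not_running() {
    awk 'NF && $1 != "run:" {
        state = $1; name = $2
        sub(/:$/, "", state)   # strip trailing colon from the state
        sub(/:$/, "", name)    # strip trailing colon from the service name
        print name " (" state ")"
    }'
}
```

Running `morpheus-ctl status | not_running` prints nothing when every service reports “run”; any service it does list is a candidate for a closer look in /var/log/morpheus.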