SURGEON GENERAL'S WARNING: The following is by no means a helpful guide to upgrading a RabbitMQ cluster ... yet. In fact, my description here might very well be the "wrong" way to do it. The purpose of this post is to solicit comments from other RabbitMQ users and decide on the sanest way to upgrade a cluster, including methods for coping with live upgrades (zero downtime).
Also note that the following assumes an RPM-based Linux distribution, although the instructions should be simple to adapt to other Linux distributions.
1. If the current node is a disc node, remove it from the cluster and reset its state. For example, on disc node hostA, which is part of the disc cluster (hostA, hostB, hostC), run the following commands as root.
$ rabbitmqctl stop_app
$ rabbitmqctl cluster rabbit@hostB rabbit@hostC
$ rabbitmqctl reset
$ rabbitmqctl start_app
2. Shut down RabbitMQ.
$ /etc/init.d/rabbitmq-server stop
3. Remove the mnesia files. Hopefully there's a nicer way of doing this; for example, is there a safe way to preserve the message journal across versions?
$ rm -rf /var/lib/rabbitmq/mnesia/rabbit/*
4. Upgrade with the new RPM.
$ rpm -U --force rabbitmq-server-.rpm
5. Reconfigure the RabbitMQ node per the requirements of your system: configure users, etc. I believe this should only be done on the first node in the cluster; the other nodes should use it to seed their configuration upon joining the cluster.
6. If the current node is a disc node, add it to the cluster (after all nodes in the disc cluster have been upgraded). For example, if the current node is hostA and part of the cluster (hostA, hostB, hostC):
$ rabbitmqctl stop_app
$ rabbitmqctl force_reset
$ rabbitmqctl cluster rabbit@hostA rabbit@hostB rabbit@hostC
$ rabbitmqctl start_app
This last step is where I have the least confidence. For starters, without forcefully resetting the node, I got this error when trying to cluster the nodes: "Bad cookie in table definition amqqueue." Secondly, I had not realized this before, but upon joining the cluster the local node will pick up all the settings from the synchronized disc nodes, including configured users and passwords.
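Regarding step 3: until someone points out a safe way to carry the message journal across versions, one cautious variation might be to archive the mnesia directory before wiping it, so there is at least something to roll back to. The following is only a sketch. The helper name backup_mnesia and the backup location are my own assumptions, not anything RabbitMQ provides, and I have no evidence that a newer broker version can read the archived files.

```shell
#!/bin/sh
# Sketch: archive a node's mnesia directory before removing its contents.
# backup_mnesia is a hypothetical helper, not part of RabbitMQ.
# Run it only after the broker has been stopped (step 2).
backup_mnesia() {
    mnesia_dir="$1"    # e.g. /var/lib/rabbitmq/mnesia/rabbit
    backup_dir="$2"    # e.g. /var/tmp/rabbitmq-backup (assumed location)

    if [ ! -d "$mnesia_dir" ]; then
        echo "no such directory: $mnesia_dir" >&2
        return 1
    fi
    mkdir -p "$backup_dir" || return 1

    archive="$backup_dir/mnesia-$(date +%Y%m%d%H%M%S).tar.gz"
    # -C stores paths relative to the mnesia directory itself.
    tar -czf "$archive" -C "$mnesia_dir" . || return 1

    # Print the archive path so the caller can log it.
    echo "$archive"
}
```

Used in place of the bare rm, this might look like `backup_mnesia /var/lib/rabbitmq/mnesia/rabbit /var/tmp/rabbitmq-backup && rm -rf /var/lib/rabbitmq/mnesia/rabbit/*`, so that nothing is deleted unless the archive was written.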
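Regarding step 5: since the mnesia wipe also removes users, vhosts and permissions, the first upgraded node needs them recreated. A minimal sketch of what that could look like with rabbitmqctl follows. The function name seed_broker_config, the RABBITMQCTL override and the example vhost, user and password are all made up for illustration, and granting ".*" everywhere is just the simplest possible permission set, probably not what you want in production.

```shell
#!/bin/sh
# Sketch: recreate a vhost, a user and its permissions on the first
# upgraded node. seed_broker_config is a hypothetical helper.
# RABBITMQCTL can be overridden (e.g. RABBITMQCTL=echo) for a dry run.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

seed_broker_config() {
    vhost="$1"
    user="$2"
    password="$3"

    $RABBITMQCTL add_vhost "$vhost" || return 1
    $RABBITMQCTL add_user "$user" "$password" || return 1
    # Grant configure, write and read on every resource in the vhost.
    $RABBITMQCTL set_permissions -p "$vhost" "$user" ".*" ".*" ".*" || return 1
}
```

For example, `seed_broker_config /myapp appuser secret` run as root on the first node. If my understanding of step 6 is right, the remaining nodes should then pick these settings up when they join the cluster rather than needing the same commands.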
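Given how little confidence I have in step 6, it seems worth checking the result with `rabbitmqctl cluster_status` once every node has rejoined. The sketch below wraps that in a hypothetical helper, check_cluster_members, which looks for each expected node name in the running_nodes part of the output. It assumes the Erlang-term output format with running_nodes on its own line, and it is a plain substring match, so treat it as a rough sanity check rather than a real parser.

```shell
#!/bin/sh
# Sketch: confirm that every expected node is reported as running.
# check_cluster_members is a hypothetical helper, not part of RabbitMQ.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

check_cluster_members() {
    # Keep only the output from the "running_nodes" line onward; this
    # assumes the Erlang-term format printed by older rabbitmqctl releases.
    running=$($RABBITMQCTL cluster_status | sed -n '/running_nodes/,$p')

    missing=0
    for node in "$@"; do
        # Fixed-string substring match on the node name.
        if ! printf '%s\n' "$running" | grep -qF "$node"; then
            echo "not running: $node" >&2
            missing=1
        fi
    done
    # Exit status 0 only if every node was found.
    return "$missing"
}
```

After step 6 on the last node, something like `check_cluster_members rabbit@hostA rabbit@hostB rabbit@hostC` should return 0. Running `rabbitmqctl list_users` on each node afterwards is a quick way to see whether the users configured on the first node really were picked up.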