Erlang Mailing Lists

RabbitMQ mailing list ~ Problem with Clustering

Guest
Posted: Wed Jan 06, 2010 8:54 pm
Hi,

I have a simple cluster setup consisting of 2 nodes. Call them A and B going forward.

Status of node rabbit@B...
[{running_applications,[{rabbit,"RabbitMQ","1.7.0"},
Guest
Posted: Thu Jan 07, 2010 10:39 am
Hi David,

You're using the latest default code, yes?

We've just changed the behaviour in this case - previously, yes, the
queue would be recreated. This is incorrect because it then causes many
problems when the failed node comes back up - you end up with the same
queue on both nodes with different concepts of what should be in the
queue. Exciting, if undesirable, things happen in this case.

Thus if a queue is declared but it's found that the queue does already
exist but is on a downed node, we return a 404, because it's really
saying "the node on which this queue exists can't be found".
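
To make the new behaviour concrete, here is a toy Python model of the declare path described above. It is not RabbitMQ's code: the `Cluster` class, its fields and the result strings are invented for illustration, and only the 404 outcome for a queue on a downed node comes from the description above.

```python
class Cluster:
    """Toy model: each queue lives on exactly one node."""

    def __init__(self, nodes):
        self.up = set(nodes)      # nodes currently running
        self.queue_home = {}      # queue name -> node that holds it

    def node_down(self, node):
        self.up.discard(node)

    def declare(self, name, via_node):
        """Declare `name` through `via_node`; return a status string."""
        if via_node not in self.up:
            raise ConnectionError(f"{via_node} is not running")
        home = self.queue_home.get(name)
        if home is None:
            # Queue is unknown to the cluster: create it on the connected node.
            self.queue_home[name] = via_node
            return "created"
        if home in self.up:
            # Queue exists and its node is reachable: nothing to do.
            return "ok"
        # Queue exists, but only on a downed node: refuse to recreate it.
        return "404 NOT_FOUND"


cluster = Cluster(["rabbit@A", "rabbit@B"])
first = cluster.declare("orders", via_node="rabbit@A")
cluster.node_down("rabbit@A")
after_failure = cluster.declare("orders", via_node="rabbit@B")
print(first, "/", after_failure)
```

The queue name `orders` is hypothetical. The point of the sketch is only that the record of where the queue lives survives the node failure, so the second declare is rejected rather than silently creating a second queue.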

From your code, I see you're declaring the queue durable. This really
reinforces the issue because if it's durable, then persistent messages
shouldn't be lost, and yet if you want to be able to recreate the queue
on the other node, then it'll start empty, at which point, logically,
the contents of the queue have been lost.
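
A small sketch of why recreating the queue amounts to losing its contents, assuming the old behaviour in which a redeclare on the surviving node creates a fresh queue. The dictionaries and message names below are stand-ins invented for illustration, not RabbitMQ data structures.

```python
# Toy model of the OLD behaviour: each node maps queue name -> list of messages.
node_a = {"orders": ["msg-1", "msg-2"]}   # durable queue holding persistent messages
node_b = {}

# Node A fails; a client redeclares "orders" on B and the queue is recreated.
node_b.setdefault("orders", [])           # starts empty: msg-1 and msg-2 are not visible
node_b["orders"].append("msg-3")

# Node A comes back with its durable contents intact.
same_name = "orders" in node_a and "orders" in node_b
diverged = node_a["orders"] != node_b["orders"]
print(same_name, diverged)
```

Both nodes now hold a queue with the same name and different contents, which is the conflict the 404 is meant to prevent.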

In general, clustering should not be used for HA purposes. If you wish
to achieve HA, then active/passive HA can be achieved by using shared
disk storage, heartbeat/pacemaker, and maybe a TCP load balancer on the
front. Make sure you set the node names to localhost, and point both
rabbit instances at the same mnesia dir on the shared storage. When the
passive node comes up, it will recover everything from the storage.
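
As a rough sketch of the "same node name, same mnesia dir" part of that setup, the Python below builds the environment that would be used to start the broker on either host. The mount point is hypothetical, reading "node names to localhost" as `rabbit@localhost` is an assumption, and the `RABBITMQ_NODENAME` and `RABBITMQ_MNESIA_BASE` variable names should be checked against the documentation for your RabbitMQ version.

```python
import os

# Assumed values, for illustration only.
SHARED_MNESIA_DIR = "/mnt/shared/rabbitmq/mnesia"   # hypothetical shared-storage mount
NODE_NAME = "rabbit@localhost"                      # assumed reading of "node names to localhost"


def broker_env(base_env=None):
    """Return the environment to start the broker with on either host."""
    env = dict(os.environ if base_env is None else base_env)
    env["RABBITMQ_NODENAME"] = NODE_NAME
    env["RABBITMQ_MNESIA_BASE"] = SHARED_MNESIA_DIR
    return env


active = broker_env({"HOSTNAME": "host-1"})
passive = broker_env({"HOSTNAME": "host-2"})

# Both hosts must agree on the node name and on the database location.
settings = ("RABBITMQ_NODENAME", "RABBITMQ_MNESIA_BASE")
identical = all(active[key] == passive[key] for key in settings)
print(identical)
```

The resulting dictionary could be passed as `env=` when launching the broker with `subprocess`. Deciding which host runs the broker at any moment is left to heartbeat/pacemaker; this sketch assumes only one instance is ever started against the shared directory at a time.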

Matthew

_______________________________________________
rabbitmq-discuss mailing list
rabbitmq-discuss@lists.rabbitmq.com
http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
Post received from mailinglist
Guest
Posted: Thu Jan 07, 2010 9:18 pm
Matthew:

This is the best description of the intent of RabbitMQ clustering I have heard yet. Specifically, the distinction that RabbitMQ clustering is for high volume messaging, rather than HA concerns. Do you mind adding some of these points to the RabbitMQ clustering guide (http://www.rabbitmq.com/clustering.html)?


Steve

(In reply to Matthew Sackman <matthew@lshift.net>, Thu, Jan 7, 2010 at 2:38 AM.)

Guest
Posted: Fri Jan 08, 2010 2:32 pm
-- Sorry if you are getting this twice Matthew, it got bounced from the list

Matthew,

These are roughly the requirements for messaging:

- need to ensure there are no messages dropped even if a node in the cluster goes down (HA)
- easily scales
- zero configuration for brokers (nice to have, but Rabbit is pretty easy to configure)
- seamless fail over for clients when the node they are connected to fails (again, nice to have)
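
For the client fail over item, one possible shape is a connect helper that walks a list of nodes. This is only a sketch: `connect_with_failover` and `fake_connect` are hypothetical names, and `fake_connect` stands in for whatever connect call a real client library provides.

```python
def connect_with_failover(nodes, connect):
    """Try each node in order; return (node, connection) for the first that works."""
    errors = {}
    for node in nodes:
        try:
            return node, connect(node)
        except OSError as exc:      # ConnectionError and socket errors are OSErrors
            errors[node] = exc
    raise ConnectionError(f"no broker reachable: {sorted(errors)}")


# Stand-in for a real client library's connect call (hypothetical).
def fake_connect(node):
    if node == "rabbit@A":
        raise ConnectionRefusedError("node A is down")
    return f"connection to {node}"


node, conn = connect_with_failover(["rabbit@A", "rabbit@B"], fake_connect)
print(node, conn)
```

Reconnecting this way only restores the connection. Per Matthew's reply earlier in the thread, a queue that lives on the failed node is still unavailable and a declare for it returns a 404, so the no-dropped-messages requirement needs more than client-side retry.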

Along with those go the regular messaging requirements, point to point, pub/sub, performance, security, integrates well, etc.

As for the hardware/infrastructure being used for HA, I was targeting it for cloud deployment.

thanks again!

(In reply to Stephen Day <sjaday@gmail.com>, Thu, Jan 7, 2010 at 4:17 PM.)

All times are GMT