Erlang Mailing Lists

Ejabberd mailing list: Clustered servers not reconnecting

Guest
Posted: Sun Dec 10, 2006 3:27 am
Hi. I have two ejabberd 1.1.2 servers clustered together, set up following the
documentation I found on the website and in the manual. Things seem
to work fine for a while, but it seems that every night (possibly while they're
doing nightly maintenance) the servers disconnect from each other. The main
one logs this:

=ERROR REPORT==== 2006-12-05 00:16:45 ===
** Node ejabberd@server2 not responding **
** Removing (timedout) connection **

and the backup

=ERROR REPORT==== 2006-12-05 00:16:54 ===
** Node ejabberd@server1 not responding **
** Removing (timedout) connection **

While it's possible the VPN between them is having a hiccup at that
point, I know it's not going down for long because no other processes complain
at all. When I connect to the erl session on server2, it is able to ping
server1 without a problem (and vice versa). When I restart server2,
everything is fine again for a while, although I do get this message:

=ERROR REPORT==== 2006-12-05 10:29:30 ===
Mnesia(ejabberd@server1): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, ejabberd@server2}

So, is there a way I can find out what's causing the problem in the
first place?

And, is there a way I can get mnesia / ejabberd to continue to attempt
the connection until it reconnects?
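
To make the second question concrete, here is a minimal sketch, not anything ejabberd ships, of a helper that keeps pinging the peer with net_adm:ping/1 until it answers or a retry budget runs out. The module name reconnect_sketch, the function wait_for/3 and its parameters are hypothetical names chosen for illustration. Re-establishing the Erlang distribution link this way would not by itself merge Mnesia tables that diverged while the nodes were apart.

```erlang
-module(reconnect_sketch).
-export([wait_for/3]).

%% Hypothetical helper, not part of ejabberd or OTP.
%% Pings Peer up to MaxAttempts times, sleeping IntervalMs between tries.
%% Returns {ok, AttemptsUsed} once the peer answers,
%% or {error, unreachable} when the retry budget is exhausted.
wait_for(Peer, IntervalMs, MaxAttempts) ->
    wait_for(Peer, IntervalMs, MaxAttempts, 1).

wait_for(_Peer, _IntervalMs, MaxAttempts, Attempt) when Attempt > MaxAttempts ->
    {error, unreachable};
wait_for(Peer, IntervalMs, MaxAttempts, Attempt) ->
    %% net_adm:ping/1 returns pong if the node is reachable, pang otherwise.
    case net_adm:ping(Peer) of
        pong ->
            {ok, Attempt};
        pang ->
            timer:sleep(IntervalMs),
            wait_for(Peer, IntervalMs, MaxAttempts, Attempt + 1)
    end.
```

From the erl session on server2 this could be called as reconnect_sketch:wait_for('ejabberd@server1', 5000, 120), which would try every five seconds for about ten minutes.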

--
Matthew Harrell                 Nondeterminism means never
Bit Twiddlers, Inc.             having to say you are wrong.
mharrell@bittwiddlers.com

_______________________________________________
ejabberd mailing list
ejabberd@jabber.ru
http://lists.jabber.ru/mailman/listinfo/ejabberd
Post received from mailing list
Guest
Posted: Sun Dec 10, 2006 9:20 am
Hello Matthew,
Le 10 d
Guest
Posted: Sun Dec 10, 2006 12:51 pm
> They do that and this is automatic. However, if your two nodes are
> both kept online, receiving client connections the two database will
> evolves differently. The database are different and thus
> inconsistent. Mnesia does not know which version of the data it
> should keep, that's why the database wait for a manual operation.
>
> As a first place, I would not build an ejabberd cluster over a wan,
> but only over a lan to avoid those problems.
>
> I hope this helps,

Yeah, that helps. I really just set up the cluster because I thought it
would be useful and a good learning process. It's not critical in my
setup. I may fiddle with reducing the latency, or at least with determining
what's causing the high latency in the early morning.

When I get the message about the inconsistent database and partitioned network,
I assume it means that mnesia has detected that both nodes have run separately
for a while and can no longer be easily merged. Is there a process I can use
at this point to get them back together?
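
For reference, the condition in that message can also be caught programmatically. The following is a minimal sketch, assuming Mnesia is already running on the node: it subscribes to Mnesia system events with mnesia:subscribe(system) and logs every inconsistent_database event. The module name partition_watch and the functions start/0 and classify/1 are hypothetical names for illustration; the sketch only detects the partition and does not attempt to merge or repair anything.

```erlang
-module(partition_watch).
-export([start/0, classify/1]).

%% Hypothetical helper, not part of ejabberd or OTP.
%% Spawns a process that subscribes to Mnesia system events.
%% Assumes Mnesia is already started on this node; otherwise
%% the subscribe call returns an error and the process crashes.
start() ->
    spawn(fun() ->
                  {ok, _Node} = mnesia:subscribe(system),
                  watch()
          end).

%% Receive loop: log partition events, ignore everything else.
watch() ->
    receive
        Msg ->
            case classify(Msg) of
                {partitioned, Context, Node} ->
                    error_logger:warning_msg(
                      "Mnesia partition (~p) detected against ~p~n",
                      [Context, Node]);
                ignore ->
                    ok
            end,
            watch()
    end.

%% Pure function: picks inconsistent_database events out of the stream
%% of {mnesia_system_event, Event} messages.
classify({mnesia_system_event, {inconsistent_database, Context, Node}}) ->
    {partitioned, Context, Node};
classify(_Other) ->
    ignore.
```

With the event from the log above, classify/1 would return {partitioned, running_partitioned_network, 'ejabberd@server2'}. What to do next (for example, picking one node's copy as authoritative) remains a manual decision.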

--
Matthew Harrell                 Quantum Mechanics:
Bit Twiddlers, Inc.             The dreams stuff is made of.
mharrell@bittwiddlers.com