Ejabberd mailing list: server design questions

Guest
Posted: Fri Jul 27, 2007 6:12 pm
Hi,

I have a few questions about server design with ejabberd.

ejabberd appears to be the most qualified server to handle the type of
service I plan to deploy.

- We have one large email domain that consists of around 45K active
users. We also run a hosted email domain service (within the same
environment) for a few hundred domains, consisting of around 10K active
users total. Our plan is to offer a chat service as an option to all of
our domains.

- Authentication will go against our existing LDAP environment, using
the same uid/password as the email accounts. I plan to use an
authentication script to check for eligibility before verifying
the authentication information.

- I would like to install ejabberd as a cluster in Solaris 10 zones on
our existing email servers. I plan to use a layer 4 switch with a
virtual IP address for load balancing.

a) At what point should I consider using something other than Mnesia and
local disk for data storage? How does Mnesia compare to the other
database options? I like that Mnesia is able to cluster easily. We
have an Oracle cluster that I could use, but I don't necessarily want to
rely on it(*). I noticed that jabber.org is using Postgres...

b) If Mnesia is used, how much local disk should I allocate? I know
this is hard to predict without knowing how the system will be used.
What is the average space used per user? What else takes up significant
space... group chat logging?

c) Regarding disk I/O: is there a need for fibre channel storage? I
could hook up to our storage area network, but again, I don't
necessarily want to rely on it(*).

d) Is it possible to configure 2 virtual domains to both use Mnesia, but
use different disk paths for the data storage? This way I could have
most of the domains on the SAN, and one fall-back domain to use for SAN
outages.

e) Regarding clustering... at what point should I consider using
clustering? Does it raise any instability concerns? Given our
potential active user population (which should be smaller than
jabber.org) would I need to cluster more than 2 servers?

f) Does it matter (or improve performance and stability) if our load
balancing switch hashes client sessions to the same server every time?
Or can it do a round-robin or least-connections balancing algorithm?

(*) I'd like to keep this chat service as independent of our existing
infrastructure as possible since it will be the primary method of
communication for the administrators responsible for restoring downed
services.

Thanks in advance!


Post received from mailing list

Guest
Posted: Fri Jul 27, 2007 6:27 pm
A few peripheral points inline...

Jesse Thompson wrote:

> - We have one large email domain that consists of around 45K active
> users. We also run a hosted email domain service (within the same
> environment) for a few hundred domains, consisting of around 10K active
> users total. Our plan is to offer a chat service as an option to all of
> our domains.
>
> - Authentication will go against our existing LDAP environment, using
> the same uid/password as the email accounts. I plan to use an
> authentication script for to check for eligibility prior to verifying
> the authentication information.

Regarding usernames, there is a potential gotcha because the local part
of an email address is allowed to contain characters that are disallowed
in the node identifier of a Jabber ID. The most common example is the
single quote character.

So for example the following is a valid email address:

tim.o'brien@wisc.edu

However, that is not a valid JabberID because ' is disallowed.

This is why we have JID Escaping:

http://www.xmpp.org/extensions/xep-0106.html

So for example you would map that email address to the following JID:

tim.o\27brien@wisc.edu

I'm not sure if ejabberd needs to support XEP-0106 in order for you to
do the mapping, or whether you can do it with your own specialized script.
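
The mapping above can be sketched in Python. This is a minimal sketch of the XEP-0106 escaping table, assuming the mapping is done in your own script outside ejabberd. The helper name `escape_node` is hypothetical, and the sketch does not enforce the XEP's other rules (for example, the restriction on leading and trailing spaces).

```python
# Escape sequences defined by XEP-0106 (JID Escaping).
ESCAPES = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
# Two-character codes that would make a preceding backslash ambiguous.
CODES = {"20", "22", "26", "27", "2f", "3a", "3c", "3e", "40", "5c"}


def escape_node(local_part):
    """Escape an email local part for use as a JID node (hypothetical helper)."""
    out = []
    for i, ch in enumerate(local_part):
        if ch == "\\":
            # A backslash is escaped only when it would otherwise be
            # read as the start of an escape sequence.
            out.append("\\5c" if local_part[i + 1:i + 3] in CODES else ch)
        else:
            out.append(ESCAPES.get(ch, ch))
    return "".join(out)


print(escape_node("tim.o'brien") + "@wisc.edu")
```

Only the local part is escaped; the domain is appended unchanged.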

> a) At what point should I consider using something other than Mnesia and
> local disk for data storage? How does Mnesia compare to the other
> database options? I like that Mnesia is able to cluster easily. We
> have an Oracle cluster that I could use, but I don't necessarily want to
> rely on it(*). I noticed that jabber.org is using Postgres...

Correct, we are. However, that's probably because we had an existing
postgres install and didn't want to do another database migration.

> (*) I'd like to keep this chat service an independent of our existing
> infrastructure as possible since it will be the primary method of
> communication for the administrators responsible for restoring downed
> services.

Good idea. :)

Peter

--
Peter Saint-Andre
https://stpeter.im/



Guest
Posted: Fri Jul 27, 2007 6:38 pm
Peter Saint-Andre wrote:
> A few peripheral points inline...
>
> Jesse Thompson wrote:
>
>> - We have one large email domain that consists of around 45K active
>> users. We also run a hosted email domain service (within the same
>> environment) for a few hundred domains, consisting of around 10K active
>> users total. Our plan is to offer a chat service as an option to all of
>> our domains.
>>
>> - Authentication will go against our existing LDAP environment, using
>> the same uid/password as the email accounts. I plan to use an
>> authentication script for to check for eligibility prior to verifying
>> the authentication information.
>
> Regarding usernames, there is a potential gotcha because the local part
> of an email address is allowed to contain characters that are disallowed
> in the node identifier of a Jabber ID. The most common example is the
> single quote character.
>
> ...
>
> I'm not sure if ejabberd needs to support XEP-0106 in order for you to
> do the mapping, or whether you can do it with your own specialized script.

Thanks for the tip. I'll just have to disallow chat activation for
those uids in our hosted domains. I doubt that we have any special
characters in our main domain's uid namespace.


Guest
Posted: Mon Jul 30, 2007 5:35 pm
I didn't see much of a response to my questions.

What about limits on the number of virtual domains? Aside from
configuration complexity, are there any other concerns?

Jesse



Guest
Posted: Mon Jul 30, 2007 5:41 pm
On 7/30/07, Jesse Thompson <jesse.thompson@doit.wisc.edu> wrote:
> I didn't see much of a response to my questions.
>
> What about limits on the number of virtual domains? Aside from
> configuration complexity, are there any other concerns?

If you're using an SQL database, ejabberd keeps up to ten connections
open to the database server per virtual host. If you're using
Mnesia, you'll almost certainly run into scalability issues (Mnesia is
not very robust with huge data sets, and it has proved to behave badly
during server startup and during backup and restore).
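
To see what the per-host limit means for a deployment with many virtual domains, here is a rough upper-bound calculation in Python. It is only a sketch: the figure of 301 hosts is illustrative (standing in for "a few hundred domains" plus the main domain), the helper name is hypothetical, and the `nodes` multiplier assumes each cluster node keeps its own connections, which is not confirmed in this thread.

```python
# Per-host limit quoted above: up to ten SQL connections per virtual host.
CONNECTIONS_PER_VHOST = 10


def max_db_connections(virtual_hosts, nodes=1):
    """Upper bound on open SQL connections (hypothetical helper)."""
    return virtual_hosts * CONNECTIONS_PER_VHOST * nodes


# Illustrative figure: 300 hosted domains plus one main domain.
print(max_db_connections(301))           # single node
print(max_db_connections(301, nodes=2))  # assumed: two nodes, separate connections
```

The database server's own connection limit would need to sit above this bound.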

Cheers!
--
Sergei Golovan
_______________________________________________
ejabberd mailing list
ejabberd@jabber.ru
http://lists.jabber.ru/mailman/listinfo/ejabberd

Guest
Posted: Mon Jul 30, 2007 5:57 pm
Sergei Golovan wrote:
> On 7/30/07, Jesse Thompson <jesse.thompson@doit.wisc.edu> wrote:
>> I didn't see much of a response to my questions.
>>
>> What about limits on the number of virtual domains? Aside from
>> configuration complexity, are there any other concerns?
>
> If you're using SQL database ejabberd keeps opened up to ten
> connections to a database server per virtual host. If you're using
> Mnesia you'll almost certainly get scalability issues (Mnesia is not
> very robust with huge data, and it is proved to behave badly during
> server startup and during backup/restore from backup).

That's very good to know! Is there a way to adjust the number of
database connections it keeps open? Are some databases better than
others at scaling with many ejabberd virtual domains?


Guest
Posted: Mon Jul 30, 2007 6:37 pm
Sergei,
How do you think Mnesia will perform with over 5 million user records
in it, each user with 100 buddies on their roster?
Would it even scale to that? If so, will it still be practical to
manage?
If not, what would you choose for storage for a 5M+ user server?

Thanks
Jorge Guntanis

Guest
Posted: Mon Jul 30, 2007 6:48 pm
On 7/30/07, Jorge Guntanis <jorge.guntanis@telcentris.com> wrote:
> Sergei,
> How do you think mnesia will perform with over 5 million user records
> on it, each user with 100 buddies on their roster?
> Would it even scale to that? If so will it still be practical to
> manage?

Well, during a Mnesia backup the Erlang VM needs an amount of memory
approximately equal to the Mnesia DB size.

> If not, what would you choose for storage for a +5M user server?

I don't have any experience deploying such huge servers, so I
don't have a suggestion... I know about a test of ejabberd with a
PostgreSQL backend at 200,000 concurrent connections on a two-node
cluster. It worked, but that was a rather short-term test, and I
don't know how the memory usage evolves over time.

BTW, jabber.org runs on a single 32-bit Erlang node (with a 2 GB heap
limit) and runs out of memory at about 10,000-12,000 concurrent
connections.
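
Those figures allow a back-of-the-envelope estimate of memory per connection. The Python sketch below assumes the 2 GB limit means 2 GiB, that the whole heap is attributable to client connections, and that usage scales linearly. None of this is established in the thread, so treat the result as a rough ceiling, not a measurement. The 55,000 figure is a hypothetical worst case with every user from the original post online at once.

```python
GIB = 1024 ** 3
HEAP_LIMIT_BYTES = 2 * GIB  # assumed reading of the 2 GB heap limit


def bytes_per_connection(connections, heap_bytes=HEAP_LIMIT_BYTES):
    """Average heap per connection at the point the node fills up."""
    return heap_bytes / connections


def heap_needed(connections, per_connection_bytes):
    """Heap needed for a connection count at that average (linear model)."""
    return connections * per_connection_bytes


for n in (10_000, 12_000):
    kib = bytes_per_connection(n) / 1024
    print(f"{n} connections -> about {kib:.0f} KiB each")

# Hypothetical worst case: 45K + 10K users all connected at once.
worst = heap_needed(55_000, bytes_per_connection(10_000))
print(f"55000 connections -> about {worst / GIB:.1f} GiB")
```

Under these assumptions a single 32-bit node could not hold such a load, which is one argument for a 64-bit node or for spreading connections across a cluster.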

--
Sergei Golovan

Guest
Posted: Mon Jul 30, 2007 6:52 pm
Sergei Golovan wrote:
> BTW, jabber.org uses one 32-bit erlang node (with 2Gb heap boundary)
> and suffers from memory overfulls on about 10000-12000 concurrent
> connections.
>

I thought jabber.org uses postgresql.



Guest
Posted: Mon Jul 30, 2007 6:54 pm
Sergei Golovan wrote:
> On 7/30/07, Jesse Thompson <jesse.thompson@doit.wisc.edu> wrote:
>> I didn't see much of a response to my questions.
>>
>> What about limits on the number of virtual domains? Aside from
>> configuration complexity, are there any other concerns?
>
> If you're using SQL database ejabberd keeps opened up to ten
> connections to a database server per virtual host. If you're using
> Mnesia you'll almost certainly get scalability issues (Mnesia is not
> very robust with huge data, and it is proved to behave badly during
> server startup and during backup/restore from backup).
>
> Cheers!

So, assuming Mnesia is not an option, are there any issues with
using Oracle (assuming it can deal with 10 connections/domain and
supports ODBC)?


Guest
Posted: Mon Jul 30, 2007 6:57 pm
On 7/30/07, Jesse Thompson <jesse.thompson@doit.wisc.edu> wrote:
> Sergei Golovan wrote:
> > BTW, jabber.org uses one 32-bit erlang node (with 2Gb heap boundary)
> > and suffers from memory overfulls on about 10000-12000 concurrent
> > connections.
> >
>
> I thought jabber.org uses postgresql.

It does. But ejabberd likes memory very much even when using an SQL
database backend (in fact its memory consumption remains about the
same).

--
Sergei Golovan

Guest
Posted: Mon Jul 30, 2007 7:13 pm
Jesse Thompson said the following on 07/30/2007 02:51 PM:
> Sergei Golovan wrote:
>> BTW, jabber.org uses one 32-bit erlang node (with 2Gb heap boundary)
>> and suffers from memory overfulls on about 10000-12000 concurrent
>> connections.
>>
>
> I thought jabber.org uses postgresql.
>

Yes, we (jabber.org) do use PostgreSQL. The seg faults are caused by a large vCard.

-Jonathan


Guest
Posted: Mon Jul 30, 2007 7:42 pm
Jonathan,
If you're able to share: how many boxes are you running jabber.org on?
Does it have any special architecture? Do you separate Jabber services
between boxes?

Thanks much.

Jorge Guntanis

Guest
Posted: Mon Jul 30, 2007 8:17 pm
jsiegle wrote:
> Jesse Thompson said the following on 07/30/2007 02:51 PM:
>> Sergei Golovan wrote:
>>> BTW, jabber.org uses one 32-bit erlang node (with 2Gb heap boundary)
>>> and suffers from memory overfulls on about 10000-12000 concurrent
>>> connections.
>>>
>>
>> I thought jabber.org uses postgresql.
>>
>
> Yes we(jabber.org) do use postgresql. The seg faults are because of a
> large vcard.
> -Jonathan

So, would it help alleviate this specific type of problem to spread the
traffic across a 2 (or more) node cluster?


Guest
Posted: Tue Jul 31, 2007 5:27 pm
jsiegle wrote:
> Jesse Thompson said the following on 07/30/2007 02:51 PM:
>> Sergei Golovan wrote:
>>> BTW, jabber.org uses one 32-bit erlang node (with 2Gb heap boundary)
>>> and suffers from memory overfulls on about 10000-12000 concurrent
>>> connections.
>>>
>>
>> I thought jabber.org uses postgresql.
>>
>
> Yes we(jabber.org) do use postgresql. The seg faults are because of a
> large vcard.
> -Jonathan

Would these problems exist if mod_vcard_ldap were used instead? We have
an existing LDAP white pages service. Ideally, I'd like to query it
directly when possible.

