| Author |
Message |
< Yaws mailing list ~ required steps for clustering yaws ? |
| Guest |
Posted: Mon Jul 17, 2006 8:06 pm |
|
|
|
Guest
|
I'm not sure about the load balancing, but it sounds like if you want
session state to be shared between all Yaws instances, you should put
this state in a Mnesia table that can be accessed from all instances.
Maybe somebody else has a better idea...
On 7/15/06, Roberto Saccon <rsaccon@gmail.com> wrote:
> I didn't find any info in the Yaws docs about running multiple Yaws
> instances behind a load balancer in a server farm.
>
> For Mnesia I think nothing special needs to be done in Yaws, just
> creating the tables as replicated is enough, but please correct me if
> that is wrong.
>
> But what about the Yaws HTTP sessions, which are kept in an ETS table?
> Can or should that data be replicated among cluster nodes? If
> yes, how? If no, then probably a load balancer with sticky sessions
> such as http://haproxy.1wt.eu/ (injecting info about the selected
> cluster node into the HTTP header) should be used. Or does anybody have
> a better suggestion?
>
> regards
> --
> Roberto Saccon
_______________________________________________
Erlyaws-list mailing list
Erlyaws-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/erlyaws-list
Post received from mailing list |
| tobbe |
Posted: Mon Jul 17, 2006 8:41 pm |
|
|
|
User
Joined: 19 Jan 2005
Posts: 274
Location: Stockholm, Sweden
|
Idea:
Store in your cookie the IP address of the interface through which
the client first entered your system, then check it and do an
HTTP redirect if necessary, etc...
Cheers, Tobbe
| Guest |
Posted: Tue Jul 18, 2006 12:32 pm |
|
|
|
Guest
|
I have just started looking into this problem myself.
From a chat with a Java-expert friend (yes, Erlang fans are allowed to keep some of those around), it would seem that Java application server clusters commonly replicate session IDs and objects between them. So Yariv's suggestion would probably be the closest method. Using a RAM-only Mnesia table would be fastest, I guess.
| Guest |
Posted: Tue Jul 18, 2006 1:42 pm |
|
|
|
Guest
|
After thinking about it more, I came up with the following two options:
- Have a session-aware load balancer that always sends a client to the
same web server. This lets you avoid storing session data in Mnesia,
but it's also more failure-prone, because if the web server goes down,
the session data is lost. Not exactly ideal.
- Use a simple load balancer that round-robins requests between web
servers, and keep all session data in a replicated Mnesia table. This
is a simpler setup in a sense and it's more resilient too. The
downside is that the replication may slow things down, but everything
has a price.
Yariv
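The second option could look roughly like the sketch below. It assumes Mnesia is already started on every Yaws node and that the nodes share a schema. The module name session_store, the session record, and the store/2 and lookup/1 helpers are made up for illustration and are not part of Yaws; only the mnesia calls are real.

```erlang
-module(session_store).
-export([init/1, store/2, lookup/1]).

%% Hypothetical record: one row per session, keyed by session ID.
-record(session, {sid, data = []}).

%% Create a RAM-only table with a replica on every node in Nodes.
%% Returns {atomic, ok} or {aborted, Reason}.
init(Nodes) ->
    mnesia:create_table(session,
                        [{ram_copies, Nodes},
                         {attributes, record_info(fields, session)}]).

%% Write (or overwrite) the data for one session inside a transaction,
%% so that every replica sees the same result.
store(Sid, Data) ->
    mnesia:transaction(
      fun() -> mnesia:write(#session{sid = Sid, data = Data}) end).

%% Read the data for one session from whichever replica is local.
lookup(Sid) ->
    case mnesia:transaction(fun() -> mnesia:read(session, Sid) end) of
        {atomic, [#session{data = Data}]} -> {ok, Data};
        {atomic, []} -> not_found
    end.
```

On a single test node, session_store:init([node()]) is enough to try it out; in a farm, Nodes would be the list of nodes that should hold a replica.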
| Guest |
Posted: Tue Jul 18, 2006 3:21 pm |
|
|
|
Guest
|
| Guest |
Posted: Tue Jul 18, 2006 3:37 pm |
|
|
|
Guest
|
Eddie (http://eddie.sourceforge.net/what.html) might be what you are looking for.
t
| Guest |
Posted: Wed Jul 19, 2006 5:07 am |
|
|
|
Guest
|
Interesting, I am trying to get Eddie running but it needs some
surgery to compile cleanly on gcc 4. I'm out of hacking time for today
but if anybody else is interested, let's put together an IRC hackathon
on Saturday to fix this.
Steve
| Guest |
Posted: Wed Jul 19, 2006 3:06 pm |
|
|
|
Guest
|
[snip]
> > - Use a simple load balancer that round robins requests between web
> > servers, and keep all session data in a replicated Mnesia table. This
> > is a simpler setup in a sense and it's more resilient too. The
> > downside is that the replication may slow things down, but everything
> > has a price
This is the option that I'm currently exploring, as it seems to be more generic in nature.
However, there is the problem of expired sessions, specifically the case where a user started a session but for some reason or other never got around to "closing" it.
I think a per-node Session Manager process would be necessary for this: the connection process (at the start of out()) would notify it that it is using the session, and the manager would handle locking of the session object. To complete the picture, session-scrubbing processes should run periodically and remove sessions that have expired. Distributing this might be done with another table (ordered set) whose key is the expiration time.
Thoughts?
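The ordered-set idea can be sketched locally as below. This is only an illustration under assumptions: a plain ETS table stands in for the replicated Mnesia table, and the module and function names (session_scrub, new/0, add/3, purge/2) are hypothetical. Because the key is {Expiry, Sid}, the oldest entries sort first and a scrubber can stop at the first entry that has not expired yet.

```erlang
-module(session_scrub).
-export([new/0, add/3, purge/2]).

%% ETS stands in here for a replicated ordered_set table.
new() ->
    ets:new(expiration_times, [ordered_set, public]).

%% Record that Sid expires at Expiry (any comparable term, e.g. seconds).
add(Tab, Sid, Expiry) ->
    ets:insert(Tab, {{Expiry, Sid}}).

%% Delete every entry with Expiry =< Now and return the expired SIDs,
%% oldest first. The walk stops at the first entry that is still valid.
purge(Tab, Now) ->
    purge(Tab, Now, ets:first(Tab), []).

purge(Tab, Now, {Expiry, Sid} = Key, Acc) when Expiry =< Now ->
    Next = ets:next(Tab, Key),
    ets:delete(Tab, Key),
    purge(Tab, Now, Next, [Sid | Acc]);
purge(_Tab, _Now, _KeyOrEnd, Acc) ->
    %% Reached '$end_of_table' or an entry that has not expired yet.
    lists:reverse(Acc).
```

Refreshing a session means deleting its old {Expiry, Sid} key and inserting a new one, so something else has to remember each session's current expiry time.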
| Guest |
Posted: Wed Jul 19, 2006 4:28 pm |
|
|
|
Guest
|
Alex, I think you're thinking on the right track. Yaws currently
automatically expires sessions that are stored in its internal ets
table. The code is rather simple, in yaws_session_server.erl. However,
this mechanism is too simplistic for replicated, distributed session
data, because you don't want all Yaws instances to be responsible for
expiring sessions. One instance is enough.
A better approach, similar to what you suggested, would be to create a
"super" Yaws supervisor that supervises the whole Yaws farm. This
supervisor would also be responsible for creating the Mnesia session
store and for purging expired sessions.
This supervisor could also provide a simple API for Yaws to update
and query session state, or it could be done directly with Mnesia
functions.
Session state could be stored on the same nodes that run Yaws or on
different nodes altogether. This can be configured depending on the
application's needs (similar to a Mnesia schema).
What do you think?
Yariv
| noss |
Posted: Thu Jul 20, 2006 7:54 am |
|
|
|
User
Joined: 09 Oct 2005
Posts: 290
|
On 7/19/06, Yariv Sadan <yarivvv@gmail.com> wrote:
> A better approach, similar to what you suggested, would be to create a
> "super" Yaws supervisor that supervises the whole Yaws farm. This
> supervisor will also be responsible for creating the Mnesia session
> store and for purging expired sessions.
Sounds centralised and vulnerable to single-machine failures.
Having each node scanning for sessions created by itself frequently
doesn't sound that way, though.
Though, as always, the best solution probably depends on the number of
sessions and how much data each session carries, and how load
balancing over the cluster is implemented.
| Guest |
Posted: Thu Jul 20, 2006 12:56 pm |
|
|
|
Guest
|
>
> Sounds centralised and vulnerable to single-machine failures.
>
Yes, that's true. Is there an easy mechanism to have another node take
over when the master goes down?
This is actually a more generic question: what is the "standard" way
of implementing supervision trees in a multi-master network? How is it
done in Mnesia?
> Having each node scanning for sessions created by itself frequently
> doesn't sound that way, though.
I think this may create problems, too. What if a node goes offline for
a long time? There needs to be a mechanism for another node to take
ownership over its sessions. (This is probably an edge case, but it's
good to think about it.)
Also, the storage cost goes up, because the creator ID needs to be
stored in the DB and indexed. This sounds cheap, but if the sessions
are stored in a RAM table it might affect performance by consuming
space that could be used for DB row caching.
>
> Though, as always, the best solution probably depends on the number of
> sessions and how much data each session carries, and how load
> balancing over the cluster is implemented.
>
I think there are probably one or two approaches that would work well
for a large number of "simple" cases (2-4 servers). It would be nice
to have them implemented in Yaws out of the box.
Yariv
| Guest |
Posted: Thu Jul 20, 2006 2:30 pm |
|
|
|
Guest
|
[Disclaimer: I wrote this in two sittings, so apologies for typos/discrepancies.]
You're right in that one process for purging expired sessions is enough - no need for extra collisions.
Using a supervisor would become a single point of failure... and a bottleneck.
Here's an alteration to your suggestion, then. Note that I haven't used Mnesia in a distributed environment yet (so I rely on your experience in this matter), and am assuming that gen_leader (from jungerl) works well. If it doesn't, then implementing something close or "good enough" (where we might get several concurrent active leaders occasionally) should be fine.
First of all, to simplify matters I am also going to make the following assumptions:
1) We keep all session tables replicated among the Yaws servers, i.e., Yaws server = session server.
2) All session state/vars etc. are kept in ram_copies. Persistent state might have gotchas I'm not aware of. In short, our cluster runs "forever", since it's so very very robust.
3) Our session IDs (SIDs) are passed via the URL. No cookies.
4) Generated SIDs are never repeated, and the SID space is unique per generating node (use node name + date + some randomized something etc.). If you know a bit of magic to do this, please share.
Now, to complicate matters once more:
1) I'm assuming semantics where each connection process uses an API for session access in which the entire session data/object is read into the process (acquisition) and then atomically written back to the store when it is done. Looking at some servlet documentation, this seems like a valid approach (completely locking the session data also introduces other complications, essentially distributed lock management, which might be complex in this specific case since the session-reaping process(es) would also access the same structures). Of course, in the EJB universe practically EVERYTHING is open to interpretation, but I made my research more rigorous by reading through some independent magazine ARTICLES.
2) A process group (via gen_leader) is created for purging/reaping expired sessions.
Each member of this group will be a registered per-node process, whose lifetime is longer than that of Yaws. This means that whenever a reaper dies, its siblings can assume that the entire node has gone down.
3) Loosely, we define the following data structures (some refinement might be in order):
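%%=============================================================================
%% Replicated tables.
%%=============================================================================
%%% session_lifetime (set): Also used for quick session lookup.
%% K: sid().
%% V: time() | list(reaper_pid() - one per attached connection).
%%% reaper_session (bag):
%% K: reaper_pid().
%% V: sid().
%%% expiration_times (ordered_set):
%% K: {time(),sid()}.
%%% session_data (set):
%% K: sid().
%% V: proplist() of session variables.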
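As a rough illustration, the four replicated tables named in this thread (session_lifetime, reaper_session, expiration_times, session_data) might be declared in Mnesia as below. The record field names and the module name session_tables are assumptions made for this sketch, and the split of the session_lifetime value into an expiry field and a reapers field is one possible reading of the comments. The dummy field in expiration_times is there because a Mnesia table needs at least one attribute besides the key.

```erlang
-module(session_tables).
-export([create/1]).

%% Hypothetical field names, chosen to mirror the K/V comments.
-record(session_lifetime, {sid, expiry, reapers = []}).
-record(reaper_session,   {reaper, sid}).
-record(expiration_times, {key, unused = []}).  % key = {Time, Sid}
-record(session_data,     {sid, vars = []}).

%% Create all four tables as RAM-only replicas on Nodes.
%% Returns a list of {TableName, Result} pairs, where Result is
%% {atomic, ok} or {aborted, Reason}.
create(Nodes) ->
    Specs = [{session_lifetime, set,
              record_info(fields, session_lifetime)},
             {reaper_session, bag,
              record_info(fields, reaper_session)},
             {expiration_times, ordered_set,
              record_info(fields, expiration_times)},
             {session_data, set,
              record_info(fields, session_data)}],
    [{Name, mnesia:create_table(Name, [{ram_copies, Nodes},
                                       {type, Type},
                                       {attributes, Fields}])}
     || {Name, Type, Fields} <- Specs].
```

One thing to keep in mind if persistence is added later: Mnesia does not support ordered_set tables as disc_only_copies, so expiration_times would have to stay in ram_copies or disc_copies.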
| Guest |
Posted: Fri Jul 21, 2006 7:46 pm |
|
|
|
Guest
|
On 7/20/06, Alex Arnon <alex.arnon@gmail.com> wrote:
> [Disclaimer: I wrote this in two sittings, so apologies for
> typos/discrepancies.]
>
> You're right in that one process for purging expired sessions is enough - no
> need for extra collisions
> Using a supervisor would become a single point of failure... and a
> bottleneck.
I agree.
> Here's an alteration to your suggestion, then. Note that I haven't used
> mnesia in a distributed environment yet (so I rely on your experience in
> this matter),
Neither have I.
> and am assuming that gen_leader (from jungerl) works well. If
> it doesn't, then implementing something close or "good enough" (where we
> might get several concurrent active leaders occasionally) should be fine.
>
> First of all, to simplify matters I am going to also make the following
> assumptions:
> 1) We keep all session tables replicated among the yaws servers. I.e, yaws
> server = session server.
Maybe the number of session servers should be a parameter. Let's say
you have 6 yaws servers, it might be sufficient to have only 3 of them
act as session servers to spare the extra replication cost. The
downside is that the 3 non-session servers have higher latency when
requesting session data. I'm not sure what the best tradeoff is in a
production environment.
> 2) All session state/vars etc. are kept in ram_copies. Persistent state
> might have gotchas I'm not aware of. In short, our cluster runs "forever",
> since it's so very very robust.
disc_copies have the nice advantage that in case of a crash, the whole
table doesn't need to be copied over to the crashed node when it's
restarted. I'm not sure we should dismiss it right away. Maybe it
should be ram_copies by default, with disc_copies or even
disc_only_copies as options for the user. It shouldn't break in the
latter cases though.
> 3) Our session IDs (SID) are passed via the URL. No cookies.
This means that you need automatic generation of forms, because
developers shouldn't manually embed this SID in each form as a hidden
parameter. Why not use cookies? It seems simpler to me.
> 4) Generated SIDs are never repeated, and the SID space is unique per
> generating node (use node name + date + some randomized something etc.). If
> you know a bit of magic to do this, please share.
I think your algorithm will work well enough here.
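A minimal sketch of the "node name + date + some randomized something" recipe follows. The module name sid_gen and the exact layout of the string are assumptions, and the calls used (erlang:system_time/1, erlang:unique_integer/1, crypto:strong_rand_bytes/1) come from current OTP releases rather than the ones available when this thread was written.

```erlang
-module(sid_gen).
-export([new/0]).

%% Build a session ID string from four parts:
%%   - the node name, so IDs from different nodes cannot collide
%%   - a microsecond timestamp
%%   - a counter that is unique within this runtime instance
%%   - 16 random bytes (hex-encoded), so IDs are hard to guess
new() ->
    Time = erlang:system_time(microsecond),
    Unique = erlang:unique_integer([positive]),
    Rand = crypto:strong_rand_bytes(16),
    Hex = lists:flatten([io_lib:format("~2.16.0b", [B]) || <<B>> <= Rand]),
    lists:flatten(io_lib:format("~s-~b-~b-~s", [node(), Time, Unique, Hex])).
```

The node name contains an "@", so the result would need URL-encoding if it is passed in the URL as proposed; hashing the whole string would also avoid exposing node names to clients.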
>
> Now, to complicate matters once more:
> 1) I'm assuming a semantics where each connection process will use an api
> for session access where the entire session data/object is read into the
> process (acquisition) then atomically written back to store when it is done.
Yes, I do like having a "session" api in Yaws, so the users of
sessions don't have to know or worry about how sessions are
implemented and where they are stored.
> Looking at some servlet documentation this seems like a valid approach
> (completely locking the session data also introduces other complications,
> essentially distributed lock management which might be complex in this
> specific case since session reaping process(es) would also access the same
> structures). Of course, in the EJB universe practically
> EVERYTHING is open to interpretation, but I made my research more rigorous
> by reading through some independent magazine ARTICLES.
I think Mnesia nicely takes care of the concurrency and locking
issues. Am I missing something?
> 2) A process group (via gen_leader) is created for purging/reaping expired
> sessions.
> Each member of this group will be a registered per-node process, whose
> lifetime is longer than that of yaws. This means that whenever a reaper
> dies, its siblings can assume that the entire node has gone down.
I need to read up about gen_leader, but this sounds like a promising approach.
> 3) Loosely, we define the following data structures (some refinement might
> be in order):
>
> %%=============================================================================
> %% Replicated tables.
> %%=============================================================================
>
> %%% session_lifetime (set): Also used for quick session lookup.
> %% K: sid().
> %% V: time() | list(reaper_pid() - one per attached connection).
>
> %%% reaper_session (bag):
> %% K: reaper_pid().
> %% V: sid().
>
> %%% expiration_times (ordered_set):
> %% K: {time(),sid()}.
>
> %%% session_data (set):
> %% K: sid().
> %% V: proplist() of session variables.
>
> %%=============================================================================
> %% Reaper internal.
> %%=============================================================================
>
> %%% connections (set):
> %% K: conn_pid().
> %% V: list(sid()).
>
> %%=============================================================================
> %% Connection internal.
> %%=============================================================================
>
> %%% PD entries:
> %% sessions: list(sid()).
> %% {session, sid()}: dict() or proplist() containing the session variables.
>
>
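
A rough sketch of how the four replicated tables above might be created in Mnesia, with the storage type and the list of session-server nodes as parameters. The module name, the attribute names and the dummy `unused` column are assumptions made for illustration (Mnesia requires at least one attribute besides the key). Mnesia is assumed to be running on all listed nodes, and disc_copies additionally needs a disc-based schema.

```erlang
-module(session_tables_sketch).
-export([create/2]).

%% StorageType: ram_copies | disc_copies.
%% SessionNodes: the nodes that hold a replica (the "session servers").
%% Returns one {atomic, ok} | {aborted, Reason} per table.
create(StorageType, SessionNodes) ->
    Tables = [{session_lifetime, set,         [sid, value]},
              {reaper_session,   bag,         [reaper, sid]},
              {expiration_times, ordered_set, [key, unused]},
              {session_data,     set,         [sid, vars]}],
    [create_one(Name, Type, Attrs, StorageType, SessionNodes)
     || {Name, Type, Attrs} <- Tables].

create_one(Name, Type, Attrs, StorageType, Nodes) ->
    mnesia:create_table(Name, [{type, Type},
                               {attributes, Attrs},
                               {StorageType, Nodes}]).
```

One caveat for the disc_only_copies idea: Mnesia does not support ordered_set tables with disc_only_copies, so expiration_times would have to stay in ram_copies or disc_copies.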
> Several things to note here:
> - Each connection process keeps track of attached sessions via its process
> dictionary.
> - Each reaper knows which processes are attached to which sessions in its
> node.
> - A session is in use when its session_lifetime table contains a pid list
> and not a time() value.
> - Each reaper is implicitly aware (via gen_leader and its configuration) of
> the set of active and defunct reapers in the cluster.
>
> Outline of operational logic:
> - When a connection process wishes to acquire/attach a session, it first
> sends a notification to its local reaper, then performs a transaction which
> adds itself to the appropriate tables and retrieves the session data
> (writing it to its PD). Upon receipt of this message, the reaper starts
> monitoring the connection process. Note that the reaper may have the process
> written down for several sessions.
> - Release/writeback of a session is done in reverse. When the last
> connection process detaches from the session data and the session should
> live on, it sets the session's expiry time, otherwise it erases the session.
> - If a connection process exits without properly detaching from its
> sessions, the reaper will be notified via a monitor message, and can do the
> detachment bit itself (or spawn a process to do it, so it maintains good
> response).
> - If a reaper process goes down, the lead reaper (the Grim one, with the
> scythe) will go over all sessions marked by that reaper and perform the
> appropriate detachment operations.
> - To take care of the case of a double failure or worse, where both a
> non-leader and a leader go down, the lead reaper can perform periodic
> cleanup as it knows which reapers are active and which are not.
>
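
The per-node reaper's bookkeeping in the outline above could look roughly like this. It is only a sketch: it uses a plain process instead of gen_leader, and the module name, message formats and the Detach callback (which would do the actual table updates) are all hypothetical.

```erlang
-module(reaper_sketch).
-export([start/1, attach/3, detach/3]).

%% Detach is a hypothetical callback fun(Sid, ConnPid) that performs the
%% table updates for a connection that died without detaching.
start(Detach) ->
    spawn(fun() -> loop(#{}, Detach) end).

%% Sent by a connection process before it runs its attach transaction.
attach(Reaper, ConnPid, Sid) -> Reaper ! {attach, ConnPid, Sid}, ok.

%% Sent by a connection process after it has written the session back.
detach(Reaper, ConnPid, Sid) -> Reaper ! {detach, ConnPid, Sid}, ok.

%% Conns maps ConnPid => {MonitorRef, [Sid]}.
loop(Conns, Detach) ->
    receive
        {attach, Pid, Sid} ->
            NewConns =
                case Conns of
                    #{Pid := {Ref, Sids}} ->
                        Conns#{Pid := {Ref, [Sid | Sids]}};
                    _ ->
                        NewRef = erlang:monitor(process, Pid),
                        Conns#{Pid => {NewRef, [Sid]}}
                end,
            loop(NewConns, Detach);
        {detach, Pid, Sid} ->
            loop(forget(Pid, Sid, Conns), Detach);
        {'DOWN', Ref, process, Pid, _Reason} ->
            %% The connection exited; detach whatever it still held.
            case maps:take(Pid, Conns) of
                {{Ref, Sids}, Rest} ->
                    lists:foreach(fun(Sid) -> Detach(Sid, Pid) end, Sids),
                    loop(Rest, Detach);
                _ ->
                    loop(Conns, Detach)
            end
    end.

%% Stop tracking one session; drop the monitor with the last one.
forget(Pid, Sid, Conns) ->
    case Conns of
        #{Pid := {Ref, Sids}} ->
            case lists:delete(Sid, Sids) of
                [] ->
                    erlang:demonitor(Ref, [flush]),
                    maps:remove(Pid, Conns);
                Rest ->
                    Conns#{Pid := {Ref, Rest}}
            end;
        _ ->
            Conns
    end.
```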
> The idea was to let the connection processes do their own work as much as
> possible and avoid bottlenecks such as going through a global or even
> per-node process for accessing/manipulating the sessions. Reaper processes
> must access mnesia as little as possible to achieve that, I think.
Maybe I missed a couple of details as I read through this description,
but why is it better to have one reaper per node rather than one
global reaper in the leader that occasionally queries the Mnesia table
for sessions that have been idle for more than X minutes, and then purges
them? Then the problem is reduced to always picking one leader, even
in the face of node failure.
Am I missing something?
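
To make the single-reaper idea concrete, here is a rough sketch of the periodic purge, assuming the table layout quoted above (expiration_times is an ordered_set keyed by {Time, Sid}) and that Time values are comparable with a plain =< (for example integer seconds). The module name is hypothetical, and the sketch assumes a session only has an expiration_times entry while no connection is attached to it.

```erlang
-module(purge_sketch).
-export([purge_expired/1]).

%% Deletes every session whose expiry time is =< Now.
%% Returns {atomic, NumberPurged} or {aborted, Reason}.
purge_expired(Now) ->
    mnesia:transaction(
      fun() ->
              Keys = collect(mnesia:first(expiration_times), Now, []),
              lists:foreach(fun purge_one/1, Keys),
              length(Keys)
      end).

%% Walk the ordered_set from the oldest entry and stop at the first
%% key that has not expired yet (or at '$end_of_table').
collect({Time, _Sid} = Key, Now, Acc) when Time =< Now ->
    collect(mnesia:next(expiration_times, Key), Now, [Key | Acc]);
collect(_KeyOrEnd, _Now, Acc) ->
    Acc.

purge_one({_Time, Sid} = Key) ->
    ok = mnesia:delete({expiration_times, Key}),
    ok = mnesia:delete({session_lifetime, Sid}),
    ok = mnesia:delete({session_data, Sid}).
```

Because the keys are ordered by time, the scan only touches expired entries plus one, so the cost of each run depends on how many sessions expired rather than on the table size.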
Best
Yariv
| Guest |
Posted: Fri Jul 21, 2006 8:15 pm |
|
|
|
Guest
|
Yariv Sadan wrote:
>>Sounds centralised and vulnerable to single-machine failures.
>>
>>
>>
>
>Yes, that's true. Is there an easy mechanism to have another node take
>over when the master goes down?
>
>This is actually a more generic question: what is the "standard" way
>of implementing supervision trees in a multi-master network? How is it
>done in Mnesia?
>
>
>
It is non-trivial. Mnesia relies on Erlang's built-in "nodedown" discovery,
which uses heartbeats, timeouts and whatnot.
In the case of websites, we almost always want the master to
also "take over" an IP address - typically the IP defined
for the site we are running. This is also non-trivial and almost always
requires gratuitous ARP to let the first upstream L2 switch learn
that the IP has moved. (I enclose a file I wrote some time ago
which deals with this.)
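
For the "nodedown" part only, a small sketch of subscribing to node status messages with net_kernel:monitor_nodes/1. This is not the enclosed vip.erl and does nothing about IP takeover or ARP; the module name and the OnDown callback are hypothetical.

```erlang
-module(nodewatch_sketch).
-export([start/1]).

%% OnDown is a hypothetical callback fun(Node), e.g. "take over the
%% sessions (or the IP address) that Node was responsible for".
start(OnDown) ->
    spawn(fun() ->
                  ok = net_kernel:monitor_nodes(true),
                  loop(OnDown)
          end).

loop(OnDown) ->
    receive
        {nodedown, Node} ->
            OnDown(Node),
            loop(OnDown);
        {nodeup, _Node} ->
            loop(OnDown)
    end.
```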
>>Having each node scanning for sessions created by itself frequently
>>doesn't sound that tough.
>>
>>
>
>I think this may create problems, too. What if a node goes offline for
>a long time? There needs to be a mechanism for another node to take
>ownership over its sessions. (This is probably an edge case, but it's
>good to think about it.)
>
>
To have multinode persistent sessions, all session data must be stored on
at least 2 nodes. Right. One trick is to embed the nodename or some id
in the cookie which makes it possible to easily map sessions to nodes.
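
A tiny sketch of that cookie trick, assuming a "Node|Sid" layout for the cookie value (the separator and the module name are made up for illustration). decode/1 returns the node name as a string rather than an atom, so that untrusted cookie data never creates new atoms.

```erlang
-module(cookie_node_sketch).
-export([encode/2, decode/1]).

%% Builds a cookie value of the form "Node|Sid".
encode(Node, Sid) when is_atom(Node), is_list(Sid) ->
    atom_to_list(Node) ++ "|" ++ Sid.

%% Splits at the first "|"; returns {ok, NodeString, Sid} or error.
decode(Value) ->
    case string:split(Value, "|") of
        [NodeStr, Sid] when NodeStr =/= [], Sid =/= [] ->
            {ok, NodeStr, Sid};
        _ ->
            error
    end.
```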
>Also, the storage cost goes up, because the creator ID needs to be
>stored in the DB and indexed. This sounds cheap, but if the sessions
>are stored in a RAM table it might affect performance by consuming
>space that could be used for DB row caching.
>
>
>
Nothing is for free.
>>Though, as always, the best solution probably depend on the number of
>>sessions and how much data each session carries, and how load
>>balancing over the cluster is implemented.
>>
>>
>>
>
>
>
Load-balancing technology has come a long way - there are lots of hw/products
that do this.
I used to work for such a company - Alteon.
>I think there are probably one or two approaches that would work well
>for a large number of "simple" cases (2-4 servers). It would be nice
>to have them implemented in Yaws out of the box.
>
>
>
It makes a lot of sense to build support into yaws for clustering (using
the attached vip.erl).
Actually, it's not a lot of work.
<sorry>
There are quite a few pointers and references to internal project code
in vip.erl
but it's usable nevertheless if you are into the fault-tolerance game
</sorry>
/klacke
| Guest |
Posted: Mon Jul 24, 2006 11:27 am |
|
|
|
Guest
|
On 7/21/06, Yariv Sadan <yarivvv@gmail.com> wrote:
> On 7/20/06, Alex Arnon <alex.arnon@gmail.com> wrote:
> [Disclaimer: I wrote this in two sittings, so apologies for
> typos/discrepancies.]
>
> You're right in that one process for purging expired sessions is enough - no
> need for extra collisions
> Using a supervisor would become a single point of failure... and a
> bottleneck.
I agree.
> Here's an alteration to your suggestion, then. Note that I haven't used
> mnesia in a distributed environment yet (so I rely on your experience in
> this matter),
Neither have I
> and am assuming that gen_leader (from jungerl) works well. If
> it doesn't, then implementing something close or "good enough" (where we
> might get several concurrent active leaders occasionally) should be fine.
>
> First of all, to simplify matters I am going to also make the following
> assumptions:
> 1) We keep all session tables replicated among the yaws servers. I.e, yaws
> server = session server.
Maybe the number of session servers should be a parameter. Let's say
you have 6 yaws servers, it might be sufficient to have only 3 of them
act as session servers to spare the extra replication cost. The
downside is that the 3 non-session servers have higher latency when
requesting session data. I'm not sure what the best tradeoff is in a
production environment.
I think that making it more generic (N session servers vs. M (>=N) yaws servers) would indeed be a smart move.