Erlang Mailing Lists

Author Message

<  Erlyweb mailing list  ~  html generation

Guest
Posted: Fri Feb 22, 2008 8:14 am Reply with quote
Guest
I think it's highly unlikely that you'll have a key collision given
that most webapps can expect at most a few thousand sessions and
8.39299e+17 is a very big number Smile

In production code, though, I would make this code more efficient:

length(lists:seq($a, $z) ++ lists:seq($A, $Z) ++ lists:seq($0, $9)).

It should lookup each element from a fixed size tuple.

On Thu, Feb 21, 2008 at 4:23 AM, jm <jeffm@ghostgun.com> wrote:
>
>
> > length(lists:seq($a, $z) ++ lists:seq($A, $Z) ++ lists:seq($0, $9)).
> 62
>
> > math:pow(62,10).
> 8.39299e+17
>
> Mmm, very little chance of collision in the original algorithm.
>
>
> gen_key() ->
> F = fun() ->
> choose_key(5)
> end,
> mensia:transaction(F)
>
> choose_key(0) ->
> %% time to panic
> undefined;
> choose_key(Count) when is_integer(Count) ->
> Chars = lists:seq($a, $z) ++ lists:seq($A, $Z) ++ lists:seq($0, $9),
> Key = [lists:nth(crypto:rand_uniform(1, length(Chars)), Chars) ||
> N <- lists:seq(1, 10)],
> case mnesia:read(cont, Key, read) of
> [] ->
> %% choose again
> choose_key(Count - 1);
> [_ContRec] ->
> Key
> end.
>
> Just thinking about edge cases and what would be the correct course of
> action when a key collision occurs.
>
> Jeff.
>
>
> jm wrote:
> > Yariv Sadan wrote:
> >
> >> Yeah, it looks similar in the way it uses processes for user sessions.
> >> Unfortunately, the documentation doesn't say much about how to use it
> >> to build full apps.
> >
> > I know your code is really only a proof of concept, but I noticed that
> > your gen_key() function doesn't test for key collisions. The code I've
> > been testing is a little more complicated. It walks the mnesia table to
> > find the max id and adds one then encodes to/from a modified base64. The
> > modification to base64 is to replace $+ with $- and $/ with $_.
> > Encryption/decryption of the integer id is added to avoid making the ids
> > guessable, but this is frankly all a little heavy handed.
> >
> > I wont post all the code as there's about three or four versions/ways of
> > doing things in the same file and not all of it works at the moment as I
> > keep changing my mind on how to approach problems. Anyway, use anything
> > that's proves useful.
> >
> > Really it comes down to how likely a key/id collision is and how to
> > guarrantee that a unique key can be generated in near O(1) time for a
> > large number of sessions.
> >
> >
> > Jeff.
> > -------------
>
>
>
> >
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "erlyweb" group.
To post to this group, send email to erlyweb@googlegroups.com
To unsubscribe from this group, send email to erlyweb-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/erlyweb?hl=en
-~----------~----~----~----~------~----~------~--~---

Post recived from mailinglist
Guest
Posted: Sat Feb 23, 2008 4:14 am Reply with quote
Guest
Yariv Sadan wrote:
> I think it's highly unlikely that you'll have a key collision given
> that most webapps can expect at most a few thousand sessions and
> 8.39299e+17 is a very big number Smile
>
> In production code, though, I would make this code more efficient:
>
> length(lists:seq($a, $z) ++ lists:seq($A, $Z) ++ lists:seq($0, $9)).
>
> It should lookup each element from a fixed size tuple.
>

In production code, if your really worried about efficency you'd take
the time, all three seconds of it, to add

-define(LETTERS,
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789").


at the top of the module if I understand you correctly.

This little corner of the attracted my attention because it's
something I initially did really badly - more worried about other things
- and started to wonder if there's an "optimal" length key vs cpu usage
to generate it. It also shows up in a lot of other places such as cookie
generation. I was thinking last night that it might be better to start
with a list length of 5 octets, then if a collision occurs try 6, then
7, etc. This would keep the keys short, but if you've got images of
several KB flying around and all the overhead of HTTP 5 versus 10 or 20
bytes is neither here nor there.

Getting back on topic. Any reason you used mnesia:dirty_* functions?

It looks as if it should be able to handle cases and ifs without a
problem. I only mention this in that it seems that continations relies
on side effects and I've been meaning to see how Haskell handles this
with Monards and Arrows (not up on Haskell at all).

Would it be possible to separate out the functional and non-functional
parts a-la OTP gen_* modules? Also, if the functional parts of the code
are seperated out then the state could be serialised and written to disk
after short timeout so as not to consume resources then brought back
when needed or deleted entirely after a longer timeout. Unrelated to
this, you could have multiple machines running with the sessions being
forward to the correct machine should a connection be made by the client
to a machine on which the session exists.

The other thing I was think of was: Each process is for each component
instance, hence the max number of processes equals the max number of
session times the max number of components. Ignoring birth/death of
session and components. Anyone care to fill in some realistic numbers or
supply a better model?

Anyway, I'd be interested in seeing Erlyweb scale horizontally across
machines in a manner which minimises the use hardware such as external
load balances.

Jeff.


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "erlyweb" group.
To post to this group, send email to erlyweb@googlegroups.com
To unsubscribe from this group, send email to erlyweb-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/erlyweb?hl=en
-~----------~----~----~----~------~----~------~--~---

Post recived from mailinglist
Guest
Posted: Sat Feb 23, 2008 7:29 pm Reply with quote
Guest
> >
>
> In production code, if your really worried about efficency you'd take
> the time, all three seconds of it, to add
>
> -define(LETTERS,
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789").
>
>
> at the top of the module if I understand you correctly.

Basically, but I would also use a tuple instead of a list to enable
constant lookup time.

>
> This little corner of the attracted my attention because it's
> something I initially did really badly - more worried about other things
> - and started to wonder if there's an "optimal" length key vs cpu usage
> to generate it. It also shows up in a lot of other places such as cookie
> generation. I was thinking last night that it might be better to start
> with a list length of 5 octets, then if a collision occurs try 6, then
> 7, etc. This would keep the keys short, but if you've got images of
> several KB flying around and all the overhead of HTTP 5 versus 10 or 20
> bytes is neither here nor there.

I think that in generating keys, the lookup for collisions is much
more expensive than adding a few more random bytes. I think the best
approach would be to select a string length that's big enough so that
the probability of collision is close to 0.

>
> Getting back on topic. Any reason you used mnesia:dirty_* functions?

The reason is that I don't have to worry about consistency and the
dirty_* functions have better performance because they don't do any
locking. It's like using ets through the Mnesia interface, with the
extra benefit of Mnesia indexes.

>
> It looks as if it should be able to handle cases and ifs without a
> problem. I only mention this in that it seems that continations relies
> on side effects and I've been meaning to see how Haskell handles this
> with Monards and Arrows (not up on Haskell at all).


I'm not up on Haskell either, but as I understand it you have to use
monads in Haskell to force strict processing of code blocks in an
otherwise purely lazy language. But Erlang is strict, not lazy, so why
would you want to bother with isolating side effects? It would just
create more confusion about how to use the language properly.

>
> Would it be possible to separate out the functional and non-functional
> parts a-la OTP gen_* modules? Also, if the functional parts of the code
> are seperated out then the state could be serialised and written to disk
> after short timeout so as not to consume resources then brought back
> when needed or deleted entirely after a longer timeout. Unrelated to
> this, you could have multiple machines running with the sessions being
> forward to the correct machine should a connection be made by the client
> to a machine on which the session exists.
>
> The other thing I was think of was: Each process is for each component
> instance, hence the max number of processes equals the max number of
> session times the max number of components. Ignoring birth/death of
> session and components. Anyone care to fill in some realistic numbers or
> supply a better model?

Why would you want to model each session component with a process? I
think having a process per session is sufficient. That process would
maintain the state for all its components between requests.

>
> Anyway, I'd be interested in seeing Erlyweb scale horizontally across
> machines in a manner which minimises the use hardware such as external
> load balances.

Mnesia is location transparent and so is Erlang message passing. This
allows you to write

[Pid] = mnesia:read(cont, Id),
Pid ! Msg

without being concerned of where the Pid lives -- it could be either
on a local or a remote node. This feature is one the best things about
Erlang and why it makes writing distributed systems so easy. This
means you won't need a sticky load balancer if you cluster your web
servers and have them share the Mnesia schema. That said, having a
sticky load balancer would reduce some of the backend chatter between
servers but I can't say I know the cost/benefit ratio for this design.


Yariv

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "erlyweb" group.
To post to this group, send email to erlyweb@googlegroups.com
To unsubscribe from this group, send email to erlyweb-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/erlyweb?hl=en
-~----------~----~----~----~------~----~------~--~---

Post recived from mailinglist

Display posts from previous:  

All times are GMT
Page 2 of 2
Goto page Previous  1, 2
This forum is locked: you cannot post, reply to, or edit topics.

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You cannot download files in this forum