Erlang questions mailing list: sext and Tokyo Tyrant (sortable serialization format)

uwiger
Posted: Sat Oct 31, 2009 11:20 am
Ulf Wiger wrote:
>
> A while ago I started hacking on a serialization format that
> would have the same sorting properties as Erlang terms.
>
> I didn't quite get it to work (negative floats were the most
> difficult part), but when I returned to it today, I realized
> that it was only a very small problem. Once fixed, all my
> QuickCheck suites passed.

I just had to try this on Tokyo Tyrant, so I wrote a small
prototype for connecting to TT and encoding a few requests,
using the sext library to encode terms before sending them.

I realized that a new function was needed in sext: prefix(Term),
which encodes a 'prefix' that will match similar terms, and allow
some wildcarding. A prefix can't be decoded (at least, I didn't
write any code for doing so).

Some examples:

Eshell V5.7.1 (abort with ^G)
1> sext:encode({1,2,3}).
<<16,0,0,0,3,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6>>
2> sext:prefix({1,'_','_'}).
<<16,0,0,0,3,10,0,0,0,2>>
3> sext:encode([1,2,3]).
<<17,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6,0>>
4> sext:prefix([1,2|'_']).
<<17,10,0,0,0,2,10,0,0,0,4>>
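
Judging from the shell output above, each prefix is simply the leading bytes of the corresponding full encoding, so a match can be checked with an ordinary binary comparison. A minimal sketch that uses only the byte values shown above; the module prefix_demo and the function is_prefix/2 are illustrative names and not part of sext:

```erlang
-module(prefix_demo).
-export([is_prefix/2, demo/0]).

%% True if Prefix is a leading byte sequence of Bin.
is_prefix(Prefix, Bin) when is_binary(Prefix), is_binary(Bin) ->
    Size = byte_size(Prefix),
    case Bin of
        <<Prefix:Size/binary, _/binary>> -> true;
        _ -> false
    end.

demo() ->
    %% Byte values copied from the shell session above.
    Tuple  = <<16,0,0,0,3,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6>>, % encode({1,2,3})
    TupleP = <<16,0,0,0,3,10,0,0,0,2>>,                        % prefix({1,'_','_'})
    List   = <<17,10,0,0,0,2,10,0,0,0,4,10,0,0,0,6,0>>,        % encode([1,2,3])
    ListP  = <<17,10,0,0,0,2,10,0,0,0,4>>,                     % prefix([1,2|'_'])
    [is_prefix(TupleP, Tuple),   % tuple prefix matches the tuple
     is_prefix(ListP, List),     % list prefix matches the list
     is_prefix(TupleP, List)].   % tuple prefix does not match the list
```

Calling prefix_demo:demo() returns [true,true,false].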


Armed with this, I opened a B-tree table in Tokyo Tyrant,
and connected to it with my prototype module.

Eshell V5.7.1 (abort with ^G)
1> {ok,TT} = tt_proto:open(tt,[]).
{ok,<0.35.0>}
2> tt_proto:put(TT,{1,a}, 1).
ok
3> tt_proto:get(TT,{1,a}).
{ok,1}
4> tt_proto:put(TT,{1,b}, 2).
ok
5> tt_proto:put(TT,{1,c}, 3).
ok
6> tt_proto:put(TT,{2,a}, 4).
ok

Now, for some prefix matching:

7> tt_proto:keys(TT,{1,'_'}).
{ok,[{1,a},{1,b},{1,c}]}
8> tt_proto:keys(TT,{2,'_'}).
{ok,[{2,a}]}
9> timer:tc(tt_proto,keys,[TT,{1,'_'}]).
{279,{ok,[{1,a},{1,b},{1,c}]}}
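
One way to picture a prefix query on an ordered store is as a range scan: seek to the first key that is not smaller than the prefix, then walk forward while the keys still start with it. The sketch below illustrates that idea with an in-memory gb_trees standing in for the B-tree table. It is not how tt_proto is implemented, and the module name, function names and two-byte toy keys are made up for illustration (they are not sext's encoding). It relies on gb_trees:iterator_from/2, which is available in modern OTP releases.

```erlang
-module(prefix_scan).
-export([keys_with_prefix/2, demo/0]).

%% Return all binary keys in Tree that start with Prefix, in key order.
keys_with_prefix(Prefix, Tree) when is_binary(Prefix) ->
    %% Start at the first key >= Prefix.
    Iter = gb_trees:iterator_from(Prefix, Tree),
    collect(Prefix, byte_size(Prefix), gb_trees:next(Iter), []).

collect(Prefix, Size, {Key, _Val, Iter}, Acc) ->
    case Key of
        <<Prefix:Size/binary, _/binary>> ->
            collect(Prefix, Size, gb_trees:next(Iter), [Key | Acc]);
        _ ->
            %% Matching keys are contiguous, so the first miss ends the scan.
            lists:reverse(Acc)
    end;
collect(_Prefix, _Size, none, Acc) ->
    lists:reverse(Acc).

demo() ->
    %% Toy keys: <<N, Char>> stands in for an encoded {N, Atom}.
    Pairs = [{<<1,$a>>, 1}, {<<1,$b>>, 2}, {<<1,$c>>, 3}, {<<2,$a>>, 4}],
    Tree = gb_trees:from_orddict(Pairs),
    {keys_with_prefix(<<1>>, Tree), keys_with_prefix(<<2>>, Tree)}.
```

The scan touches only the matching keys plus one more, which is the main attraction of keeping keys in a sortable binary form.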

I made no real effort to optimize anything. The module starts
a gen_server which keeps a connection open to ttserver. It handles
only one query at a time, but looking at the TCP protocol, it's
hard to see how it could do otherwise, as there is no tagging of
requests. The round trip times are going to be fairly high for simple
requests (compared to dets and mnesia on small data sets), but the
main benefit of using TT in the first place ought to be either that
the data set is uncomfortably large for mnesia and dets, or that one
wants ordered_set semantics on disk-based storage.
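
The one-query-at-a-time behaviour follows naturally from doing the send and the receive inside a single gen_server call. A rough sketch of that pattern, not the actual tt_proto code: the module name serial_client, the request/2 function and the 5 second timeout are assumptions, and the reply is read with a bare gen_tcp:recv/3, whereas a real client would have to parse replies according to the server's wire protocol.

```erlang
-module(serial_client).
-behaviour(gen_server).
-export([start_link/2, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link(Host, Port) ->
    gen_server:start_link(?MODULE, {Host, Port}, []).

%% Send Bytes and return {ok, ReplyBytes} or {error, Reason}.
request(Pid, Bytes) ->
    gen_server:call(Pid, {request, Bytes}).

init({Host, Port}) ->
    %% Passive mode: replies are read explicitly with gen_tcp:recv/3.
    case gen_tcp:connect(Host, Port, [binary, {active, false}]) of
        {ok, Sock} -> {ok, Sock};
        {error, Reason} -> {stop, Reason}
    end.

handle_call({request, Bytes}, _From, Sock) ->
    %% The server process blocks here until the reply arrives, so callers
    %% are served strictly one at a time and untagged replies cannot be
    %% attributed to the wrong request.
    Reply = case gen_tcp:send(Sock, Bytes) of
                ok -> gen_tcp:recv(Sock, 0, 5000);
                {error, _} = Err -> Err
            end,
    {reply, Reply, Sock}.

handle_cast(_Msg, Sock) -> {noreply, Sock}.

handle_info(_Info, Sock) -> {noreply, Sock}.
```

Other callers simply queue up in the gen_server's mailbox, which is why the round-trip time of each request adds directly to the latency seen by the next one.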

I put the tt_proto module in sext/examples/.
There is some edoc for it too.

http://svn.ulf.wiger.net/sext/trunk/sext/doc/index.html

BR,
Ulf W

________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org

warezio
Posted: Sat Oct 31, 2009 4:20 pm
Awesome. That's 90% of tcerl; most of the complexity had to do with
implementing Erlang term order from encoded binaries, some of the rest
with augmenting term_to_binary to allow for prefixes (there's no smallest
or largest term in Erlang), and the remainder with query planning. The
query planning part is pure erlang and independent of the term encoding
(http://code.google.com/p/tcerl/source/browse/trunk/tcerl/src/tcbdbmsutil.erl),
so you could hopefully reuse it.

In retrospect, clearly, I should have abandoned term_to_binary, if only
because Erlang is easy and C is hard, so the C side should be just memcmp.
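
To make the memcmp point concrete, here is a small sketch. Erlang compares binaries byte by byte, much as memcmp would, so it can stand in for the C side. The enc/1 function is a toy encoding for 32-bit signed integers only, invented for this illustration; it is not the sext or tcerl format.

```erlang
-module(order_demo).
-export([enc/1, demo/0]).

%% Toy order-preserving encoding for 32-bit signed integers only:
%% shifting by 2^31 makes the big-endian bytes sort like the numbers.
enc(N) when is_integer(N), N >= -16#80000000, N =< 16#7FFFFFFF ->
    <<(N + 16#80000000):32/unsigned-big>>.

demo() ->
    Ints = [3, -1, 256, 0, -70000, 255],
    Sorted = lists:sort(Ints),
    %% Sorting the encoded binaries gives the same order as sorting
    %% the integers themselves.
    ToyOk = lists:sort([enc(N) || N <- Ints]) =:= [enc(N) || N <- Sorted],
    %% term_to_binary/1 lacks this property: -1 < 1, yet the encoding
    %% of -1 compares greater than the encoding of 1.
    T2bOk = term_to_binary(-1) < term_to_binary(1),
    {ToyOk, T2bOk}.
```

order_demo:demo() returns {true,false}: the toy encoding sorts correctly under byte-wise comparison, while the term_to_binary output for -1 and 1 does not.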

-- p

