Erlang/OTP Forums

Erlang questions mailing list ~ how: Purging old records from Mnesia table

Gleber
Posted: Thu Mar 06, 2008 11:42 am
Hello.

I would like to implement an algorithm for purging old data from a
Mnesia table. The expiration date is stored inside each record (it can
be 0 for permanent data).

I have a few ideas for how to implement it:
0) Every few seconds, iterate over the whole table and purge old records.
This is how it is implemented now, and it eats about 35% of CPU on
every purge cycle for about 10 000 entries.
Will an index on the expiration field speed up selecting? Is it better to
use qlc? Will it be faster if I use mnesia:index_match_object()?

1) Use erlang:send_after() to send messages to purge individual
records (with special care for timeouts greater than the max value of
an unsigned int). Will the Erlang runtime efficiently handle potentially
up to a few million timers?

2) Use some sort of queue to manage a sorted list of records to be
purged. Make a special process which receives information about new
records, inserts them into this queue, and sleeps until the first record
is due to be purged. gb_trees will probably handle it. Are there better
data structures for this task?

I'll be grateful for any advice and responses :)

Best Regards,
--
Gleb Peregud
http://gleber.pl/
_______________________________________________
erlang-questions mailing list
erlang-questions@erlang.org
http://www.erlang.org/mailman/listinfo/erlang-questions

noss
Posted: Thu Mar 06, 2008 12:02 pm
On Thu, Mar 6, 2008 at 12:38 PM, Gleb Peregud <gleber.p@gmail.com> wrote:
> Hello.
>
> I would like to implement some algorithm of purging old data from
> mnesia table. Expiration date is written inside each record (could be
> 0 for permanent data).

Maintain a second table, updated as you insert records into the primary one.

It could be a bag of records with the expiry interval as the key. That
way you get a more efficient lookup of which records are due to
expire in the interval you are currently in.
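
A minimal sketch of the second-table idea, under stated assumptions: the record layouts, the table names (cache, expiry), the 60-second bucket width and the module name are all hypothetical, since the thread does not show the real schema. Only completed buckets should be purged, and each record is re-checked because a key may have been rewritten with a later expiry.

```erlang
-module(expiry_buckets).
-export([init/0, store/3, purge_bucket/2, bucket/1]).

%% Hypothetical schema; the thread does not show the real record layout.
-record(cache, {key, value, expires}).   % expires = 0 means permanent
-record(expiry, {bucket, key}).          % bag: one row per expiring key

-define(INTERVAL, 60).                   % assumed bucket width, in seconds

%% Start Mnesia with RAM-only tables (fine for a sketch).
init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(cache,
        [{attributes, record_info(fields, cache)}]),
    {atomic, ok} = mnesia:create_table(expiry,
        [{type, bag}, {attributes, record_info(fields, expiry)}]),
    ok.

bucket(Expires) -> Expires div ?INTERVAL.

%% Write the record and, unless it is permanent, index it by bucket.
store(Key, Value, Expires) ->
    mnesia:transaction(fun() ->
        ok = mnesia:write(#cache{key = Key, value = Value, expires = Expires}),
        case Expires of
            0 -> ok;
            _ -> mnesia:write(#expiry{bucket = bucket(Expires), key = Key})
        end
    end).

%% Purge one bucket. Call this only for buckets that lie entirely
%% before Now, otherwise index rows of unexpired records are lost.
purge_bucket(Bucket, Now) ->
    mnesia:transaction(fun() ->
        Rows = mnesia:read(expiry, Bucket, write),
        lists:foreach(fun(#expiry{key = Key}) ->
            case mnesia:read(cache, Key, write) of
                [#cache{expires = E}] when E =/= 0, E =< Now ->
                    mnesia:delete({cache, Key});
                _ ->
                    ok   % rewritten with a later expiry, or already gone
            end
        end, Rows),
        mnesia:delete({expiry, Bucket})
    end).
```

The cost is that every insert now touches two tables, and the purger only ever reads the rows of one bucket instead of the whole primary table.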

Guest
Posted: Thu Mar 06, 2008 1:04 pm
0.1) Iterate over the whole table slowly, using timer:sleep(delay_for_every_read) between reads. That's what I would do, since it's probably the simplest solution.
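
A rough sketch of such a slow sweep, assuming a Mnesia set table whose records carry the expiry at a known tuple position. The module name, the argument order and the use of dirty operations are my assumptions, not something from the original code. Dirty iteration gives no isolation, so a concurrent delete of the key being stepped from may disturb the walk.

```erlang
-module(slow_sweep).
-export([sweep/4]).

%% Tab:    a Mnesia set table
%% ExpPos: tuple position of the expiry field (0 = permanent record)
%% Now:    current time, in the same unit as the expiry field
%% Delay:  milliseconds to sleep after each key
%% Returns the number of records deleted.
sweep(Tab, ExpPos, Now, Delay) ->
    sweep(Tab, ExpPos, Now, Delay, mnesia:dirty_first(Tab), 0).

sweep(_Tab, _ExpPos, _Now, _Delay, '$end_of_table', Purged) ->
    Purged;
sweep(Tab, ExpPos, Now, Delay, Key, Purged) ->
    %% Look up the successor before Key is possibly deleted.
    Next = mnesia:dirty_next(Tab, Key),
    Expired = [R || R <- mnesia:dirty_read(Tab, Key),
                    is_expired(element(ExpPos, R), Now)],
    lists:foreach(fun(R) -> mnesia:dirty_delete_object(Tab, R) end, Expired),
    %% Spread the work out so the sweep never hogs the CPU.
    timer:sleep(Delay),
    sweep(Tab, ExpPos, Now, Delay, Next, Purged + length(Expired)).

is_expired(0, _Now) -> false;            % 0 marks permanent data
is_expired(Expires, Now) -> Expires =< Now.
```

The total amount of work is unchanged; it is only spread over a longer period, so expired records can linger for up to one full sweep.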


Sergej


Gleber
Posted: Thu Mar 06, 2008 2:03 pm
On 3/6/08, Bengt Kleberg <bengt.kleberg@ericsson.com> wrote:
> Greetings,
>
> While only a small part of the total answer I would recommend that you
> could look into the module timer. It has apply_after/4 and it does not
> have the unsigned int problem.
>
>
> bengt

After checking the source of the timer module, I can see it is implemented
using an ordered_set ets table. This table is used as a queue, and a
gen_server pulls data from it at the necessary moments (the same as my
idea in point 2). It seems to be a good choice :)

Hence it is enough to use timer:send_after().
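
For illustration, a small sketch of what scheduling a purge with timer:send_after/3 could look like. The module name, the {purge, Key} message shape and the demo function are assumptions made for this example only.

```erlang
-module(purge_timer).
-export([schedule/2, demo/0]).

%% Ask the timer module to send {purge, Key} to the calling process
%% after TtlMs milliseconds. Returns {ok, TRef} on success.
schedule(Key, TtlMs) ->
    timer:send_after(TtlMs, self(), {purge, Key}).

%% Schedule one purge and wait for the message to arrive.
demo() ->
    {ok, _TRef} = schedule(some_key, 50),
    receive
        {purge, Key} -> {purged, Key}   % a real server would delete Key here
    after 1000 ->
        timeout
    end.
```

The returned TRef can later be passed to timer:cancel/1 if the record is removed or rewritten before it expires.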

--
Gleb Peregud
http://gleber.pl/

Every minute is to be grasped.
Time waits for nobody.
-- Inscription on a Zen Gong

uwiger
Posted: Thu Mar 06, 2008 2:26 pm
Gleb Peregud skrev:
>
> After checking source of timer module i can see it is implemented
> using ordered_set ets table. This table is used as a queue and
> gen_server module is pulling data at necessary moments (it is the same
> as my idea in point 2). It seems to be a good choice :)
>
> Hence it is enough to use timer:send_after()

I don't know if it's what you're after, but I've been (slowly)
working on a scheduler module based on mnesia.

It's not ready, and wholly undocumented, but for those who
enjoy source-as-documentation, the work in progress can be
seen at:

http://erlhive.svn.sourceforge.net/viewvc/erlhive/trunk/lib/erlhive/src/erlhive_schedule.erl?view=markup

(It's in ErlHive, but there really isn't that much that's
erlhive-specific about it.)

The idea is that you should be able to register calendar-like
'appointments' - repeating if necessary - and tie them to actions that
are fired at the right moment.

BR,
Ulf W

Gleber
Posted: Thu Mar 06, 2008 3:19 pm
I'm working on a simple caching server that allows specifying if and
when an item will expire.

warezio
Posted: Thu Mar 06, 2008 6:04 pm
gleb,

lately i've been maintaining another mnesia table as an ordered_set with
the expiration time as the first element in the tuple of the primary key.
i then periodically iterate over that (with mnesia:first/1 and
mnesia:next/2), terminating the iteration early when possible, which is
most of the time.

of course it means database modifications have to hit two tables, and
unfortunately sometimes introduces the need for transactions where before
dirty operations would do.
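
a rough sketch of that layout, for illustration only: the table names (item, by_expiry), the record fields and the module name are made up here and are not the actual application code. the index key is {Expires, Key}, so the walk with mnesia:first/1 and mnesia:next/2 can stop at the first entry that is not yet due.

```erlang
-module(ordered_expiry).
-export([init/0, store/3, purge/1]).

%% Hypothetical schema, for illustration only.
-record(item, {key, value, expires}).   % expires = 0 means permanent
-record(by_expiry, {id, key}).          % id = {Expires, Key}

init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(item,
        [{attributes, record_info(fields, item)}]),
    {atomic, ok} = mnesia:create_table(by_expiry,
        [{type, ordered_set}, {attributes, record_info(fields, by_expiry)}]),
    ok.

%% Every modification has to keep both tables in step.
store(Key, Value, Expires) ->
    mnesia:transaction(fun() ->
        case mnesia:read(item, Key, write) of
            [#item{expires = Old}] when Old =/= 0 ->
                mnesia:delete({by_expiry, {Old, Key}});
            _ ->
                ok
        end,
        ok = mnesia:write(#item{key = Key, value = Value, expires = Expires}),
        case Expires of
            0 -> ok;
            _ -> mnesia:write(#by_expiry{id = {Expires, Key}, key = Key})
        end
    end).

%% Delete everything due at or before Now.
%% Returns {atomic, NumberOfPurgedRecords}.
purge(Now) ->
    mnesia:transaction(fun() ->
        purge_from(mnesia:first(by_expiry), Now, 0)
    end).

purge_from('$end_of_table', _Now, N) ->
    N;
purge_from({Expires, _Key}, Now, N) when Expires > Now ->
    N;                                   % sorted by time: stop early
purge_from({_Expires, Key} = Id, Now, N) ->
    Next = mnesia:next(by_expiry, Id),   % step before deleting Id
    ok = mnesia:delete({item, Key}),
    ok = mnesia:delete({by_expiry, Id}),
    purge_from(Next, Now, N + 1).
```

when nothing is due, purge/1 looks at a single index entry and returns, which is where the saving over a full table scan comes from.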

so i'm eager to hear a better solution.

-- p

p.z. one thing not clear in your original post ... is this persistent
data? cuz if so, timers and ets tables seem ill advised.


Optimism is an essential ingredient of innovation. How else can the
individual favor change over security?

-- Robert Noyce

Guest
Posted: Fri Mar 07, 2008 2:58 am
A few messages ago, someone said:

>> > Hence it is enough to use timer:send_after()

Gleb Peregud <gleber.p@gmail.com> wrote:

gp> I'm working on simple caching server. Allowing to specify if and
gp> when item will expire.

When the expiration event is sent by a second party, you must avoid race
conditions somehow. Example:

Store tuple {Key, Value0}, expires at time T.
Time T arrives
Client stores {Key, Value1}, expires at time T2
Server deletes Key
Client attempts to fetch Key, fetch fails, client is sad.
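
One possible guard against this, sketched under assumptions: the purge request carries the expiry time it was scheduled for, and the server deletes only if the stored record still has that same expiry. The ETS table layout {Key, Value, ExpiresAt} and all function names here are hypothetical.

```erlang
-module(guarded_purge).
-export([new/0, store/4, handle_purge/3]).

%% Rows are {Key, Value, ExpiresAt}.
new() ->
    ets:new(guarded_cache, [set, public]).

store(Tab, Key, Value, ExpiresAt) ->
    true = ets:insert(Tab, {Key, Value, ExpiresAt}),
    ok.

%% ScheduledFor is the expiry time the purge request was created for.
%% Delete only if the stored record still carries that expiry; a record
%% rewritten in the meantime has a different one and is left alone.
%% This assumes one process serialises all stores and purges.
handle_purge(Tab, Key, ScheduledFor) ->
    case ets:lookup(Tab, Key) of
        [{Key, _Value, ScheduledFor}] ->
            true = ets:delete(Tab, Key),
            purged;
        _ ->
            stale
    end.
```

In the sequence above, the purge scheduled for T finds a record expiring at T2, returns stale, and Value1 stays readable.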

-Scott

Gleber
Posted: Fri Mar 07, 2008 2:15 pm
On Thu, Mar 6, 2008 at 7:00 PM, Paul Mineiro <paul-trapexit@mineiro.com> wrote:
> lately i've been maintaining another mnesia table as an ordered_set with
> the expiration time as the first element in the tuple of the primary key.
> i then periodically iterate over that (with mnesia:first/1 and
> mnesia:next/2), terminating the iteration early when possible, which is
> most of the time.
>
> of course it means database modifications have to hit two tables, and
> unfortunately sometimes introduces the need for transactions where before
> dirty operations would do.
>
> so i'm eager to hear a better solution.
>
> p.z. one thing not clear in your original post ... is this persistent
> data? cuz if so, timers and ets tables seem ill advised.

The caching server I'm working on will be able to store data either
way. The user will specify whether:
1) The record will not expire (the user is responsible for purging it)
2) The record will expire in an arbitrary number of seconds
3) The record will expire at a given date (the user will have to pass a
unix timestamp) (expire field =

Of course I'm not going to use any timers for data which will never
expire :) And, as can be seen in the timer module's source, it is
implemented in exactly the same way as you did it in your application.
timer uses an ets ordered_set table to store an ordered list of
events to be launched at a given time. After handling an event, it
fetches the first item from the table, sleeps for the appropriate
time, and so on.

On Fri, Mar 7, 2008 at 3:56 AM, Scott Lystig Fritchie
<fritchie@snookles.com> wrote:
> When the expiration event is sent by a second party, you must avoid race
> conditions somehow. Example:
>
> Store tuple {Key, Value0}, expires at time T.
> Time T arrives
> Client stores {Key, Value1}, expires at time T2
> Server deletes Key
> Client attempts to fetch Key, fetch fails, client is sad.

Thanks for the hint. I will store the timer's "handle" in every record
so that I can cancel it :)
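
A minimal sketch of what I have in mind, with an ETS table standing in for the real storage. The row layout {Key, Value, TRef}, the module name and the convention that a TTL of 0 means "never expires" are assumptions for this example.

```erlang
-module(timer_cache).
-export([new/0, store/4]).

%% Rows are {Key, Value, TRef}; TRef is undefined for permanent data.
new() ->
    ets:new(timer_cache, [set, public]).

%% Store a value. Any timer left over from a previous store of the
%% same key is cancelled first, so it cannot purge the new value.
%% TtlMs = 0 means the record never expires and gets no timer.
store(Tab, Key, Value, TtlMs) ->
    cancel_old(ets:lookup(Tab, Key)),
    true = ets:insert(Tab, {Key, Value, start_timer(Key, TtlMs)}),
    ok.

cancel_old([{_Key, _Value, undefined}]) -> ok;
cancel_old([{_Key, _Value, TRef}]) -> _ = timer:cancel(TRef), ok;
cancel_old([]) -> ok.

start_timer(_Key, 0) ->
    undefined;
start_timer(Key, TtlMs) ->
    %% The purge message goes to the process that called store/4.
    {ok, TRef} = timer:send_after(TtlMs, self(), {purge, Key}),
    TRef.
```

Cancelling does not remove a purge message that has already been delivered to the mailbox, so the purge handler should still check the record before deleting it.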

--
Gleb Peregud
http://gleber.pl/

