| Author |
Message |
|
| seancharles |
Posted: Thu Aug 02, 2007 8:15 am |
|
|
|
User
Joined: 18 Jul 2007
Posts: 57
|
Hi,
I am currently getting to grips with Mnesia. I have used MinneStore, a Smalltalk object database, and I have a question about Mnesia:
If I have a value in one record that is the 'foreign key' to another type of record, and that key value is used by say, a thousand different citing records, does mnesia save a thousand copies of the key or does it create an atom and use a more compact internal representation?
What I am getting at is: how well does mnesia handle duplicated data? Does it just store it, or does it try to avoid repeating keys over and over?
I hope that's clear... I am designing a mnesia schema and I am just concerned that it is going to eat up RAM by not being efficient.
Cheers
Sean Charles |
|
|
|
| Mazen |
Posted: Thu Aug 02, 2007 9:54 am |
|
|
|
User
Joined: 20 Jul 2006
Posts: 164
Location: London
|
Hi Sean,
Quote: If I have a value in one record that is the 'foreign key' to another type of record, and that key value is used by say, a thousand different citing records, does mnesia save a thousand copies of the key
Yes! Since Mnesia isn't a relational database, you only store records in the tables, and each record is its own "object" in the table. As far as I know, Mnesia doesn't try to optimize anything in there except for the indexes you create (simply to be able to use them as references). This means that if you want to create a "relation" between your tables, you have to follow it manually: look up one element, read its foreign key, then look up the other element in the other table.
Quote:
I am designing a mnesia schema and I am just concerned that it is going to eat up RAM by not being efficient.
I wouldn't worry about Mnesia eating up RAM. I have used Mnesia to store 500,000+ records with different foreign keys to look up in other tables on a normal workstation, and it has worked well. (Do NOT store huge amounts of data in your tables; in my experience, Mnesia should be used for live data.) The only way to know whether this is enough for your requirements is simply to test. It's a tight balance between redundant data and keeping the "jumping" between tables down. |
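The manual lookup described above (read one record, take its foreign key, read the other table) can be sketched roughly as follows. This is only an illustration: the module name `fk_lookup` and the `author` and `book` records are hypothetical, and it assumes there is no Mnesia schema on disk, so that `mnesia:start/0` runs with a RAM-only schema.

```erlang
-module(fk_lookup).
-export([demo/0]).

%% Hypothetical records, invented for this sketch.
-record(author, {id, name}).
-record(book, {isbn, title, author_id}).

demo() ->
    %% Assumption: no schema on disk, so Mnesia starts RAM-only.
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(author,
        [{attributes, record_info(fields, author)}]),
    {atomic, ok} = mnesia:create_table(book,
        [{attributes, record_info(fields, book)}]),
    %% Two books cite the same author id; the id is stored in each record.
    {atomic, ok} = mnesia:transaction(fun() ->
        ok = mnesia:write(#author{id = 1, name = "Joe"}),
        ok = mnesia:write(#book{isbn = 100, title = "Book A", author_id = 1}),
        ok = mnesia:write(#book{isbn = 101, title = "Book B", author_id = 1})
    end),
    %% The "join" is done by hand: read the book, then follow its key.
    {atomic, Result} = mnesia:transaction(fun() ->
        [Book] = mnesia:read(book, 101),
        [Author] = mnesia:read(author, Book#book.author_id),
        {Book#book.title, Author#author.name}
    end),
    Result.
```

Calling `fk_lookup:demo()` once in a fresh VM should return the book title paired with the author name. A second call in the same VM fails on `create_table`, because the tables already exist.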
|
|
|
| seancharles |
Posted: Thu Aug 02, 2007 12:39 pm |
|
|
|
User
Joined: 18 Jul 2007
Posts: 57
|
You and me again!
So, mnesia just stores what you throw at it and that's it. That's what I thought but I wondered if I maybe had missed some little detail in the documentation somewhere.
I guess I could implement my own 'singleton' system with keys, i.e. each key is replaced by an index in a list of keys. Don't change the keys!
That way when I use the key 'thisismylongkey' I save '42' instead. That should cut down the RAM usage.
However, I'd do well to remember the first rule of optimization: Don't Do It!
I'll see how I get on with it. I really think I have a great future with Erlang/mnesia etc even for web-sites... I have not had too much success with the mysql and pgsql modules as of yet.
I am still trying to get them to work: I connect to MySQL OK, but then it seems to drop the connection repeatedly. I know there's an auto-reconnect etc., but I just need more time to get it into my head!
Oh yeah, and the *&*$^# day job is in the way as well. LOL
Cheers Mazen,
Sean |
|
|
|
| francesco |
Posted: Fri Aug 03, 2007 11:12 am |
|
|
|
User
Joined: 07 Jul 2006
Posts: 249
Location: London
|
Hi Sean,
Quote: I guess I could implement my own 'singleton' system with keys, i.e. each key is replaced by an index in a list of keys. Don't change the keys!
That way when I use the key 'thisismylongkey' I save '42' instead. That should cut down the RAM usage.
However, I'd do well to remember the first rule of optimization: Don't Do It!
Don't... The three golden rules of Erlang are:
First, make it work.
Then make it beautiful.
Then, if you really, really have to, make it fast.
Regarding your problem, I am not sure I understand it correctly... Either way:
If you have {a,b}, {a,c}, {d,e}, where a and d are the keys, you could rewrite everything to {a, [b,c]}, {d, [e]}. I would not bother, and would just use a table of type bag.
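A minimal sketch of the bag approach, using the {a,b}, {a,c}, {d,e} data from above. The module name `bag_demo` and the `cites` record are hypothetical, and the sketch assumes Mnesia starts with a RAM-only schema (no schema on disk).

```erlang
-module(bag_demo).
-export([demo/0]).

%% Hypothetical record, invented for this sketch.
-record(cites, {key, value}).

demo() ->
    %% Assumption: no schema on disk, so Mnesia starts RAM-only.
    ok = mnesia:start(),
    %% A bag table may hold several distinct records under one key.
    {atomic, ok} = mnesia:create_table(cites,
        [{type, bag}, {attributes, record_info(fields, cites)}]),
    {atomic, ok} = mnesia:transaction(fun() ->
        ok = mnesia:write(#cites{key = a, value = b}),
        ok = mnesia:write(#cites{key = a, value = c}),
        ok = mnesia:write(#cites{key = d, value = e})
    end),
    %% Reading key a returns every record stored under it.
    {atomic, Values} = mnesia:transaction(fun() ->
        [V || #cites{value = V} <- mnesia:read(cites, a)]
    end),
    %% Sorted, so the result does not depend on storage order.
    lists:sort(Values).
```

With a bag there is no need to read, modify and rewrite a `{a, [b,c]}` list on every insert; each citing pair is simply written as its own record.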
If you are worried that averylongatom will take up more memory than an integer, then you are right in your theory of not optimizing: atoms in the VM are in fact mapped to integers, so the optimization would not help you at all.
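To make the atom-versus-integer point concrete, here is a small sketch that compares how many extra heap words each candidate key needs. It relies on `erts_debug:flat_size/1`, which is an undocumented debugging function, so treat the numbers as illustrative only; the module name `term_size` is made up. Atoms and small integers are immediate values that fit in the single word of the record field itself, so they need no extra heap words, whereas a string is a list costing two words per character.

```erlang
-module(term_size).
-export([demo/0]).

demo() ->
    %% flat_size/1 reports heap words beyond the one-word slot that
    %% holds the term itself. It is a debugging aid, not a stable API.
    [{atom,    erts_debug:flat_size(averylongatom)},
     {integer, erts_debug:flat_size(42)},
     {string,  erts_debug:flat_size("thisismylongkey")}].
```

On this reading, replacing an atom key with an integer saves nothing per record, while replacing a string key with either one does save space. Note that the thread never settles whether 'thisismylongkey' was meant as an atom or a string.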
Hope this helps...
Francesco
--
http://www.erlang-consulting.com |
|
|
|
| seancharles |
Posted: Fri Aug 03, 2007 11:20 am |
|
|
|
User
Joined: 18 Jul 2007
Posts: 57
|
Quote: If you are worried that averylongatom will take up more memory than an integer, then you are right in your theory of not optimizing: atoms in the VM are in fact mapped to integers, so the optimization would not help you at all.
How does mnesia cope with a restart if atoms are mapped to integers?
If I save a lot of data with a set of atoms, how does mnesia ensure that when I restart the runtime and reload the data, the mappings are intact?
What I am trying to say is: given that the atom->integer mappings are transient across uptimes (are they?), how does mnesia/erlang-VM know what is in the tables if the new runtime values are different?
Is there an mnesia guru out there who knows how mnesia handles things like this, if it does! Presumably you would have to maintain a directory and then re-intern those atoms that have been used as keys and re-establish the mappings.
Sounds hard!
Anybody...? |
|
|
|
| francesco |
Posted: Fri Aug 03, 2007 12:54 pm |
|
|
|
User
Joined: 07 Jul 2006
Posts: 249
Location: London
|
Quote: How does mnesia cope with a restart if atoms are mapped to integers?
If I save a lot of data with a set of atoms, how does mnesia ensure that when I restart the runtime and reload the data, the mappings are intact?
The atom table is internal to the VM. Don't worry about it. The only thing to think about when dealing with atoms is that they are not garbage collected. So if you, for example, map subscriber numbers to atoms in order to save a little space, your VM will eventually run out of memory if you are dealing with millions of numbers. If you are not dealing with ever-changing atoms, then don't worry.
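The growth of the atom table can be observed directly. The sketch below is an illustration under assumptions: the module name `atom_growth` and the `subscriber_` prefix are made up, and `erlang:system_info(atom_count)` and `erlang:system_info(atom_limit)` are only available in reasonably recent OTP releases (OTP 20 or later). It also shows `list_to_existing_atom/1` as one way to avoid creating atoms from external input.

```erlang
-module(atom_growth).
-export([demo/0, to_known_atom/1]).

demo() ->
    Before = erlang:system_info(atom_count),
    %% A name that has almost certainly never been an atom in this VM.
    Name = "subscriber_" ++
        integer_to_list(erlang:unique_integer([positive])),
    _ = list_to_atom(Name),          % creates a new, permanent atom
    After = erlang:system_info(atom_count),
    %% Growth of the atom table, and the table's fixed upper limit.
    {After - Before, erlang:system_info(atom_limit)}.

%% Convert only if the atom already exists; never grows the table.
to_known_atom(String) ->
    try list_to_existing_atom(String) of
        Atom -> {ok, Atom}
    catch
        error:badarg -> {error, unknown_atom}
    end.
```

Every distinct subscriber number turned into an atom this way stays in the table for the life of the VM, which is why keeping such values as integers or binaries is the safer choice.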
Quote: What I am trying to say is: given that the atom->integer mappings are transient across uptimes (are they?), how does mnesia/erlang-VM know what is in the tables if the new runtime values are different?
To be honest, I don't know... In the 14 years I have been hacking Erlang, I've just taken it for granted and not lost any sleep over it. You could read the VM code, but it is not for the faint of heart.
Quote: Is there an mnesia guru out there who knows how mnesia handles things like this, if it does! Presumably you would have to maintain a directory and then re-intern those atoms that have been used as keys and re-establish the mappings.
Mnesia does not handle it. It is taken care of in the VM/bytecode. As Mnesia is written in Erlang (ETS tables excluded), it just uses the constructs provided by the language primitives, and mapping atoms to integers is not one of them.
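One way to see why a restart need not matter: when a term is serialized with `term_to_binary/1`, an atom is written out by name rather than by its VM-internal index, and `binary_to_term/1` interns the name again on the way back in. The sketch below only demonstrates that round trip inside one VM. That Mnesia's disc storage encodes terms in a comparable way is an assumption here, not something confirmed in this thread, and the module name `atom_roundtrip` is made up.

```erlang
-module(atom_roundtrip).
-export([demo/0]).

demo() ->
    Term = {subscriber, averylongatom},
    Bin = term_to_binary(Term),
    %% The atom's text is present in the encoded bytes, so decoding
    %% does not depend on any particular atom-table index.
    NameInBytes = binary:match(Bin, <<"averylongatom">>) =/= nomatch,
    {NameInBytes, binary_to_term(Bin)}.
```

Writing `Bin` to a file and decoding it in a freshly started VM would give back the same term, whatever index the new VM happens to assign to the atom.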
I hope this clears up your questions.. If not, let me know.
Cheers,
Francesco
--
http://www.erlang-consulting.com |
|
|
|
|
|
All times are GMT