Call for Contributions: Mnesia best practices


Call for Contributions: Mnesia best practices

by Bob Calco-3

Everyone:

I'm looking for thoughts from fellow Erlangers about database design &
implementation in Mnesia. With a heavy SQL background I, like many
relatively new Erlang folks I'm sure, have a tendency to think in terms of
the capabilities of traditional RDBMSs, and to try to normalize every data
model with which I come into contact.

I'm also used to letting administrators deal with most database operations
and issues, whereas with Mnesia the programmer has basically supreme control
(both to do some really cool and flexible things, and to do some really dumb
things).

So the question is generic, not intended to solve a specific problem. I have
some ideas and will contribute separately.

The question is: What is the best advice you could give a data architect
about designing and implementing a database in Mnesia from scratch? Examples
of the kinds of issues I'd like to see folks address:

* How to create an optimal data model for performance (vs. reporting,
comparing the SQL way to the Mnesia way). This question is really about
normalization in Mnesia vs. SQL, and tricks like storing whole records in
table fields.

* How to partition data between subsystems, without losing the illusion
they're all one big happy system.

* How to handle complex clustering and failover scenarios.

* How to handle calculation-intensive databases (for example, stock
databases that need to constantly recalculate certain attributes for the
purposes of sorting, searching)

* How to handle complex domain relationships. For example, let's say you are
writing a CRM tool and want to store each "person" in the database. But each
person can also be a colleague, or a client, or an incidental character
(contact person at some organization). I.e., what do you do when there is
inheritance in your domain model?

* What are some current pitfalls or "weak spots" of Mnesia that ought to be
avoided, however tempting they might be?

* How to implement the various callbacks, and which kinds, beyond those
described in the Mnesia documentation, have been found most useful in
practice?

That kind of thing. I want to put together something of a master knowledge
base on this big subject that the community can use both to promote Erlang
and to promote Erlang's "best practices" implementation in the important
area of serving data to applications.

/Bob


_______________________________________________
erlang-questions mailing list
erlang-questions@...
http://www.erlang.org/mailman/listinfo/erlang-questions

Re: Call for Contributions: Mnesia best practices

by Christian S-2

Now, this is not exactly Mnesia, but Google App Engine's datastore,
which is built on Google Bigtable. However, the talk does present
some problems, their causes, and strategies for handling scalability
when you have a distributed database.

http://sites.google.com/site/io/building-scalable-web-applications-with-google-app-engine

The other Google I/O presentations on the datastore, over at
http://sites.google.com/site/io/, are also worth watching.

Also a comment:

> * How to partition data between subsystems, without losing the illusion
> they're all one big happy system.

They are not one big happy system. The illusion must be forgotten and
reality must be faced.
Things like: Stop doing joins. Instead begin to duplicate data, so it
is available directly on first access.
Or: Send the code to execute where the data is, instead of sending the
data to the machine that has the code.
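
To make the "stop doing joins" point concrete, here is a minimal sketch in Python rather than Erlang. It is not Mnesia code: plain dicts stand in for two key-value tables, and every name in it (authors, posts_normalized, posts_denormalized, rename_author) is hypothetical.

```python
# Hypothetical data: plain dicts stand in for two key-value tables.
authors = {1: {"name": "Ada"}}

# Normalized: a post stores only the author's key, so displaying it
# needs a second lookup (the "join"), possibly on another machine.
posts_normalized = {100: {"title": "Hello", "author_id": 1}}


def show_with_join(post_id):
    post = posts_normalized[post_id]
    author = authors[post["author_id"]]  # second access
    return f'{post["title"]} by {author["name"]}'


# Denormalized: the author's name is copied into the post at write time,
# so a single access returns everything needed to display it.
posts_denormalized = {
    100: {"title": "Hello", "author_id": 1, "author_name": "Ada"}
}


def show_without_join(post_id):
    post = posts_denormalized[post_id]  # single access
    return f'{post["title"]} by {post["author_name"]}'


def rename_author(author_id, new_name):
    # The price of duplication: a rename has to rewrite every copy.
    authors[author_id]["name"] = new_name
    for post in posts_denormalized.values():
        if post["author_id"] == author_id:
            post["author_name"] = new_name
```

The read path gets cheaper and the write path gets more expensive, so this trade only pays off for data that is read far more often than it changes.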

Look at the hoops they go through in the video to implement efficient
statistics counters. ACID properties are a costly luxury; you now have
to conserve your use of them, find where almost-consistency or
eventual consistency is enough, and exploit that fact.

Yes, database programming just got trickier, but if each of your
write transactions takes 10 ms and they must wait in a single line, then
you can only do 100 of them per second. If that is orders of magnitude
less than you need, then it is time to hack around it.
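
The arithmetic above, and one common way to hack around it for counters, can be sketched as follows. This is an illustration in Python under stated assumptions, not the datastore's or Mnesia's API: the names max_writes_per_second and ShardedCounter are hypothetical, and each list slot stands in for a record that has its own independent write queue.

```python
import random


def max_writes_per_second(txn_ms, queues=1):
    # Writes that wait in a single line are capped at 1000 / latency_ms
    # per second; independent lines multiply that cap.
    return queues * 1000 / txn_ms


class ShardedCounter:
    """One logical counter spread over several independently written slots."""

    def __init__(self, num_shards):
        self.shards = [0] * num_shards

    def increment(self):
        # Any shard will do, so concurrent writers rarely contend
        # for the same record.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1

    def value(self):
        # Reading sums all shards. In a real distributed store this sum
        # may lag slightly behind in-flight increments.
        return sum(self.shards)
```

With 10 ms transactions, max_writes_per_second(10) gives the 100 per second quoted above, and 20 shards raise the ceiling to 2000 per second, at the cost of a more expensive and only approximately current read.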


Also a word of caution: These strategies are for enormous scalability.
You only need them if you already know what problems you're facing
with your current overstrained RDBMS solution. It takes time to
implement these hacks for distributed databases, because the hacks use
application-specific knowledge that only you can know, because it is
your application.

Re: Call for Contributions: Mnesia best practices

by Taavi Talvik-2

On Jul 2, 2008, at 4:46 PM, Bob Calco wrote:
> I'm looking for thoughts from fellow Erlangers about database design &
> implementation in Mnesia. With a heavy SQL background I, like many
> relatively new Erlang folks I'm sure, have a tendency to think in terms of
> the capabilities of traditional RDBMSs, and to try to normalize every data
> model with which I come into contact.

uninformed comments..

> The question is: What is the best advice you could give a data architect
> about designing and implementing a database in Mnesia from scratch? Examples
> of the kinds of issues I'd like to see folks address:

First of all.

Mnesia is not actually a fully fledged relational database.
It is a simple key-value thingy with some query capabilities thrown in.

On the other hand it is really well distributed.

> * How to create an optimal data model for performance (vs. reporting,
> comparing the SQL way to the Mnesia way). This question is really about
> normalization in Mnesia vs. SQL, and tricks like storing whole records in
> table fields.

There is no silver bullet;)

Look at the application requirements: the queries that are used 80% of the
time deserve special attention. And not only at the database level. Probably
much more at the application level: should something be cached, precomputed,
distributed, etc.?

> * How to partition data between subsystems, without losing the illusion
> they're all one big happy system.

What? Why should application see partitioning? Hide partitioning
from consumer with some middleman.

Look at the classic example of the map -> pmap evolution:
http://www.erlang.org/ml-archive/erlang-questions/200606/msg00187.html
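
A minimal sketch of the "middleman" idea, in Python rather than Erlang: callers use one put/get interface and never see which partition holds a key. The class PartitionedStore and its hashing scheme are assumptions made for illustration; in-memory dicts stand in for tables on separate nodes, and this is not how Mnesia itself distributes data.

```python
import zlib


class PartitionedStore:
    """Middleman: callers see one store; keys are spread over partitions."""

    def __init__(self, num_partitions):
        # Each dict stands in for a table living on a separate node.
        self._partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, key):
        # A stable hash of the key picks the partition, so the same key
        # always lands in the same place.
        digest = zlib.crc32(repr(key).encode("utf-8"))
        return self._partitions[digest % len(self._partitions)]

    def put(self, key, value):
        self._partition_for(key)[key] = value

    def get(self, key, default=None):
        return self._partition_for(key).get(key, default)

    def partition_sizes(self):
        # Only an operator should need this; the application does not.
        return [len(partition) for partition in self._partitions]
```

Usage is the same as for a single table: store.put(("person", 42), record) followed by store.get(("person", 42)). The partition count can change behind this interface, although changing it with this simple modulo scheme means moving existing keys.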

> * How to handle complex clustering and failover scenarios

Programming reliable systems:
http://www.sics.se/~joe/thesis/armstrong_thesis_2003.pdf

Failures will happen - just fail fast and recover fast enough;)

> * How to handle calculation-intensive databases (for example, stock
> databases that need to constantly recalculate certain attributes for the
> purposes of sorting, searching)

Real time? Historical? Process per requesting user?

> * How to handle complex domain relationships. For example, let's say you are
> writing a CRM tool and want to store each "person" in the database. But each
> person can also be a colleague, or a client, or an incidental character
> (contact person at some organization). I.e., what do you do when there is
> inheritance in your domain model?

Ask the "person" who he is? I.e., create a separate process/server/distributed
application and ask via some protocol. Model each entity as a process that
knows all the messages that can be sent to it.

> * What are some current pitfalls or "weak spots" of Mnesia that ought to be
> avoided, however tempting they might be?

No "generic" solution for recovering from a partitioned network.
Recovery time after a crash.
HUGE datasets.
No nice tools like Oracle Enterprise Manager.
Low level, no referential constraints, i.e. it is not a fully fledged RDBMS.


best regards,
taavi

Re: Call for Contributions: Mnesia best practices

by Scott Lystig Fritchie

There's nothing like replying to a message that's well over a month
old.

Bob Calco <bobcalco@...> wrote:

bc> I'm looking for thoughts from fellow Erlangers about database design
bc> & implementation in Mnesia.
bc> [...]
bc> That kind of thing. I want to put together something of a master
bc> knowledge base on this big subject that the community can use both
bc> to promote Erlang and to promote Erlang's "best practices"
bc> implementation in the important area of serving data to
bc> applications.

Bob, I think that such a thing would be welcome.  I'd contribute from
time to time, though I confess I've less time for Open Source work than
I prefer.  Have you considered a place to host it?  I'm a fan of
Wiki-style documentation, and TrapExit(*) provides a wiki for
documenting exactly this sort of stuff.

-Scott

(*) http://www.trapexit.org and more specifically,
http://www.trapexit.org/Erlang_Wiki