Hello,
> some of you might have had already memory problems with weka.
> Especially when you start off with more than 10000 features!
> [...]
> Does anyone has experience with memory usage of RapidMiner?
RapidMiner uses a completely different data storage mechanism than Weka
and hardly ever performs deep copies of the data. As a result, its
memory usage is often lower by default. Several users have reported
that even the Weka learning schemes included in RapidMiner use (much)
less memory there than in Weka itself, which is an interesting
consequence of these data structures. The same applies to many
preprocessing steps. The reason is that we do not build Weka instances
from our data (again: no data copy here). Instead, we hand Weka a new
instances object that directly accesses the data structures of
RapidMiner, so the data is not deep-copied even during Weka operations.
For certain data mining processes, however, things are exactly the
other way round: the learning algorithms of Weka are already very
mature and highly optimized, and several of the corresponding
RapidMiner operators cannot deliver better results. So the best
solution for memory-intensive processes is often to combine the strong
points of both worlds: the mature analysis algorithms of Weka on top of
the more efficient data structures of RapidMiner.
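
To make the "no data copy" point a bit more concrete, here is a minimal
sketch of the general idea in plain Java. It is not the actual
RapidMiner or Weka code: RowAccess, TableView, TableCopy and
ViewVersusCopyDemo are hypothetical names invented for this
illustration, and a plain double[][] stands in for the real data
structures. The view only keeps a reference to the backing table, while
the copy duplicates every selected value.

```java
// Hypothetical, minimal stand-in for the interface a learner reads from.
interface RowAccess {
    int numRows();
    int numColumns();
    double value(int row, int column);
}

// Adapter that reads straight from the backing table: no values are duplicated.
final class TableView implements RowAccess {
    private final double[][] table; // shared reference to the original data
    private final int[] columns;    // indices of the selected columns

    TableView(double[][] table, int[] columns) {
        this.table = table;
        this.columns = columns;
    }

    public int numRows() { return table.length; }
    public int numColumns() { return columns.length; }
    public double value(int row, int column) { return table[row][columns[column]]; }
}

// Deep copy for comparison: every selected value is stored a second time.
final class TableCopy implements RowAccess {
    private final double[][] data;

    TableCopy(double[][] table, int[] columns) {
        data = new double[table.length][columns.length];
        for (int r = 0; r < table.length; r++) {
            for (int c = 0; c < columns.length; c++) {
                data[r][c] = table[r][columns[c]];
            }
        }
    }

    public int numRows() { return data.length; }
    public int numColumns() { return data.length == 0 ? 0 : data[0].length; }
    public double value(int row, int column) { return data[row][column]; }
}

final class ViewVersusCopyDemo {
    // A "learner" that only needs read access works with either variant.
    static double columnMean(RowAccess data, int column) {
        double sum = 0.0;
        for (int r = 0; r < data.numRows(); r++) {
            sum += data.value(r, column);
        }
        return sum / data.numRows();
    }

    public static void main(String[] args) {
        double[][] table = {{1, 10, 100}, {2, 20, 200}, {3, 30, 300}};
        int[] selected = {0, 2};

        RowAccess view = new TableView(table, selected);
        RowAccess copy = new TableCopy(table, selected);

        // Change the source after both objects were created.
        table[0][2] = 400;

        // The view reflects the change because it shares the table;
        // the copy still holds the old value in its own array.
        System.out.println("view mean: " + columnMean(view, 1));
        System.out.println("copy mean: " + columnMean(copy, 1));
    }
}
```

The extra memory of the view is just the small index array, whatever
the number of rows, which is the effect described above. The price is
one more indirection per value access.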
Another side note: many RapidMiner processes can be applied directly to
a database by setting the appropriate parameters, and in these cases
there is basically no memory restriction. And a second note: RapidMiner
is also available as a 64-bit version for cases where more than 4 GB of
memory are available on a 64-bit OS. We ourselves work on a 16 GB
machine here, and at that point the running time becomes the limiting
factor rather than memory. Both notes only help, of course, in cases
where they apply.
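
If you are not sure how much memory your Java process may actually
use, a small check like the following can help. It is only a sketch
using the standard Runtime and System classes; the class name HeapCheck
is made up for this example, and it says nothing about how the
RapidMiner or Weka start scripts configure the JVM.

```java
final class HeapCheck {
    // Maximum heap the JVM will try to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // "os.arch" is a standard system property; its value depends on the
        // JVM and platform (for example "amd64" on many 64-bit systems).
        System.out.println("Architecture: " + System.getProperty("os.arch"));
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Started with a larger heap limit, for example "java -Xmx8g HeapCheck"
on a 64-bit JVM, the reported value should grow accordingly.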
Hope that helps,
Ingo
--
Ingo Mierswa
Managing Director
Rapid-I GmbH
Stockumer Str. 475
44149 Dortmund, Germany
Phone: +49 (0)231 425 786 90
E-Mail: mierswa@...
Registered office: Dortmund
HRB 20720, Amtsgericht Dortmund
Managing directors: Ingo Mierswa, Ralf Klinkenberg
www: http://rapid-i.com/
_______________________________________________
Wekalist mailing list
Wekalist@...
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist