Re: weka's memory problem

by Sebastian Briesemeister

Hello,

>> Some of you might already have had memory problems with weka,
>> especially when you start off with more than 10000 features!
> I have run datasets with 500,000 samples and 1.3 million features
> without problems, even with non-incremental (batch) learning
> algorithms. You might be using inappropriate learning algorithms,
> could you give more details on your experiments?

I simply do a backward attribute selection with CfsSubset as the evaluator.

> Regarding why WEKA performs deep copying: this is sound programming
> practice in Java - _not_ deep copying can lead to some very hard to
> find bugs in slightly buggy code. It essentially makes your life
> easier when you develop learning algorithms for WEKA at a slight
> memory consumption penalty, which is likely to be irrelevant in
> practice.

I thought Java only performs a "real" deep copy when you make changes
to the object, and otherwise uses references that point to that object.
Even so, for feature selection it is not necessary to copy the data;
a list of indices should do.
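
To make the index idea concrete, here is a minimal sketch in plain Java.
It does not use any WEKA classes; IndexViewSketch and ColumnView are
hypothetical names made up for illustration. It also shows how Java
treats references: assignment shares the object, and Java does no
copy-on-write behind the scenes, so a copy only exists where the code
asks for one explicitly.

```java
public class IndexViewSketch {

    /** Read-only view over a row-major data matrix that exposes only selected columns. */
    static final class ColumnView {
        private final double[][] data; // shared with the caller, never copied
        private final int[] columns;   // indices of the selected features

        ColumnView(double[][] data, int[] columns) {
            this.data = data;
            this.columns = columns.clone(); // small copy: one int per selected feature
        }

        int numColumns() {
            return columns.length;
        }

        double get(int row, int viewColumn) {
            return data[row][columns[viewColumn]];
        }

        /** One backward-elimination step: a new view without one column; the data stays shared. */
        ColumnView without(int viewColumn) {
            int[] rest = new int[columns.length - 1];
            for (int i = 0, j = 0; i < columns.length; i++) {
                if (i != viewColumn) {
                    rest[j++] = columns[i];
                }
            }
            return new ColumnView(data, rest);
        }
    }

    public static void main(String[] args) {
        double[][] data = { {1.0, 2.0, 3.0}, {4.0, 5.0, 6.0} };

        // Plain assignment copies the reference, not the array.
        double[][] alias = data;
        alias[0][0] = 99.0;
        System.out.println(data[0][0]); // 99.0: the write is visible through both names

        // An explicit deep copy is independent of the original.
        double[][] copy = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            copy[i] = data[i].clone();
        }
        copy[0][0] = -1.0;
        System.out.println(data[0][0]); // still 99.0

        // Dropping feature 1 costs one small int array, not a second data matrix.
        ColumnView all = new ColumnView(data, new int[] {0, 1, 2});
        ColumnView reduced = all.without(1);
        System.out.println(reduced.numColumns()); // 2
        System.out.println(reduced.get(1, 1));    // 6.0
    }
}
```

With a view like this, each step of a backward search would allocate
memory proportional to the number of remaining features, not to the
size of the dataset. The price is the one mentioned in the quoted
reply: anything that writes to the shared data is seen by every view.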

>> Another side note: many RapidMiner processes can be directly applied
>> on a database by setting the appropriate parameters and there is
>> basically no memory restriction in these cases.
> Database access is simple for WEKA as well, see e.g.
> http://weka.sourceforge.net/wiki/index.php/Databases
> - weka.core.converters.DatabaseLoader even allows incremental
>   loading, in which case (combined with an incremental learner)
>   no memory restrictions exist as well.
>
>> And a second note: RapidMiner is also available as a 64-bit version in
>> cases where more than 4 GB of memory are available on a 64-bit OS.
>> We ourselves work here on a 16 GB machine and then the running time starts
>> to be the limiting factor.
> Any 64bit version of Java (e.g. I use Java HotSpot(TM) 64-Bit Server
> VM (build 1.5.0_04-b05), which was built somewhere in 2005) can run
> WEKA with > 4G of main memory on a 64bit OS. So that again is not a
> limitation of WEKA, but of Java itself - a 32bit JVM is only able to
> address slightly less than 2GB of memory.

Right! I also had no problem addressing more than 4 GB. The only
really urgent problem is the extreme memory waste for simple operations
such as attribute selection.
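
For anyone who wants to check how much heap their JVM will actually
use, a small sketch using only the standard Runtime API may help.
HeapCheck and its two helper methods are hypothetical names; the
reported values depend on the JVM and its start-up options.

```java
public class HeapCheck {

    private static final long MB = 1024L * 1024L;

    /** Upper bound on the heap the JVM will try to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently occupied (allocated minus free), in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("os.arch        : " + System.getProperty("os.arch"));
        System.out.println("max heap (MB)  : " + maxHeapMb());
        System.out.println("used heap (MB) : " + usedHeapMb());
    }
}
```

Starting it as "java -Xmx8g HeapCheck" on a 64-bit JVM should report a
maximum of roughly 8 GB; the same -Xmx option applies when starting
WEKA.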

Cheers,
Sebastian

_______________________________________________
Wekalist mailing list
Wekalist@...
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
