Re: weka's memory problem

by Mark Hall-9


On 2/07/2008, at 10:35 PM, Sebastian Briesemeister wrote:

> Hello,
>
>>> some of you might have had already memory problems with weka.
>>> Especially when you start off with more than 10000 features!
>> I have run datasets with 500,000 samples and 1,3 million features
>> without problems, even with non-incremental (batch) learning
>> algorithms. You might be using inappropriate learning algorithms,
>> could you give more details on your experiments?
>
> I simply do a backward attribute selection with CfsSubset as  
> evaluator.
CfsSubsetEval discretizes all numeric attributes (if the class is
discrete), which creates another copy of your data. It also computes
a correlation matrix. For 20,000 attributes, that's 3.2 GB right
there :-)
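
The figure of roughly 3.2 gigabytes can be reproduced with a quick
back-of-the-envelope calculation. The sketch below assumes the
correlation matrix is stored as a dense square array with one 8-byte
double per attribute pair; that storage layout is an assumption made
for illustration, not something confirmed from the CfsSubsetEval
source. The class and method names are hypothetical.

```java
import java.util.Locale;

public class CorrelationMatrixMemory {

    // Bytes needed for a dense numAttributes x numAttributes matrix.
    // Assumption: one fixed-size entry per attribute pair, no sparsity,
    // no triangular storage, and no per-array object overhead.
    public static long denseMatrixBytes(long numAttributes, int bytesPerEntry) {
        return numAttributes * numAttributes * bytesPerEntry;
    }

    public static void main(String[] args) {
        long bytes = denseMatrixBytes(20_000, Double.BYTES);
        // 20,000 * 20,000 * 8 = 3,200,000,000 bytes
        System.out.println(String.format(Locale.ROOT, "%.1f GB", bytes / 1e9));
    }
}
```

Because the size grows with the square of the attribute count, halving
the number of attributes before running the selection would cut this
estimate to a quarter under the same assumptions.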

Cheers,
Mark.

--
Mark Hall
Senior Developer/Consultant, Pentaho Open Source Business Intelligence
Citadel International, Suite 340, 5950 Hazeltine National Dr.,
Orlando, FL 32822, USA
+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
Download the latest release today
<http://www.sourceforge.net/projects/pentaho>




_______________________________________________
Wekalist mailing list
Wekalist@...
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
