Doubt in Experimenter

Doubt in Experimenter

by Thiago Ferreira

Hi,

     Using the Explorer GUI for multi-class problems, the statistics produced by the classifier (area under ROC, FP, TP...) are reported class by class (treating each class in turn as one-vs-all, right?).
     But in the Experimenter GUI, the same problem yields only one area under ROC. How is that single area calculated?
     I tried to find this information in the manual but had no success.


Regards,
Thiago

_______________________________________________
Wekalist mailing list
Wekalist@...
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist

Re: Doubt in Experimenter

by Mark Hall-9

The default is to compute IR statistics and AUC for the first class  
value. You can change this, but you will have to switch to using the  
"Advanced" mode of the Experimenter. Assuming you are using the  
development version of Weka, and once in the advanced mode, click on  
the "Result generator" to bring up the GenericObjectEditor, then  
click on the "splitEvaluator" option to bring up a second GOE for the  
ClassifierSplitEvaluator. In the ClassifierSplitEvaluator you will  
see an option called "classForIRStatistics" - change this value to  
the index (zero-based) of the class value to be considered the  
"positive" class.

HTH.

Cheers,
Mark.
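
To make the idea of a single "positive" class index concrete, here is a minimal sketch in plain Python. It is not Weka code and does not claim to reproduce Weka's implementation: the function name one_vs_rest_auc and the toy data are made up for illustration. It computes a one-vs-rest AUC with the rank-based (Mann-Whitney) formulation, for whichever zero-based class index is treated as positive.

```python
from typing import Sequence


def one_vs_rest_auc(scores: Sequence[float], labels: Sequence[int],
                    positive_class: int) -> float:
    """AUC with one class treated as positive and all others as negative.

    scores[i] is the predicted probability of `positive_class` for instance i;
    labels[i] is the true class index (zero-based).
    """
    pos = [s for s, y in zip(scores, labels) if y == positive_class]
    neg = [s for s, y in zip(scores, labels) if y != positive_class]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    # Fraction of (positive, negative) pairs that are ranked correctly,
    # counting tied scores as half a correct pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical three-class problem: one probability distribution per instance.
true_labels = [0, 0, 1, 1, 2, 2]
class_probs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]

# One AUC per class; picking a single index selects a single value.
for class_index in range(3):
    class_scores = [dist[class_index] for dist in class_probs]
    print(class_index, one_vs_rest_auc(class_scores, true_labels, class_index))
```

Under this reading, the Explorer-style output corresponds to the whole loop, while a single configured index (0 by default, per the message above) corresponds to one iteration of it.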

--
Mark Hall
Senior Developer/Consultant, Pentaho Open Source Business Intelligence
Citadel International, Suite 340, 5950 Hazeltine National Dr.,
Orlando, FL 32822, USA
+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
Download the latest release today <http://www.sourceforge.net/projects/pentaho>





Re: Doubt in Experimenter

by Thiago Ferreira

Mark,

       Is there a way to retrieve the IR statistics for every class without rerunning the experiment once per class? For deterministic schemes the only cost would be computation time, but for a non-deterministic scheme the reruns could invalidate the other results.
      I know the Explorer GUI reports these per-class statistics, but standard deviations, for example, are only output by the Experimenter GUI. That's why I'm asking (I'm new to the Experimenter GUI).

Thanks for the help,
Thiago

Re: Doubt in Experimenter

by Mark Hall-9

On 3/07/2008, at 1:56 AM, Thiago Ferreira wrote:

> Mark,
>
>        Is there a way to retrieve the IR statistics for each class  
> without running each experiment to output the IR for each class  
> (for deterministic schemes the only problem would be computational  
> time, but for a non-deterministic scheme that could invalidate the  
> other results)?
>       I know that this is done by the Explorer GUI, but, standard  
> deviations for example, are only output in the Experimenter GUI,  
> that's why I'm asking this (I'm new in ExperimenterGUI).

I'm afraid there is no mechanism for that at present in the
Experimenter. There would need to be a way of specifying which class
you are interested in when the analysis is done, but that becomes
awkward, to say the least, when your experiment contains multiple
data sets and each has a different number of class values.

Cheers,
Mark.

Re: Doubt in Experimenter

by Thiago Ferreira

Mark,

    The analysis could output one row of information per class,
as is done in the Explorer.
    Another question: in the Experimenter, it looks like the class-for-IR
property inside the splitEvaluator is set once for the entire
experiment, not once per dataset. Is that right?


Regards,
Thiago

Re: Doubt in Experimenter

by Peter Reutemann

>    In the analysis it could output a row for each class information,
> like is done in Explorer.

With several hundred class labels, this would blow up your results
database quite dramatically.

>    Another question, in the Experimenter it looks like the property
> class for IR inside the splitEvaluator is one for the entire
> experiment and not one for each dataset. Is that right?

Yes, that's correct. The setting is for the complete experiment setup.

Cheers, Peter
--
Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
http://www.cs.waikato.ac.nz/~fracpete/ Ph. +64 (7) 858-5174

Re: Doubt in Experimenter

by Thiago Ferreira

Peter,

        That's true. Maybe it could be an option (e.g., setting class for IR = -1, or another flag that makes it ignore the class for IR), so that the extra output is produced only if the user specifically asks for it.
        Consider this scenario: I want to run a cross-validation experiment on Iris using NB. If I use the Explorer GUI, I can obtain the mean results of the cross-validation, but not the standard deviations (and I can't calculate them myself because I don't have the results for each fold). If I use the Experimenter GUI, I can get the mean and the standard deviation for many measures, but not the IR statistics for each class, and I also don't have that information on a per-fold basis to calculate it myself.
       Another option would be to add the standard deviations to the Explorer GUI output, which would keep things simple, but IMO it would be a nice addition to the Experimenter GUI if that information could be retrieved there as well.

Regards,
Thiago
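
The calculation described in the scenario above (per-class statistics collected fold by fold, then averaged with a standard deviation) can be sketched in plain Python. This is only an illustration under stated assumptions, not something either Weka GUI provides: the helper names per_class_recall and summarize_folds and the fold data are hypothetical, recall stands in for the IR statistics in general, and the sample standard deviation is used, which may differ from what Weka reports.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple

Fold = Tuple[Sequence[int], Sequence[int]]  # (actual labels, predicted labels)


def per_class_recall(actual: Sequence[int], predicted: Sequence[int],
                     num_classes: int) -> List[float]:
    """Recall (true positive rate) of every class, each taken one-vs-rest."""
    recalls = []
    for c in range(num_classes):
        total = sum(1 for a in actual if a == c)
        hits = sum(1 for a, p in zip(actual, predicted) if a == c and p == c)
        # A class missing from a fold has no defined recall.
        recalls.append(hits / total if total else float("nan"))
    return recalls


def summarize_folds(folds: Sequence[Fold],
                    num_classes: int) -> Dict[int, Tuple[float, float]]:
    """Mean and sample standard deviation of per-class recall across folds."""
    per_fold = [per_class_recall(a, p, num_classes) for a, p in folds]
    summary = {}
    for c in range(num_classes):
        values = [fold_recalls[c] for fold_recalls in per_fold]
        summary[c] = (mean(values), stdev(values))
    return summary


# Hypothetical predictions from three folds of a three-class problem.
folds = [
    ([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]),
    ([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 1]),
    ([0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 2, 2]),
]

for class_index, (avg, sd) in summarize_folds(folds, 3).items():
    print(f"class {class_index}: recall mean={avg:.3f} stdev={sd:.3f}")
```

The point of the sketch is that the per-fold, per-class values are the missing ingredient: once they are available, the mean and standard deviation for every class follow from a single run.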

