<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
	<id>tag:www.nabble.com,2006:forum-435</id>
	<title>Nabble - WEKA</title>
	<updated>2008-10-11T02:04:50Z</updated>
	<link rel="self" type="application/atom+xml" href="http://www.nabble.com/WEKA-f435.xml" />
	<link rel="alternate" type="text/html" href="http://www.nabble.com/WEKA-f435.html" />
	<subtitle type="html">WEKA machine learning software discussion</subtitle>
	
<entry>
	<id>tag:www.nabble.com,2006:post-19930872</id>
	<title>Re: Feature selection</title>
	<published>2008-10-11T02:04:50Z</published>
	<updated>2008-10-11T02:04:50Z</updated>
	<author>
		<name>Ashwin Ittoo</name>
	</author>
	<content type="html">Association mining is well documented in Weka; try the Explorer or the
&lt;br&gt;Java API.
&lt;br&gt;&lt;br&gt;Pre-processing: for text, use filters such as StringToWordVector or
&lt;br&gt;NominalToString (I don't remember the name of the second one exactly).
&lt;br&gt;Feature selection: for text, I usually use latent semantic analysis with
&lt;br&gt;the Ranker search.
&lt;br&gt;&lt;br&gt;For non-text data, I normally use AttributeSelection, which lets you
&lt;br&gt;specify which attribute evaluation method and which search method to use
&lt;br&gt;when searching for the best attributes.
&lt;br&gt;The Weka wiki also discusses these issues; search the web for "weka wiki", then
&lt;br&gt;search the wiki for these topics.
&lt;br&gt;&lt;br&gt;&lt;br&gt;ashwin
&lt;br&gt;&lt;br&gt;&lt;br&gt;svpriyan wrote:
&lt;div class='shrinkable-quote'&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; I am actually doing an experimental study with a few WEKA
&lt;br&gt;&amp;gt; classification/clustering/ association analysis techniques on a few UCI (or
&lt;br&gt;&amp;gt; other) datasets.
&lt;br&gt;&amp;gt; I know how to do Preprocessing in WEKA, also I got the idea of comparing to
&lt;br&gt;&amp;gt; classifiers
&lt;br&gt;&amp;gt; My problems are
&lt;br&gt;&amp;gt; •	How I can do Association analysis for my data set
&lt;br&gt;&amp;gt; { if I &amp;nbsp;get some step by step hints I can try}
&lt;br&gt;&amp;gt; •	Further preprocessing
&lt;br&gt;&amp;gt; I have to look on some of these techniques too in WEKA
&lt;br&gt;&amp;gt; –	Vector space for clustering/classification?
&lt;br&gt;&amp;gt; –	Feature selection/transformation
&lt;br&gt;&amp;gt; How I can do these for my preprocessed data.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Vass
&lt;/div&gt;&lt;/div&gt;&lt;br&gt;&lt;br /&gt;_______________________________________________
&lt;br&gt;Wekalist mailing list
&lt;br&gt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19930872&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;Wekalist@...&lt;/a&gt;
&lt;br&gt;&lt;a href=&quot;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&lt;/a&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19913845p19930872.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19930489</id>
	<title>Re: Feature selection</title>
	<published>2008-10-11T00:55:46Z</published>
	<updated>2008-10-11T00:55:46Z</updated>
	<author>
		<name>Giorgio Corani-2</name>
	</author>
	<content type="html">Hi, there are plenty of such techniques (association analysis, feature
&lt;br&gt;selection, etc.) in WEKA.
&lt;br&gt;I guess you can learn how to use them if you dedicate an hour of your
&lt;br&gt;time to reading the WEKA user manual.
&lt;br&gt;&lt;br&gt;regards
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;On Sat, Oct 11, 2008 at 9:50 AM, svpriyan &amp;lt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19930489&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;svpriyan@...&lt;/a&gt;&amp;gt; wrote:
&lt;div class='shrinkable-quote'&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I am actually doing an experimental study with a few WEKA
&lt;br&gt;&amp;gt; classification/clustering/ association analysis techniques on a few UCI (or
&lt;br&gt;&amp;gt; other) datasets.
&lt;br&gt;&amp;gt; I know how to do Preprocessing in WEKA, also I got the idea of comparing to
&lt;br&gt;&amp;gt; classifiers
&lt;br&gt;&amp;gt; My problems are
&lt;br&gt;&amp;gt; • &amp;nbsp; &amp;nbsp; &amp;nbsp; How I can do Association analysis for my data set
&lt;br&gt;&amp;gt; { if I &amp;nbsp;get some step by step hints I can try}
&lt;br&gt;&amp;gt; • &amp;nbsp; &amp;nbsp; &amp;nbsp; Further preprocessing
&lt;br&gt;&amp;gt; I have to look on some of these techniques too in WEKA
&lt;br&gt;&amp;gt; – &amp;nbsp; &amp;nbsp; &amp;nbsp; Vector space for clustering/classification?
&lt;br&gt;&amp;gt; – &amp;nbsp; &amp;nbsp; &amp;nbsp; Feature selection/transformation
&lt;br&gt;&amp;gt; How I can do these for my preprocessed data.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Vass
&lt;/div&gt;&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19913845p19930489.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19930464</id>
	<title>Re: Feature selection</title>
	<published>2008-10-11T00:50:57Z</published>
	<updated>2008-10-11T00:50:57Z</updated>
	<author>
		<name>svpriyan</name>
	</author>
	<content type="html">&lt;br&gt;I am doing an experimental study with a few WEKA classification/clustering/association analysis techniques on a few UCI (or other) datasets.
&lt;br&gt;I know how to do preprocessing in WEKA, and I also have an idea of how to compare two classifiers.
&lt;br&gt;My problems are:
&lt;br&gt;•	How can I do association analysis for my dataset?
&lt;br&gt;(If I get some step-by-step hints, I can try.)
&lt;br&gt;•	Further preprocessing
&lt;br&gt;I also have to look at some of these techniques in WEKA:
&lt;br&gt;–	Vector space for clustering/classification?
&lt;br&gt;–	Feature selection/transformation
&lt;br&gt;How can I do these for my preprocessed data?
&lt;br&gt;&lt;br&gt;Vass
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19913845p19930464.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19927810</id>
	<title>RBF network width</title>
	<published>2008-10-10T16:54:38Z</published>
	<updated>2008-10-10T16:54:38Z</updated>
	<author>
		<name>Lloyd Smith-3</name>
	</author>
	<content type="html">&lt;div dir=&quot;ltr&quot;&gt;How does Weka calculate the width of a hidden node in the RBF network?&lt;br&gt;&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/RBF-network-width-tp19927810p19927810.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19927782</id>
	<title>Re: Feature selection</title>
	<published>2008-10-10T16:50:03Z</published>
	<updated>2008-10-10T16:50:03Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp;Further preprocessing
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; – &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Vector space for clustering/classification?
&lt;br&gt;&amp;gt; – &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Feature selection/transformation
&lt;br&gt;&amp;gt; Could you please give some hints to do this.
&lt;br&gt;&lt;br&gt;Could you please elaborate on this?
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19913845p19927782.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19927766</id>
	<title>Re: Could anyone tell me</title>
	<published>2008-10-10T16:48:55Z</published>
	<updated>2008-10-10T16:48:55Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I'm using WEKA classifiers to classify my data. I was using preset 10 fold
&lt;br&gt;&amp;gt; cross validation and getting ROC values ranging from 0.545 to 0.773. Today I
&lt;br&gt;&amp;gt; accidentally clicked Percentage Split 66 and ROC values for some of the
&lt;br&gt;&amp;gt; classifiers went upto 0.864. Could anyone be able to explain this to me and
&lt;br&gt;&amp;gt; is there anyway to improve the ROC values. I would appreciate that.
&lt;br&gt;&lt;br&gt;Classifiers are quite often sensitive to the order in which the data is presented.
&lt;br&gt;A single run of percentage split will most likely produce results that
&lt;br&gt;are either too pessimistic or too optimistic. 10-fold CV gives you a
&lt;br&gt;better estimate. Even better is the approach that the Weka Experimenter
&lt;br&gt;uses: 10 runs of 10-fold CV.
&lt;br&gt;&lt;br&gt;BTW the Explorer is only a tool for &amp;quot;exploring&amp;quot; the data, having a
&lt;br&gt;play with it, getting a feel for it. If you want reliable statistical
&lt;br&gt;results, use the Experimenter.
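&lt;br&gt;&lt;br&gt;To make the difference concrete, here is a minimal plain-Java sketch. It is
&lt;br&gt;an illustration only: it does not use Weka's classes or reproduce Weka's exact
&lt;br&gt;splitting logic, and the class name FoldSketch, the seed and the 30-instance
&lt;br&gt;dataset size are all made up. It counts how often each instance lands in a test set.
&lt;pre&gt;
```java
import java.util.Random;

// Illustration only: plain Java, no Weka classes are used here.
class FoldSketch {

    // Shuffle the instance indices 0..n-1 with a seeded Fisher-Yates pass.
    static int[] shuffledIndices(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i != n; i++) {
            order[i] = i;
        }
        Random rnd = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }

    // Count how often each instance is used for testing in one run of k-fold CV.
    static int[] testCountsForCV(int n, int k, long seed) {
        int[] order = shuffledIndices(n, seed);
        int[] counts = new int[n];
        for (int fold = 0; fold != k; fold++) {
            // Shuffled position p belongs to fold (p mod k).
            for (int p = fold; n > p; p += k) {
                counts[order[p]]++;
            }
        }
        return counts;
    }

    // Count how often each instance is used for testing in one percentage split.
    static int[] testCountsForSplit(int n, int trainPercent, long seed) {
        int[] order = shuffledIndices(n, seed);
        int[] counts = new int[n];
        int trainSize = (int) Math.round(n * trainPercent / 100.0);
        // Everything after the training portion is the test set.
        for (int p = trainSize; n > p; p++) {
            counts[order[p]]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 30;
        int tested = 0;
        for (int c : testCountsForSplit(n, 66, 1L)) {
            tested += c;
        }
        System.out.println("66% split: " + tested + " of " + n + " instances are tested");
        tested = 0;
        for (int c : testCountsForCV(n, 10, 1L)) {
            tested += c;
        }
        System.out.println("10-fold CV: " + tested + " of " + n + " instances are tested");
    }
}
```
&lt;/pre&gt;
&lt;br&gt;Under these assumptions, the single 66% split tests only 10 of the 30 instances,
&lt;br&gt;picked by one shuffle, while one run of 10-fold CV tests every instance exactly once.
&lt;br&gt;That is one reason the CV estimate depends less on a lucky or unlucky shuffle.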
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Could-anyone-tell-me-tp19927353p19927766.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19927353</id>
	<title>Could anyone tell me</title>
	<published>2008-10-10T16:00:49Z</published>
	<updated>2008-10-10T16:00:49Z</updated>
	<author>
		<name>Mehdi Satter-2</name>
	</author>
	<content type="html">&lt;div dir=&quot;ltr&quot;&gt;&lt;p&gt;Hi,&lt;br&gt;&lt;/p&gt;&lt;p&gt;I&amp;#39;m using WEKA classifiers to classify my data. I was using the preset 10-fold cross-validation and getting ROC values ranging from 0.545 to 0.773. Today I accidentally clicked Percentage Split 66 and the ROC values for some of the classifiers went up to 0.864. Could anyone explain this to me, and is there any way to improve the ROC values? I would appreciate that.&lt;/p&gt;
&lt;p&gt;Thanks!&lt;/p&gt;&lt;p&gt;-Mehdi&lt;/p&gt;&lt;/div&gt;
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Could-anyone-tell-me-tp19927353p19927353.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19913845</id>
	<title>Feature selection</title>
	<published>2008-10-10T01:12:21Z</published>
	<updated>2008-10-10T01:12:21Z</updated>
	<author>
		<name>svpriyan</name>
	</author>
	<content type="html">&lt;br&gt;Dear Sir,
&lt;br&gt;&amp;nbsp; &amp;nbsp;
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; Further preprocessing
&lt;br&gt;&lt;br&gt;– &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Vector space for clustering/classification?
&lt;br&gt;– &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Feature selection/transformation
&lt;br&gt;Could you please give some hints to do this.
&lt;br&gt;&lt;br&gt;Vass
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19913845p19913845.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19912890</id>
	<title>Feature selection</title>
	<published>2008-10-09T23:49:14Z</published>
	<updated>2008-10-09T23:49:14Z</updated>
	<author>
		<name>svpriyan</name>
	</author>
	<content type="html">Dear Sir,
&lt;br&gt;&amp;nbsp; &amp;nbsp; 
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; Further preprocessing
&lt;br&gt;&lt;br&gt;– &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Vector space for clustering/classification?
&lt;br&gt;– &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Feature selection/transformation
&lt;br&gt;Could you please give some hints to do this.
&lt;br&gt;&lt;br&gt;Vass</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Feature-selection-tp19912890p19912890.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19905156</id>
	<title>Re: predicted class in 10 fold cross validation</title>
	<published>2008-10-09T11:57:24Z</published>
	<updated>2008-10-09T11:57:24Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I have done 10 fold cross validation in Weka using LWL classifier on &amp;nbsp;a data
&lt;br&gt;&amp;gt; set with &amp;nbsp;30 instances. Is there a way to find out the predicted class for
&lt;br&gt;&amp;gt; each of these 30 instances?
&lt;br&gt;&lt;br&gt;- In the Explorer: check &amp;quot;Output predictions&amp;quot; in the &amp;quot;More options...&amp;quot; dialog
&lt;br&gt;- On the command line (since 3.5.8): use the &amp;quot;-p&amp;quot; option (for a
&lt;br&gt;numeric class, you need to download a developer snapshot from the Weka
&lt;br&gt;homepage, since the 3.5.8 release contains a bug: it doesn't output
&lt;br&gt;anything)
&lt;br&gt;- From Java: use the &amp;quot;predictions()&amp;quot; method of the
&lt;br&gt;weka.classifiers.Evaluation class (nominal classes only)
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/predicted-class-in-10-fold-cross-validation-tp19901630p19905156.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19905012</id>
	<title>Re: Convert CSV 2 ARFF in Java code?</title>
	<published>2008-10-09T11:50:21Z</published>
	<updated>2008-10-09T11:50:21Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; Thank you for your email, actually I need help on detailed steps of how to predict phosphorylation sites with my existing file.
&lt;br&gt;&amp;gt; Would you please tell me how to do that?
&lt;br&gt;&lt;br&gt;See the FAQ &amp;quot;How do I use Weka's classes in my own code?&amp;quot; for more
&lt;br&gt;information on how to use Weka's API from your own code. A link to the
&lt;br&gt;FAQs is available from the Weka homepage.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Convert-CSV-2-ARFF-in-Java-code--tp19558068p19905012.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19904970</id>
	<title>Re: java code for predicting contineous values using MLP-neural network</title>
	<published>2008-10-09T11:47:24Z</published>
	<updated>2008-10-09T11:47:24Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I was to get the predicted values using java code if i am having predictable
&lt;br&gt;&amp;gt; column as discrete
&lt;br&gt;&amp;gt; but i am facing a problem if the preditable column is contineous............
&lt;br&gt;&amp;gt; i am trying the code for multilayer perceptron.............
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; can any one give me a sample java code for prediction for multilayer
&lt;br&gt;&amp;gt; perceptron for contineous values
&lt;br&gt;&lt;br&gt;You just use the classifyInstance method, like in the example here:
&lt;br&gt;&lt;a href=&quot;http://weka.sourceforge.net/wiki/index.php/Use_Weka_in_your_Java_code#Classifying_instances&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://weka.sourceforge.net/wiki/index.php/Use_Weka_in_your_Java_code#Classifying_instances&lt;/a&gt;&lt;br&gt;&lt;br&gt;For a numeric class attribute, this method returns the numeric prediction.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/java-code-for-predicting-contineous-values-using-MLP-neural-network-tp19895586p19904970.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19902782</id>
	<title>RE: Convert CSV 2 ARFF in Java code?</title>
	<published>2008-10-09T09:39:22Z</published>
	<updated>2008-10-09T09:39:22Z</updated>
	<author>
		<name>Bioinfo</name>
	</author>
	<content type="html">&lt;br&gt;Dear Peter,
&lt;br&gt;&lt;br&gt;&lt;br&gt;Thank you for your email. Actually, I need help with the detailed steps for predicting phosphorylation sites with my existing file.
&lt;br&gt;Would you please tell me how to do that?
&lt;br&gt;&lt;br&gt;Thanks,
&lt;br&gt;Bahareh
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;-----Original Message-----
&lt;br&gt;From: &lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19902782&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;wekalist-bounces@...&lt;/a&gt; on behalf of Peter Reutemann
&lt;br&gt;Sent: Mon 06/10/2008 4:22 PM
&lt;br&gt;To: Weka machine learning workbench list.
&lt;br&gt;Subject: Re: [Wekalist] Convert CSV 2 ARFF in Java code?
&lt;br&gt;&amp;nbsp;
&lt;br&gt;&amp;gt; I am a new WEKA user and I don't know how to convert my CSV file to arff. I
&lt;br&gt;&amp;gt; use eclipse for java programming but it doesn't recognize weka. Would you
&lt;br&gt;&amp;gt; please help me to find out where I have to type this code?
&lt;br&gt;&lt;br&gt;[...]
&lt;br&gt;&lt;br&gt;Steps involved:
&lt;br&gt;- Create a new project (or use an existing one)
&lt;br&gt;- add the weka.jar as (external) jar under &amp;quot;Libraries&amp;quot;
&lt;br&gt;&amp;nbsp; (or, in an existing project, under &amp;quot;Project -&amp;gt; Properties -&amp;gt; Java
&lt;br&gt;Build Path -&amp;gt; Libraries&amp;quot;)
&lt;br&gt;- create a new package, e.g., &amp;quot;myweka&amp;quot;
&lt;br&gt;- add the CSV2Arff class in that package and set the package of the
&lt;br&gt;source file to &amp;quot;myweka&amp;quot;
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Convert-CSV-2-ARFF-in-Java-code--tp19558068p19902782.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19901630</id>
	<title>predicted class in 10 fold cross validation</title>
	<published>2008-10-09T08:44:06Z</published>
	<updated>2008-10-09T08:44:06Z</updated>
	<author>
		<name>reddymettu</name>
	</author>
	<content type="html">&lt;html&gt;&lt;head&gt;&lt;/head&gt;&lt;body&gt;&lt;div style=&quot;font-family:times new roman,new york,times,serif;font-size:10pt&quot;&gt;&lt;div&gt;Hi all,&lt;br&gt;&lt;br&gt;I have done 10 fold cross validation in Weka using LWL classifier on&amp;nbsp; a data set with&amp;nbsp; 30 instances. Is there a way to find out the predicted class for each of these 30 instances?&lt;br&gt;&lt;br&gt;Thanks &lt;br&gt;Rk&lt;br&gt;&lt;/div&gt;&lt;/div&gt;&lt;br&gt;
&lt;/body&gt;&lt;/html&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/predicted-class-in-10-fold-cross-validation-tp19901630p19901630.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19901293</id>
	<title>RE: Convert CSV 2 ARFF in Java code?</title>
	<published>2008-10-09T08:22:28Z</published>
	<updated>2008-10-09T08:22:28Z</updated>
	<author>
		<name>Bioinfo</name>
	</author>
	<content type="html">Dear Peter,
&lt;br&gt;&lt;br&gt;&lt;br&gt;Thank you for your email. Actually, I need help with the detailed steps for predicting phosphorylation sites with my existing file.
&lt;br&gt;&lt;br&gt;Thanks,
&lt;br&gt;Bahareh
&lt;br&gt;&lt;br&gt;-----Original Message-----
&lt;br&gt;From: &lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19901293&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;wekalist-bounces@...&lt;/a&gt; on behalf of Peter Reutemann
&lt;br&gt;Sent: Mon 06/10/2008 4:22 PM
&lt;br&gt;To: Weka machine learning workbench list.
&lt;br&gt;Subject: Re: [Wekalist] Convert CSV 2 ARFF in Java code?
&lt;br&gt;&amp;nbsp;
&lt;br&gt;&amp;gt; I am a new WEKA user and I don't know how to convert my CSV file to arff. I
&lt;br&gt;&amp;gt; use eclipse for java programming but it doesn't recognize weka. Would you
&lt;br&gt;&amp;gt; please help me to find out where I have to type this code?
&lt;br&gt;&lt;br&gt;[...]
&lt;br&gt;&lt;br&gt;Steps involved:
&lt;br&gt;- Create a new project (or use an existing one)
&lt;br&gt;- add the weka.jar as (external) jar under &amp;quot;Libraries&amp;quot;
&lt;br&gt;&amp;nbsp; (or, in an existing project, under &amp;quot;Project -&amp;gt; Properties -&amp;gt; Java
&lt;br&gt;Build Path -&amp;gt; Libraries&amp;quot;)
&lt;br&gt;- create a new package, e.g., &amp;quot;myweka&amp;quot;
&lt;br&gt;- add the CSV2Arff class in that package and set the package of the
&lt;br&gt;source file to &amp;quot;myweka&amp;quot;
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Convert-CSV-2-ARFF-in-Java-code--tp19558068p19901293.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19895586</id>
	<title>java code for predicting contineous values using MLP-neural network</title>
	<published>2008-10-09T02:45:16Z</published>
	<updated>2008-10-09T02:45:16Z</updated>
	<author>
		<name>sijusony</name>
	</author>
	<content type="html">hi,
&lt;br&gt;&lt;br&gt;I am new to WEKA. I was able to get the predicted values using Java code when the column to predict is discrete,
&lt;br&gt;but I am facing a problem when the column to predict is continuous.
&lt;br&gt;I am trying the code for the multilayer perceptron.
&lt;br&gt;&lt;br&gt;Can anyone give me sample Java code for prediction with a multilayer perceptron for continuous values?</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/java-code-for-predicting-contineous-values-using-MLP-neural-network-tp19895586p19895586.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19895481</id>
	<title>Re: ROC for cross validation</title>
	<published>2008-10-09T02:36:45Z</published>
	<updated>2008-10-09T02:36:45Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&lt;div class='shrinkable-quote'&gt;&amp;gt; So it's not a very sophisticated way to make a ROC for cross validations -
&lt;br&gt;&amp;gt; there is no averaging being done as is the case for other methods like
&lt;br&gt;&amp;gt; Vertical Averaging or Threshold Averaging for instance.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; There's no need for averaging with just 1 run of CV.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Hmm, I do not agree here. You could still build the average of the curves of
&lt;br&gt;&amp;gt; all folds to see how &amp;quot;stable&amp;quot; the curves are located in their space. Here
&lt;br&gt;&amp;gt; is, for example, how this is done by RapidMiner for a cross validation of
&lt;br&gt;&amp;gt; one learning scheme (in red are the ROC values and in blue are the
&lt;br&gt;&amp;gt; thresholds):
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &lt;a href=&quot;http://www.rapid-i.com/roc/RapidMiner_ROC_1.jpg&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.rapid-i.com/roc/RapidMiner_ROC_1.jpg&lt;/a&gt;&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; This gives you an idea how robust a threshold selection for example in
&lt;br&gt;&amp;gt; cost-sensitive learning is. And here is another result, this time of a
&lt;br&gt;&amp;gt; 10-fold cross validation of three different learning schemes (generated with
&lt;br&gt;&amp;gt; the RapidMiner operator &amp;quot;ROCComparison&amp;quot;):
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &lt;a href=&quot;http://www.rapid-i.com/roc/RapidMiner_ROC_2.jpg&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.rapid-i.com/roc/RapidMiner_ROC_2.jpg&lt;/a&gt;&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Maybe these pictures give you an idea why something like this can actually
&lt;br&gt;&amp;gt; help.
&lt;/div&gt;&lt;br&gt;True, does anybody feel like contributing some code? ;-)
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/ROC-for-cross-validation-tp19892729p19895481.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19895446</id>
	<title>Re: ROC for cross validation</title>
	<published>2008-10-09T02:33:48Z</published>
	<updated>2008-10-09T02:33:48Z</updated>
	<author>
		<name>Ingo Mierswa</name>
	</author>
	<content type="html">&lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD HTML 4.01 Transitional//EN&quot;&gt;
&lt;html&gt;
&lt;head&gt;
  &lt;meta content=&quot;text/html;charset=ISO-8859-1&quot; http-equiv=&quot;Content-Type&quot;&gt;
  &lt;title&gt;&lt;/title&gt;
&lt;/head&gt;
&lt;body bgcolor=&quot;#ffffff&quot; text=&quot;#000000&quot;&gt;
Hi,&lt;br&gt;
&lt;br&gt;
&lt;blockquote cite=&quot;mid:548e07050810090038s66c38ca1if8bbc3b4811f445@mail.gmail.com&quot; type=&quot;cite&quot;&gt;
  &lt;blockquote type=&quot;cite&quot;&gt;
    &lt;pre wrap=&quot;&quot;&gt;So it's not a very sophisticated way to make a ROC for cross validations -
there is no averaging being done as is the case for other methods like
Vertical Averaging or Threshold Averaging for instance.
    &lt;/pre&gt;
  &lt;/blockquote&gt;
  &lt;pre wrap=&quot;&quot;&gt;
There's no need for averaging with just 1 run of CV.
  &lt;/pre&gt;
&lt;/blockquote&gt;
&lt;br&gt;
Hmm, I do not agree here. You could still average the curves of all
folds to see how &quot;stable&quot; they are across folds. Here is, for example,
how RapidMiner does this for a cross validation of one learning scheme
(the ROC values are in red and the thresholds are in
blue):&lt;br&gt;
&lt;br&gt;
&lt;a class=&quot;moz-txt-link-freetext&quot; href=&quot;http://www.rapid-i.com/roc/RapidMiner_ROC_1.jpg&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.rapid-i.com/roc/RapidMiner_ROC_1.jpg&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
This gives you an idea of how robust a threshold selection is, for
example in cost-sensitive learning. And here is another result, this
time from a 10-fold cross validation of three different learning schemes
(generated with the RapidMiner operator &quot;ROCComparison&quot;):&lt;br&gt;
&lt;br&gt;
&lt;a class=&quot;moz-txt-link-freetext&quot; href=&quot;http://www.rapid-i.com/roc/RapidMiner_ROC_2.jpg&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.rapid-i.com/roc/RapidMiner_ROC_2.jpg&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
Maybe these pictures give you an idea why something like this can
actually help.&lt;br&gt;
&lt;br&gt;
Cheers,&lt;br&gt;
Ingo&lt;br&gt;
&lt;br&gt;
&lt;pre class=&quot;moz-signature&quot; cols=&quot;72&quot;&gt;-- 
Ingo Mierswa
Managing Director

Rapid-I GmbH
Stockumer Str. 475
44149 Dortmund, Germany

Phone: +49 (0)231 425 786 90

E-Mail:  &lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19895446&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;mierswa@...&lt;/a&gt;

Sitz: Dortmund
HRB 20720, Amtsgericht Dortmund
Gesch&amp;auml;ftsf&amp;uuml;hrer: Ingo Mierswa, Ralf Klinkenberg

www: &lt;a class=&quot;moz-txt-link-freetext&quot; href=&quot;http://rapid-i.com/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://rapid-i.com/&lt;/a&gt;
&lt;/pre&gt;
&lt;/body&gt;
&lt;/html&gt;
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/ROC-for-cross-validation-tp19892729p19895446.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19893950</id>
	<title>Re: ROC for cross validation</title>
	<published>2008-10-09T00:38:58Z</published>
	<updated>2008-10-09T00:38:58Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I've looked at the ThresholdCurve code and I wanted to confirm that when k
&lt;br&gt;&amp;gt; fold cross validation is performed, the curve that is produced is simply a
&lt;br&gt;&amp;gt; ROC curve made from all the predictions from the k folds.
&lt;br&gt;&lt;br&gt;Yes, correct.
&lt;br&gt;&lt;br&gt;&amp;gt; So it's not a very sophisticated way to make a ROC for cross validations -
&lt;br&gt;&amp;gt; there is no averaging being done as is the case for other methods like
&lt;br&gt;&amp;gt; Vertical Averaging or Threshold Averaging for instance.
&lt;br&gt;&lt;br&gt;There's no need for averaging with just 1 run of CV.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/ROC-for-cross-validation-tp19892729p19893950.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19892729</id>
	<title>ROC for cross validation</title>
	<published>2008-10-08T22:41:37Z</published>
	<updated>2008-10-08T22:41:37Z</updated>
	<author>
		<name>Shirley Hui</name>
	</author>
	<content type="html">Hello,
&lt;br&gt;I've looked at the ThresholdCurve code and I wanted to confirm that
&lt;br&gt;when k-fold cross validation is performed, the curve that is produced
&lt;br&gt;is simply a ROC curve made from all the predictions from the k folds.
&lt;br&gt;So it's not a very sophisticated way to make a ROC curve for cross
&lt;br&gt;validation: no averaging is done, unlike in
&lt;br&gt;other methods such as Vertical Averaging or Threshold Averaging.
&lt;br&gt;thanks,
&lt;br&gt;shirley
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/ROC-for-cross-validation-tp19892729p19892729.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19890996</id>
	<title>Weka moves from CVS to Subversion</title>
	<published>2008-10-08T19:05:51Z</published>
	<updated>2008-10-08T19:05:51Z</updated>
	<author>
		<name>Mark Hall-9</name>
	</author>
	<content type="html">Hi folks,
&lt;br&gt;&lt;br&gt;Weka's CVS repository has been migrated to Subversion. You can access
&lt;br&gt;it from:
&lt;br&gt;&lt;br&gt;&lt;a href=&quot;https://svn.scms.waikato.ac.nz/svn/weka/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://svn.scms.waikato.ac.nz/svn/weka/&lt;/a&gt;&lt;br&gt;&lt;br&gt;At present the &amp;quot;trunk&amp;quot; directory has anonymous read access. We are
&lt;br&gt;just waiting on our technical support guys to add anonymous read
&lt;br&gt;access to &amp;quot;branches&amp;quot; and &amp;quot;tags&amp;quot; as well.
&lt;br&gt;&lt;br&gt;Cheers,
&lt;br&gt;Mark.
&lt;br&gt;&lt;br&gt;--
&lt;br&gt;Mark Hall
&lt;br&gt;Senior Developer/Consultant, Pentaho Open Source Business Intelligence
&lt;br&gt;Citadel International, Suite 340, 5950 Hazeltine National Dr.,
&lt;br&gt;Orlando, FL 32822, USA
&lt;br&gt;+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
&lt;br&gt;Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
&lt;br&gt;Download the latest release today &amp;lt;&lt;a href=&quot;http://www.sourceforge.net/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.sourceforge.net/&lt;/a&gt;&amp;nbsp;
&lt;br&gt;projects/pentaho&amp;gt;
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br /&gt;_______________________________________________
&lt;br&gt;Wekalist mailing list
&lt;br&gt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19890996&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;Wekalist@...&lt;/a&gt;
&lt;br&gt;&lt;a href=&quot;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&lt;/a&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Weka-moves-from-CVS-to-Subversion-tp19890996p19890996.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19888392</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T14:56:37Z</published>
	<updated>2008-10-08T14:56:37Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&lt;div class='shrinkable-quote'&gt;&amp;gt;&amp;gt; Well, I call it &amp;quot;unreliable&amp;quot; since you ask for &amp;quot;1&amp;quot; neighbor, but you
&lt;br&gt;&amp;gt;&amp;gt; can end up with more than that (in my tests, I quite often got 2
&lt;br&gt;&amp;gt;&amp;gt; neighbors back). IMHO it's misleading having the method called
&lt;br&gt;&amp;gt;&amp;gt; &amp;quot;kNearestNeighbours&amp;quot;; it should be called &amp;quot;kOrMoreNearestNeighbors&amp;quot;.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; How would you break ties then? If there are more than one nearest neighbour
&lt;br&gt;&amp;gt; with the same distance, a more reliable model will be generated by using all
&lt;br&gt;&amp;gt; of them. IBk has always done this - when you set k to 1 and there are ties,
&lt;br&gt;&amp;gt; it returns the majority class (or, more correctly, generates a distribution
&lt;br&gt;&amp;gt; from the votes for each class) over these ties as the prediction.
&lt;/div&gt;&lt;br&gt;Good point.
&lt;br&gt;&lt;br&gt;But what I'm saying is that the method name is misleading (as is the
&lt;br&gt;Javadoc documentation); it violates design by contract. Other schemes
&lt;br&gt;expect to get exactly k neighbors back and therefore don't check whether
&lt;br&gt;they get back more and need to break ties themselves. One would have
&lt;br&gt;to check that the current implementation doesn't break any schemes
&lt;br&gt;that don't check the actual number of returned instances.
&lt;br&gt;&lt;br&gt;At the very least there should be an option that lets the user
&lt;br&gt;truncate the returned instances to the requested number, for
&lt;br&gt;instance a &amp;quot;tieBreaking&amp;quot; property (command-line option &amp;quot;-T &amp;lt;type&amp;gt;&amp;quot;):
&lt;br&gt;- none, include ties in result
&lt;br&gt;- first come, first served
&lt;br&gt;- pick majority class
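&lt;br&gt;&lt;br&gt;As a rough sketch of how the first two options differ (this is not the LinearNNSearch code; the class and method names are made up for illustration, and distances are plain doubles compared for exact equality):
&lt;pre&gt;
```java
import java.util.Arrays;

/** Illustrative only: not Weka's LinearNNSearch. All names are hypothetical. */
final class TieBreakingSketch {

    /** Neighbours returned when ties on the k-th distance are kept (option "none"). */
    static int countWithTies(double[] distances, int k) {
        double[] sorted = distances.clone();
        Arrays.sort(sorted);
        int count = Math.min(k, sorted.length);
        if (count == 0) {
            return 0;
        }
        double kth = sorted[count - 1];
        // Extend past k while further instances share the k-th distance.
        while (count != sorted.length) {
            if (sorted[count] != kth) {
                break;
            }
            count++;
        }
        return count;
    }

    /** Neighbours returned when the result is cut off at k ("first come, first served"). */
    static int countTruncated(double[] distances, int k) {
        return Math.min(k, distances.length);
    }

    public static void main(String[] args) {
        // Two instances share the smallest distance, and k is 1.
        double[] d = {0.0, 0.0, 1.5};
        System.out.println(countWithTies(d, 1));
        System.out.println(countTruncated(d, 1));
    }
}
```
&lt;/pre&gt;
With k=1 and two instances at the same smallest distance, the first method reports 2 neighbours and the second reports 1. The third option, picking the majority class, would need the class labels as well and is not sketched here.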
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19888392.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19888146</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T14:39:46Z</published>
	<updated>2008-10-08T14:39:46Z</updated>
	<author>
		<name>Mark Hall-9</name>
	</author>
	<content type="html">&lt;br&gt;On 9/10/2008, at 10:07 AM, Peter Reutemann wrote:
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; The LinearNNSearch algorithm (package weka.core.neighboursearch - Weka
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; 3.5.8) is a bit unreliable in case you have ties in distances. As far
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; as I know, if the algorithm encounters ties on the last neighbor to be
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; returned, it will return all of them. I.e., if you want 10 neighbors
&lt;br&gt;&amp;gt;&amp;gt;
&lt;br&gt;&amp;gt;&amp;gt; This is intentional and not &amp;quot;unreliable&amp;quot;. In fact, it returns ties at any
&lt;br&gt;&amp;gt;&amp;gt; instance in the top k, not only the last one. So, if you ask for five
&lt;br&gt;&amp;gt;&amp;gt; nearest neighbours and the closest neighbour has two ties, then three of the
&lt;br&gt;&amp;gt;&amp;gt; top five slots are consumed.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Well, I call it &amp;quot;unreliable&amp;quot; since you ask for &amp;quot;1&amp;quot; neighbor, but you
&lt;br&gt;&amp;gt; can end up with more than that (in my tests, I quite often got 2
&lt;br&gt;&amp;gt; neighbors back). IMHO it's misleading having the method called
&lt;br&gt;&amp;gt; &amp;quot;kNearestNeighbours&amp;quot;; it should be called &amp;quot;kOrMoreNearestNeighbors&amp;quot;.
&lt;/div&gt;&lt;/div&gt;How would you break ties then? If there is more than one nearest
&lt;br&gt;neighbour with the same distance, a more reliable model will be
&lt;br&gt;generated by using all of them. IBk has always done this - when you
&lt;br&gt;set k to 1 and there are ties, it returns the majority class (or,
&lt;br&gt;more correctly, generates a distribution from the votes for each
&lt;br&gt;class) over these ties as the prediction.
&lt;br&gt;&lt;br&gt;Cheers,
&lt;br&gt;Mark.
&lt;br&gt;&lt;br&gt;--
&lt;br&gt;Mark Hall
&lt;br&gt;Senior Developer/Consultant, Pentaho Open Source Business Intelligence
&lt;br&gt;Citadel International, Suite 340, 5950 Hazeltine National Dr.,
&lt;br&gt;Orlando, FL 32822, USA
&lt;br&gt;+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
&lt;br&gt;Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
&lt;br&gt;Download the latest release today &amp;lt;&lt;a href=&quot;http://www.sourceforge.net/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.sourceforge.net/&lt;/a&gt;&amp;nbsp;
&lt;br&gt;projects/pentaho&amp;gt;
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br /&gt;_______________________________________________
&lt;br&gt;Wekalist mailing list
&lt;br&gt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19888146&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;Wekalist@...&lt;/a&gt;
&lt;br&gt;&lt;a href=&quot;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&lt;/a&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19888146.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19887615</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T14:07:27Z</published>
	<updated>2008-10-08T14:07:27Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt;&amp;gt; The LinearNNSearch algorithm (package weka.core.neighboursearch - Weka
&lt;br&gt;&amp;gt;&amp;gt; 3.5.8) is a bit unreliable in case you have ties in distances. As far
&lt;br&gt;&amp;gt;&amp;gt; as I know, if the algorithm encounters ties on the last neighbor to be
&lt;br&gt;&amp;gt;&amp;gt; returned, it will return all of them. I.e., if you want 10 neighbors
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; This is intentional and not &amp;quot;unreliable&amp;quot;. In fact, it returns ties at any
&lt;br&gt;&amp;gt; instance in the top k, not only the last one. So, if you ask for five
&lt;br&gt;&amp;gt; nearest neighbours and the closest neighbour has two ties, then three of the
&lt;br&gt;&amp;gt; top five slots are consumed.
&lt;br&gt;&lt;br&gt;Well, I call it &amp;quot;unreliable&amp;quot; since you ask for &amp;quot;1&amp;quot; neighbor, but you
&lt;br&gt;can end up with more than that (in my tests, I quite often got 2
&lt;br&gt;neighbors back). IMHO it's misleading having the method called
&lt;br&gt;&amp;quot;kNearestNeighbours&amp;quot;; it should be called &amp;quot;kOrMoreNearestNeighbors&amp;quot;.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19887615.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19887459</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T13:58:40Z</published>
	<updated>2008-10-08T13:58:40Z</updated>
	<author>
		<name>Mark Hall-9</name>
	</author>
	<content type="html">&lt;br&gt;On 9/10/2008, at 8:38 AM, Peter Reutemann wrote:
&lt;br&gt;&lt;br&gt;&amp;gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The LinearNNSearch algorithm (package weka.core.neighboursearch - Weka
&lt;br&gt;&amp;gt; 3.5.8) is a bit unreliable in case you have ties in distances. As far
&lt;br&gt;&amp;gt; as I know, if the algorithm encounters ties on the last neighbor to be
&lt;br&gt;&amp;gt; returned, it will return all of them. I.e., if you want 10 neighbors
&lt;br&gt;&lt;br&gt;This is intentional and not &amp;quot;unreliable&amp;quot;. In fact, it returns ties at
&lt;br&gt;any instance in the top k, not only the last one. So, if you ask for
&lt;br&gt;five nearest neighbours and the closest neighbour has two ties, then
&lt;br&gt;three of the top five slots are consumed.
&lt;br&gt;&lt;br&gt;Cheers,
&lt;br&gt;Mark
&lt;br&gt;&lt;br&gt;--
&lt;br&gt;Mark Hall
&lt;br&gt;Senior Developer/Consultant, Pentaho Open Source Business Intelligence
&lt;br&gt;Citadel International, Suite 340, 5950 Hazeltine National Dr.,
&lt;br&gt;Orlando, FL 32822, USA
&lt;br&gt;+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
&lt;br&gt;Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
&lt;br&gt;Download the latest release today &amp;lt;&lt;a href=&quot;http://www.sourceforge.net/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.sourceforge.net/&lt;/a&gt;&amp;nbsp;
&lt;br&gt;projects/pentaho&amp;gt;
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br /&gt;_______________________________________________
&lt;br&gt;Wekalist mailing list
&lt;br&gt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19887459&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;Wekalist@...&lt;/a&gt;
&lt;br&gt;&lt;a href=&quot;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&lt;/a&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19887459.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19886023</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T12:38:34Z</published>
	<updated>2008-10-08T12:38:34Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&lt;div class='shrinkable-quote'&gt;&amp;gt; I have still another question about how LWL works.
&lt;br&gt;&amp;gt; Assume I use LWL with naive Bayes.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; If I set k=total number of instances in the training set and if I
&lt;br&gt;&amp;gt; choose the constant kernel, then LWL is actually equivalent to a
&lt;br&gt;&amp;gt; global naive Bayes.
&lt;br&gt;&amp;gt; Ok.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; However, let me assume k=1.
&lt;br&gt;&amp;gt; Now, the LWL should be equivalent to a Naive Bayes built on the
&lt;br&gt;&amp;gt; closest instance.
&lt;br&gt;&amp;gt; This is not the case, however.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Let me consider the first instance of the below data set. If I build a
&lt;br&gt;&amp;gt; Naive Bayes using this only instance as training set,
&lt;br&gt;&amp;gt; the instance itself is classified by naive Bayes as 0.214 &amp;nbsp; 0.214 &amp;nbsp;*0.57.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; However, If I run LWL with naive Bayes and a k=1
&lt;br&gt;&amp;gt; (LWL -U 5 -K 1 -W ...naiveBayes), I get the following classification:
&lt;br&gt;&amp;gt; 0.25 &amp;nbsp; 0.25 &amp;nbsp;*0.5.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Why do the results differ?
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Moreover, working with k=1 I get different results, if I use different
&lt;br&gt;&amp;gt; kernels. I expected instead the results to be insensitive on the
&lt;br&gt;&amp;gt; choice of the kernel for k=1, as in fact there is only one instance
&lt;br&gt;&amp;gt; under consideration, to which all kernels should give weight 1.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; What do I miss?
&lt;/div&gt;&lt;br&gt;The LinearNNSearch algorithm (package weka.core.neighboursearch - Weka
&lt;br&gt;3.5.8) is a bit unreliable in case you have ties in distances. As far
&lt;br&gt;as I know, if the algorithm encounters ties on the last neighbor to be
&lt;br&gt;returned, it will return all of them. I.e., if you want 10 neighbors
&lt;br&gt;and several instances tie on the tenth one, say 3 have the same
&lt;br&gt;distance, then all of these will be returned, in
&lt;br&gt;other words 12. I just ran LWL with NaiveBayes and CONSTANT weighting
&lt;br&gt;(-U 5) and k=1 on the UCI dataset &amp;quot;anneal&amp;quot; and found that sometimes
&lt;br&gt;2 instances instead of 1 were returned and used as the training set for
&lt;br&gt;the base classifier. That might explain your observations.
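&lt;br&gt;&lt;br&gt;As a side check on the figures quoted above, here is a small arithmetic sketch. It assumes add-one (Laplace) smoothing for both the class prior and the attribute counts; that is an assumption about the estimator, not something taken from the NaiveBayes source, and the names are made up for illustration. Trained on the single instance (reduced, none), it gives roughly 3/14, 3/14 and 8/14 for soft, hard and none, which is in line with the 0.214 0.214 0.57 reported:
&lt;pre&gt;
```java
/** Illustrative arithmetic only; assumes add-one (Laplace) smoothing. */
final class SingleInstanceNaiveBayes {

    /**
     * Posterior over {soft, hard, none} for tear-prod-rate=reduced,
     * after training on the single instance (reduced, none).
     */
    static double[] posterior() {
        int numClasses = 3;                   // soft, hard, none
        int numValues = 2;                    // reduced, normal
        double[] classCount = {0, 0, 1};      // one training instance, class none
        double[] reducedCount = {0, 0, 1};    // that instance has value reduced
        double total = 1;

        double[] score = new double[numClasses];
        double sum = 0;
        for (int c = 0; c != numClasses; c++) {
            double prior = (classCount[c] + 1) / (total + numClasses);
            double likelihood = (reducedCount[c] + 1) / (classCount[c] + numValues);
            score[c] = prior * likelihood;
            sum += score[c];
        }
        for (int c = 0; c != numClasses; c++) {
            score[c] /= sum;                  // normalise to a distribution
        }
        return score;
    }

    public static void main(String[] args) {
        double[] p = posterior();
        System.out.printf("%.3f %.3f %.3f%n", p[0], p[1], p[2]);
    }
}
```
&lt;/pre&gt;
Under the same assumption the smoothed class priors on their own are 0.25, 0.25 and 0.5, the same numbers as the LWL output quoted above. The sketch does not establish whether that is the actual cause of the difference.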
&lt;br&gt;&lt;br&gt;I leave that for the Weka maintainer...
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19886023.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19882016</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-08T09:07:05Z</published>
	<updated>2008-10-08T09:07:05Z</updated>
	<author>
		<name>Giorgio Corani-2</name>
	</author>
	<content type="html">I have still another question about how LWL works.
&lt;br&gt;Assume I use LWL with naive Bayes.
&lt;br&gt;&lt;br&gt;If I set k=total number of instances in the training set and if I
&lt;br&gt;choose the constant kernel, then LWL is actually equivalent to a
&lt;br&gt;global naive Bayes.
&lt;br&gt;Ok.
&lt;br&gt;&lt;br&gt;However, let me assume k=1.
&lt;br&gt;Now, the LWL should be equivalent to a Naive Bayes built on the
&lt;br&gt;closest instance.
&lt;br&gt;This is not the case, however.
&lt;br&gt;&lt;br&gt;Let me consider the first instance of the data set below. If I build a
&lt;br&gt;Naive Bayes using only this instance as the training set,
&lt;br&gt;the instance itself is classified by naive Bayes as 0.214 &amp;nbsp; 0.214 &amp;nbsp;*0.57.
&lt;br&gt;&lt;br&gt;However, If I run LWL with naive Bayes and a k=1
&lt;br&gt;(LWL -U 5 -K 1 -W ...naiveBayes), I get the following classification:
&lt;br&gt;0.25 &amp;nbsp; 0.25 &amp;nbsp;*0.5.
&lt;br&gt;&lt;br&gt;Why do the results differ?
&lt;br&gt;&lt;br&gt;Moreover, working with k=1 I get different results if I use different
&lt;br&gt;kernels. I expected the results to be insensitive to the
&lt;br&gt;choice of the kernel for k=1, as there is only one instance
&lt;br&gt;under consideration, to which all kernels should give weight 1.
&lt;br&gt;&lt;br&gt;What am I missing?
&lt;br&gt;&lt;br&gt;Thank you very much
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;@attribute tear-prod-rate {reduced,normal}
&lt;br&gt;@attribute contact-lenses {soft,hard,none}
&lt;br&gt;@data
&lt;br&gt;reduced,none
&lt;br&gt;normal,soft
&lt;br&gt;normal,hard
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19882016.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19875687</id>
	<title>Re: About CV &amp; confusion matrix</title>
	<published>2008-10-08T02:51:16Z</published>
	<updated>2008-10-08T02:51:16Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I have a question about how weka calculates confusion matrix?
&lt;br&gt;&amp;gt; There suppose to be Y models and Y confusion matrixes for Y folds CV.
&lt;br&gt;&amp;gt; However, there are only one confusion matrix printed by weka.
&lt;br&gt;&amp;gt; How does the weka calculated such confusion matrix?
&lt;br&gt;&lt;br&gt;In the course of one cross-validation, the full dataset will be used
&lt;br&gt;for testing (10-fold CV gives you 10 disjoint test sets that make up
&lt;br&gt;the full dataset). Weka collects the information, like distributions
&lt;br&gt;and (mis)classifications, for all the disjoint test sets and then
&lt;br&gt;generates the output. Therefore it's only 1 confusion matrix that gets
&lt;br&gt;generated and not 10.
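&lt;br&gt;&lt;br&gt;The pooling described above can be sketched in plain Java. This is
&lt;br&gt;only an illustration, not Weka's own code: the class name, the pool
&lt;br&gt;method and the per-fold counts are all made up for the example.
&lt;pre&gt;
```java
import java.util.Arrays;

public class PooledConfusionMatrix {

    /** Adds up per-fold confusion matrices (rows = actual, columns = predicted). */
    static int[][] pool(int[][][] folds, int numClasses) {
        int[][] total = new int[numClasses][numClasses];
        for (int[][] fold : folds) {
            for (int actual = 0; actual != numClasses; actual++) {
                for (int predicted = 0; predicted != numClasses; predicted++) {
                    total[actual][predicted] += fold[actual][predicted];
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical counts for a 3-fold CV on a two-class problem.
        int[][][] folds = {
            {{4, 1}, {0, 5}},
            {{3, 2}, {1, 4}},
            {{5, 0}, {2, 3}}
        };
        // Every instance is tested exactly once, so the pooled matrix
        // covers the full dataset (30 instances here).
        System.out.println(Arrays.deepToString(pool(folds, 2)));
    }
}
```
&lt;/pre&gt;
&lt;br&gt;Because the test sets are disjoint, each instance contributes to exactly
&lt;br&gt;one cell of the pooled matrix.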
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/About-CV---confusion-matrix-tp19873415p19875687.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19873415</id>
	<title>About CV &amp; confusion matrix</title>
	<published>2008-10-08T00:04:27Z</published>
	<updated>2008-10-08T00:04:27Z</updated>
	<author>
		<name>Jerry Chen-3</name>
	</author>
	<content type="html">&lt;div dir=&quot;ltr&quot;&gt;Dear sir:&lt;br&gt;&lt;br&gt;I have a question about how Weka calculates the confusion matrix.&lt;br&gt;There are supposed to be Y models and Y confusion matrices for Y-fold CV.&lt;br&gt;However, only one confusion matrix is printed by Weka.&lt;br&gt;
How does Weka calculate this confusion matrix?&lt;br&gt;Thanks&lt;br&gt;&lt;/div&gt;
</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/About-CV---confusion-matrix-tp19873415p19873415.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19869287</id>
	<title>Re: total loss precision M5P regression coefficients problem</title>
	<published>2008-10-07T16:10:24Z</published>
	<updated>2008-10-07T16:10:24Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&lt;div class='shrinkable-quote'&gt;&amp;gt;&amp;gt;&amp;gt; To recap, the M5P classifier output displays the leaf regression models with
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; &amp;nbsp;a fixed float format such that any coefficient with an absolute value &amp;lt;
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; .00005 will display as zero -- a total loss of precision error.
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt;
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; It's been discussed that this is not an internal representation problem,
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; merely an output formatting string limitation. (Is it the case that Java
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt; lacks a smart formatting routine for floats analogous, say, to %g in C?)
&lt;br&gt;&amp;gt;&amp;gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Java has a similar facility, e.g. &amp;nbsp;System.out.format(&amp;quot;x = %g&amp;quot;, x);
&lt;br&gt;&amp;gt; IMHO it would be great if &amp;quot;someone&amp;quot; could go through Weka
&lt;br&gt;&amp;gt; and consistently replace printing of doubles with the above,
&lt;br&gt;&amp;gt; but I do understand that this is a lot of tedious and boring work.
&lt;/div&gt;&lt;br&gt;Mostly it would mean replacing the calls to the Utils.doubleToString
&lt;br&gt;methods. There are about 500 occurrences in the current developer version
&lt;br&gt;of Weka. Still, with an IDE like Eclipse it is fairly easy to locate
&lt;br&gt;them, ascertain whether each one should be replaced (one could
&lt;br&gt;limit it to the toString() methods of classifiers, etc.), and replace
&lt;br&gt;it if necessary.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/total-loss-precision-M5P-regression-coefficients-problem-tp19855758p19869287.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19869110</id>
	<title>Re: total loss precision M5P regression coefficients problem</title>
	<published>2008-10-07T15:57:57Z</published>
	<updated>2008-10-07T15:57:57Z</updated>
	<author>
		<name>Bernhard Pfahringer-2</name>
	</author>
	<content type="html">Hi
&lt;br&gt;&lt;br&gt;&amp;gt;&amp;gt; To recap, the M5P classifier output displays the leaf regression models with
&lt;br&gt;&amp;gt;&amp;gt; &amp;nbsp;a fixed float format such that any coefficient with an absolute value &amp;lt;
&lt;br&gt;&amp;gt;&amp;gt; .00005 will display as zero -- a total loss of precision error.
&lt;br&gt;&amp;gt;&amp;gt;
&lt;br&gt;&amp;gt;&amp;gt; It's been discussed that this is not an internal representation problem,
&lt;br&gt;&amp;gt;&amp;gt; merely an output formatting string limitation. (Is it the case that Java
&lt;br&gt;&amp;gt;&amp;gt; lacks a smart formatting routine for floats analogous, say, to %g in C?)
&lt;br&gt;&amp;gt;&amp;gt;
&lt;br&gt;&lt;br&gt;Java has a similar facility, e.g. &amp;nbsp;System.out.format(&amp;quot;x = %g&amp;quot;, x);
&lt;br&gt;IMHO it would be great if &amp;quot;someone&amp;quot; could go through Weka
&lt;br&gt;and consistently replace printing of doubles with the above,
&lt;br&gt;but I do understand that this is a lot of tedious and boring work.
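&lt;br&gt;&lt;br&gt;A small sketch of the difference, assuming a four-decimal fixed format
&lt;br&gt;stands in for the current output (the class name, helper methods and the
&lt;br&gt;coefficient value are hypothetical, chosen only for illustration):
&lt;pre&gt;
```java
import java.util.Locale;

public class CoefficientFormat {

    /** Fixed-point with four decimals: small values collapse to zero. */
    static String fixed(double x) {
        return String.format(Locale.ROOT, "%.4f", x);
    }

    /** General format: switches to scientific notation for small magnitudes. */
    static String general(double x) {
        return String.format(Locale.ROOT, "%g", x);
    }

    public static void main(String[] args) {
        double coefficient = 0.00003; // hypothetical leaf-model coefficient
        System.out.println("fixed   = " + fixed(coefficient));
        System.out.println("general = " + general(coefficient));
    }
}
```
&lt;/pre&gt;
&lt;br&gt;The fixed format prints 0.0000 for this value, while %g keeps the
&lt;br&gt;significant digits by falling back to scientific notation.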
&lt;br&gt;&lt;br&gt;Bernhard
&lt;br&gt;&lt;br&gt;---------------------------------------------------------------------
&lt;br&gt;Bernhard Pfahringer, Dept. of Computer Science, University of Waikato
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~bernhard&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~bernhard&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; +64 7 838 4041
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/total-loss-precision-M5P-regression-coefficients-problem-tp19855758p19869110.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19867675</id>
	<title>Re: Replace missing values using the EM algorithm</title>
	<published>2008-10-07T14:27:05Z</published>
	<updated>2008-10-07T14:27:05Z</updated>
	<author>
		<name>Mark Hall-9</name>
	</author>
	<content type="html">A PhD student (and Google Summer of Code contributor) is currently &amp;nbsp;
&lt;br&gt;working on a filter to perform Bayesian multiple imputation for &amp;nbsp;
&lt;br&gt;missing values (as described in Schafer, J. L. Analysis of Incomplete &amp;nbsp;
&lt;br&gt;Multivariate Data, New York: Chapman and Hall, 1997). Hopefully he &amp;nbsp;
&lt;br&gt;will be contributing this in the near future.
&lt;br&gt;&lt;br&gt;Cheers,
&lt;br&gt;Mark.
&lt;br&gt;&lt;br&gt;On 2/10/2008, at 9:42 PM, ali Aich wrote:
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Hi,
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I am currently working with the Bayes net algorithms. To replace missing &amp;nbsp;
&lt;br&gt;&amp;gt; values, I use the &amp;quot;MissingValueFilter&amp;quot; of Weka. I would like to know how &amp;nbsp;
&lt;br&gt;&amp;gt; the EM algorithm could be used to replace missing values.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Thanks for your responses.
&lt;br&gt;&amp;gt; Cordially
&lt;/div&gt;&lt;br&gt;--
&lt;br&gt;Mark Hall
&lt;br&gt;Senior Developer/Consultant, Pentaho Open Source Business Intelligence
&lt;br&gt;Citadel International, Suite 340, 5950 Hazeltine National Dr.,
&lt;br&gt;Orlando, FL 32822, USA
&lt;br&gt;+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
&lt;br&gt;Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
&lt;br&gt;Download the latest release today &amp;lt;&lt;a href=&quot;http://www.sourceforge.net/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.sourceforge.net/&lt;/a&gt;&amp;nbsp;
&lt;br&gt;projects/pentaho&amp;gt;
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/Replace-missing-values-using-the-EM-algorithm-tp19775473p19867675.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19866725</id>
	<title>Re: SMO Probabilistic</title>
	<published>2008-10-07T13:37:42Z</published>
	<updated>2008-10-07T13:37:42Z</updated>
	<author>
		<name>Mark Hall-9</name>
	</author>
	<content type="html">Turning on probabilistic prediction for SMO results in a standard SMO &amp;nbsp;
&lt;br&gt;model being built and then a logistic regression fitted to the output &amp;nbsp;
&lt;br&gt;- i.e. a new data set is created that contains the predicted &amp;nbsp;
&lt;br&gt;responses of SMO along with the original class values. A logistic &amp;nbsp;
&lt;br&gt;regression is then fitted to this derived data set in order to &amp;nbsp;
&lt;br&gt;produce probability estimates.
&lt;br&gt;&lt;br&gt;You can see the probabilities produced by turning on the option to &amp;nbsp;
&lt;br&gt;output probabilities in the Explorer or by using the -p option on the &amp;nbsp;
&lt;br&gt;command line.
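&lt;br&gt;&lt;br&gt;As a rough sketch of how the quoted numbers relate, assuming the usual
&lt;br&gt;logistic form applied to the SMO output (the class and method names are
&lt;br&gt;made up, and which class the probability refers to is an assumption):
&lt;pre&gt;
```java
public class SmoLogisticSketch {

    /** Standard logistic function applied to a linear score. */
    static double probability(double pred, double coefficient, double intercept) {
        return 1.0 / (1.0 + Math.exp(-(intercept + coefficient * pred)));
    }

    /** Odds ratio for a one-unit increase in the input: exp(coefficient). */
    static double oddsRatio(double coefficient) {
        return Math.exp(coefficient);
    }

    public static void main(String[] args) {
        double coefficient = -2.8806; // "pred" row in the quoted output
        double intercept = -1.4197;   // "Intercept" row in the quoted output
        System.out.printf("odds ratio   = %.4f%n", oddsRatio(coefficient));
        System.out.printf("p(pred = -1) = %.4f%n", probability(-1.0, coefficient, intercept));
        System.out.printf("p(pred = +1) = %.4f%n", probability(1.0, coefficient, intercept));
    }
}
```
&lt;/pre&gt;
&lt;br&gt;The odds ratio in the quoted output is simply the exponential of the
&lt;br&gt;coefficient: exp(-2.8806) is approximately 0.0561.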
&lt;br&gt;&lt;br&gt;Cheers,
&lt;br&gt;Mark.
&lt;br&gt;&lt;br&gt;On 8/10/2008, at 1:44 AM, Thomas Oommen wrote:
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; I'm doing a two class SMO classification using the polynomial &amp;nbsp;
&lt;br&gt;&amp;gt; Kernel. I use
&lt;br&gt;&amp;gt; 5-fold cross validation.
&lt;br&gt;&amp;gt; When I set probabilistic prediction to TRUE, I get the following
&lt;br&gt;&amp;gt; output in addition to the probabilities and accuracy.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Logistic Regression with ridge parameter of 1.0E-8
&lt;br&gt;&amp;gt; Coefficients...
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Class
&lt;br&gt;&amp;gt; Variable &amp;nbsp; &amp;nbsp; &amp;nbsp; Yes
&lt;br&gt;&amp;gt; ====================
&lt;br&gt;&amp;gt; pred &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -2.8806
&lt;br&gt;&amp;gt; Intercept &amp;nbsp; &amp;nbsp;-1.4197
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Odds Ratios...
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Class
&lt;br&gt;&amp;gt; Variable &amp;nbsp; &amp;nbsp; &amp;nbsp; Yes
&lt;br&gt;&amp;gt; ====================
&lt;br&gt;&amp;gt; pred &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0.0561
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Can anyone explain what this means?
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Thanks in advance.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Thomas Oommen
&lt;br&gt;&amp;gt; Tufts University
&lt;br&gt;&amp;gt; Medford, MA-02155
&lt;/div&gt;&lt;/div&gt;--
&lt;br&gt;Mark Hall
&lt;br&gt;Senior Developer/Consultant, Pentaho Open Source Business Intelligence
&lt;br&gt;Citadel International, Suite 340, 5950 Hazeltine National Dr.,
&lt;br&gt;Orlando, FL 32822, USA
&lt;br&gt;+64 7 847-3537 office, +64 21 399-132 mobile, +1 815 550-8637 fax,
&lt;br&gt;Skype: mark.andrew.hall, Yahoo: mark_andrew_hall
&lt;br&gt;Download the latest release today &amp;lt;&lt;a href=&quot;http://www.sourceforge.net/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.sourceforge.net/&lt;/a&gt;&amp;nbsp;
&lt;br&gt;projects/pentaho&amp;gt;
&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;br /&gt;_______________________________________________
&lt;br&gt;Wekalist mailing list
&lt;br&gt;&lt;a href=&quot;http://www.nabble.com/user/SendEmail.jtp?type=post&amp;post=19866725&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;Wekalist@...&lt;/a&gt;
&lt;br&gt;&lt;a href=&quot;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist&lt;/a&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/SMO-Probabilistic-tp19842915p19866725.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19866707</id>
	<title>Re: total loss precision M5P regression coefficients problem</title>
	<published>2008-10-07T13:37:08Z</published>
	<updated>2008-10-07T13:37:08Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&lt;div class='shrinkable-quote'&gt;&amp;gt; To recap, the M5P classifier output displays the leaf regression models with
&lt;br&gt;&amp;gt; &amp;nbsp;a fixed float format such that any coefficient with an absolute value &amp;lt;
&lt;br&gt;&amp;gt; .00005 will display as zero -- a total loss of precision error.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; It's been discussed that this is not an internal representation problem,
&lt;br&gt;&amp;gt; merely an output formatting string limitation. (Is it the case that Java
&lt;br&gt;&amp;gt; lacks a smart formatting routine for floats analogous, say, to %g in C?)
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Anyway, I've come up against this problem, and it's causing me enough grief
&lt;br&gt;&amp;gt; to submit this question. The obvious work around is to normalise the input
&lt;br&gt;&amp;gt; values in the data sets so that all the input values are in ranges that will
&lt;br&gt;&amp;gt; increase the coefficients to visible levels, but this introduces a whole
&lt;br&gt;&amp;gt; range of additional problems with respect to managing multiple versions of
&lt;br&gt;&amp;gt; data sets, scope for errors and inconsistency, etc. In &amp;nbsp;short, it would be
&lt;br&gt;&amp;gt; far preferable to be able to avoid messing around with all that.
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; So, two questions: a) Is there a patch or version of Weka (I'm using 3.4.13)
&lt;br&gt;&amp;gt; that has a smarter print formatter for floats?
&lt;/div&gt;&lt;br&gt;Not that I know of. Most people don't seem to have a problem with that.
&lt;br&gt;&lt;br&gt;&amp;gt; b) If not, is there a good
&lt;br&gt;&amp;gt; technical reason why this can't be/should not be fixed once and for all?
&lt;br&gt;&lt;br&gt;No, that is just the way it was implemented (probably to avoid wasting too much
&lt;br&gt;space). You can always change the output yourself. Just modify method
&lt;br&gt;&amp;quot;treeToString(int)&amp;quot; of class weka.classifiers.trees.m5p.RuleNode.
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/total-loss-precision-M5P-regression-coefficients-problem-tp19855758p19866707.html" />
</entry>

<entry>
	<id>tag:www.nabble.com,2006:post-19866645</id>
	<title>Re: how to inspect the weights of LWL</title>
	<published>2008-10-07T13:33:41Z</published>
	<updated>2008-10-07T13:33:41Z</updated>
	<author>
		<name>Peter Reutemann</name>
	</author>
	<content type="html">&amp;gt; I would like to inspect the weights assigned by LWL to the different
&lt;br&gt;&amp;gt; instances of the training set.
&lt;br&gt;&amp;gt; (Of course, the weights will change every time a new classification is
&lt;br&gt;&amp;gt; performed, so ).
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I have &amp;nbsp;turned on the debug option in LWL, but this did not change the
&lt;br&gt;&amp;gt; output I get.
&lt;br&gt;&amp;gt; Is there &amp;nbsp;a way to inspect the weights assigned to the different
&lt;br&gt;&amp;gt; instances of the training set?
&lt;br&gt;&lt;br&gt;LWL is a lazy learning scheme, i.e., the actual work is done at
&lt;br&gt;prediction time. Each time you call distributionForInstance or
&lt;br&gt;classifyInstance to make a prediction, a new dataset (based on the
&lt;br&gt;training set) gets generated with the new weights assigned. Check out
&lt;br&gt;variable &amp;quot;neighbours&amp;quot; in the distributionForInstance method, which is
&lt;br&gt;used to train the base classifier. If you want to output some
&lt;br&gt;information, you will have to hack the code accordingly.
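&lt;br&gt;&lt;br&gt;To give a feel for what such per-prediction weights look like, here is
&lt;br&gt;a generic sketch of distance-based kernel weighting. It is not the LWL
&lt;br&gt;source: the class, the two kernels and the distances are assumptions made
&lt;br&gt;up for illustration, and the bandwidth is simply taken as the distance to
&lt;br&gt;the k-th nearest neighbour.
&lt;pre&gt;
```java
import java.util.Arrays;

public class KernelWeightSketch {

    /** Distance of the k-th nearest neighbour, used here as the bandwidth. */
    static double bandwidth(double[] distances, int k) {
        double[] sorted = distances.clone();
        Arrays.sort(sorted);
        return sorted[k - 1];
    }

    /** Linear kernel on the scaled distance, clipped at zero. */
    static double linear(double scaled) {
        return Math.max(0.0, 1.0 - scaled);
    }

    /** Inverse-distance kernel on the scaled distance. */
    static double inverse(double scaled) {
        return 1.0 / (1.0 + scaled);
    }

    public static void main(String[] args) {
        double[] distances = {0.2, 0.5, 0.9}; // hypothetical distances to a query point
        double h = bandwidth(distances, 2);   // k = 2, so h = 0.5
        for (double d : distances) {
            double scaled = d / h;
            System.out.println(d + " linear=" + linear(scaled) + " inverse=" + inverse(scaled));
        }
    }
}
```
&lt;/pre&gt;
&lt;br&gt;In this sketch the weights depend on the query point, so they have to be
&lt;br&gt;recomputed and printed inside the prediction call itself.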
&lt;br&gt;&lt;br&gt;Cheers, Peter
&lt;br&gt;-- 
&lt;br&gt;Peter Reutemann, Dept. of Computer Science, University of Waikato, NZ
&lt;br&gt;&lt;a href=&quot;http://www.cs.waikato.ac.nz/~fracpete/&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.cs.waikato.ac.nz/~fracpete/&lt;/a&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Ph. +64 (7) 858-5174
&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://www.nabble.com/how-to-inspect-the-weights-of-LWL-tp19854915p19866645.html" />
</entry>

</feed>
