Information-theoretic functions

Hello all,

For my own private work with Octave I have prepared a set of
information-theoretic functions which I thought I would offer to the
community.

From browsing the archives I recognise that another user recently
contributed a similar set of functions, but the .tar.gz attachment was
not available, so I could not compare.  I would very much like to see
those if possible.

My own functions include one which I think was not in the earlier
bundle, to calculate the information gain ratio or uncertainty coefficient.

One problem, which I'm not sure how to get round, is in the main
information entropy function: it requires vectors at input, but at
present only works with row or column vectors, i.e. not with vectors
where the active dimension is > 2.

The functions are:

infoentr(x,y)
    # if one input, calculates info entropy of sequence x.
    # if two inputs, calculates joint entropy of sequences x and y.

condentr(x,y)
    # calculates entropy of x conditional on y

mutualinfo(x,y)
    # calculates the mutual information of two sequences x and y.
    # note that this is symmetric in its inputs. :-)

infogain(x,y)
    # calculates the information gain ratio of x conditional on y.

I hope these are relevant and would welcome comments on what if anything
needs to be done to bring them up to scratch as serious Octave
functions.  I suspect the existing contributed bundle is far superior,
but I thought I'd give people the opportunity to review these things.
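On the row/column restriction described above, here is a minimal sketch of one possible workaround. It assumes the goal is to accept any array with at most one non-singleton dimension (for example a 1x1xN array) and to hand the rest of the code a row vector. The helper name to_row is hypothetical and is not part of the posted functions; it uses only the built-in size() and the x(:) colon-indexing idiom.

```octave
function v = to_row(x)
  # Hypothetical helper (not part of the posted bundle):
  # accept an array with at most one non-singleton dimension
  # and return its elements as a row vector.
  if(sum(size(x) > 1) > 1)
    error("Argument is not a vector.");
  endif
  # x(:) stacks all elements into a column; .' transposes it
  # without conjugating, so complex input is left unchanged.
  v = x(:).';
endfunction
```

Under this assumption, a call such as x = to_row(x) could stand in for the columns()/rows() test on each argument. The check that x and y agree in dimension would then need to compare the full size vectors, for instance with isequal(size(x), size(y)), rather than rows and columns only.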
Best wishes,

    -- Joe


function H = infoentr(x,y)
# If just one input, calculates Shannon Information Entropy
# of the sequence x:
#     H(X) = \sum_{x \in X} p(x) log2(1/p(x))
#
# If two inputs, calculates joint entropy of the concurrent
# sequences x and y:
#     H(X,Y) = \sum_{x \in X, y \in Y} p(x,y) log2(1/p(x,y))

  if(nargin<1 || nargin>2)
    usage("infoentr(x,y)")
  endif

  if(nargin==2)
    if((rows(x)~=rows(y)) || (columns(x)~=columns(y)))
      error("Arguments do not have same dimension.")
    endif
  endif

  # We check that first argument is a vector, and
  # if necessary convert to row vector.
  if(columns(x)==1)
    x = x';
  elseif(rows(x)~=1)
    error("First argument is not a vector.");
  endif

  if(nargin==1)
    X = create_set(x);
    Nx = length(X);

    # Calculate probability Pr(x)
    for i=1:Nx
      Pr(i) = sum(x==X(i));
    endfor
    if(sum(Pr) ~= length(x))
      fprintf(stdout,"Sum is wrong.\n");
    endif
    Pr = Pr/length(x);

    # Calculate Shannon information content h(x) = log2(1/Pr(x))
    h = log2(1 ./ Pr);
    h(find(h==Inf)) = 0;
    H = sum(Pr .* h);
  else
    # Ensure that the second argument is a vector, and
    # if necessary convert to row vector.  Actually
    # this is probably taken care of by the check on
    # dimension agreement and the check on x above. :-)
    if(columns(y)==1)
      y = y';
    elseif(rows(y)~=1)
      error("Second argument is not a vector.");
    endif

    X = create_set(x);
    Y = create_set(y);
    Nx = length(X);
    Ny = length(Y);

    # Calculate joint probability Pr(x,y)
    for i=1:Nx
      for j=1:Ny
        Pr(i,j) = (x==X(i))*(y==Y(j))';
      endfor
    endfor
    if sum(sum(Pr)) ~= length(x)
      fprintf(stdout,"Sum is wrong.\n");
    endif
    Pr = Pr/length(x);

    # Calculate Shannon information content h(x,y) = log2(1/Pr(x,y))
    h = log2(1 ./ Pr);
    h(find(h==Inf)) = 0;
    H = sum(sum(Pr .* h));
  endif


function Hcond = condentr(x,y)
# Calculates information entropy of the sequence x
# conditional on the sequence y:
#     H(X|Y) = H(X,Y) - H(Y)

  if nargin!=2
    usage("condentr(x,y)")
  endif

  Hcond = infoentr(x,y) - infoentr(y);


function I = mutualinfo(x,y)
# Calculates mutual information of the sequences x and y:
#     I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = I(Y;X)

  if nargin!=2
    usage("mutualinfo(x,y)")
  endif

  I = infoentr(x) - condentr(x,y);


function IGR = infogain(x,y)
# Gives the information gain ratio (also known as the
# `uncertainty coefficient') of the sequence x
# conditional on y:
#     I(X|Y) = I(X;Y)/H(X)

  if nargin!=2
    usage("infogain(x,y)")
  endif

  IGR = mutualinfo(x,y)/infoentr(x);

  # Could also do
  #   IGR = 1 - condentr(x,y)/infoentr(x);

_______________________________________________
Octave-sources mailing list
Octave-sources@...
https://www.cae.wisc.edu/mailman/listinfo/octave-sources
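As a hand check of the identities used in the comments above, here is a small worked example. The sequences are chosen purely for illustration and do not come from the original post: x = [0 0 1 1] and y = [0 1 1 1]. The pair (0,0) occurs once, (0,1) once and (1,1) twice, so the joint probabilities are 1/4, 1/4 and 1/2. All values are in bits.

```latex
\begin{align*}
H(X)   &= \tfrac12\log_2 2 + \tfrac12\log_2 2 = 1 \\
H(Y)   &= \tfrac14\log_2 4 + \tfrac34\log_2\tfrac43 \approx 0.8113 \\
H(X,Y) &= \tfrac14\log_2 4 + \tfrac14\log_2 4 + \tfrac12\log_2 2 = 1.5 \\
H(X|Y) &= H(X,Y) - H(Y) \approx 0.6887 \\
I(X;Y) &= H(X) - H(X|Y) \approx 0.3113 \\
I(Y;X) &= H(Y) - H(Y|X) = H(Y) - \bigl(H(X,Y) - H(X)\bigr) \approx 0.3113 \\
\mathrm{IGR} &= I(X;Y)/H(X) \approx 0.3113
\end{align*}
```

The two routes to the mutual information agree, which is the symmetry noted in the mutualinfo comment. Because H(X) = 1 here, the gain ratio happens to equal the mutual information.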
Re: Information-theoretic functions

Joseph Wakeling wrote:
> Hello all,
>
> For my own private work with Octave I have prepared a set of
> information-theoretic functions which I thought I would offer to the
> community.
>
> From browsing the archives I recognise that another user recently
> contributed a similar set of functions, but the .tar.gz attachment was
> not available, so I could not compare.  I would very much like to see
> those if possible.
>
> My own functions include one which I think was not in the earlier
> bundle, to calculate the information gain ratio or uncertainty coefficient.
>
> One problem, which I'm not sure how to get round, is in the main
> information entropy function: it requires vectors at input, but at
> present only works with row or column vectors, i.e. not with vectors
> where the active dimension is > 2.
>

theory functions in octave-forge, if any, and get these committed to
octave-forge. Perhaps Muthu can do the commit for you..

D.

--
David Bateman                                David.Bateman@...
Motorola Labs - Paris                        +33 1 69 35 48 04 (Ph)
Parc Les Algorithmes, Commune de St Aubin    +33 6 72 01 06 33 (Mob)
91193 Gif-Sur-Yvette FRANCE                  +33 1 69 35 77 01 (Fax)