Created
April 10, 2012 01:35
-
-
Save shimazaki/2347820 to your computer and use it in GitHub Desktop.
Function `sshist' returns optimal number of bins in a histogram used for density estimation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| function [optN, optD, edges, C, N] = sshist(x,N) | |
| % [optN, edges, C, N] = sshist(x,N) | |
| % | |
| % Function `sshist' returns the optimal number of bins in a histogram | |
| % used for density estimation. | |
| % Optimization principle is to minimize expected L2 loss function between | |
| % the histogram and an unknown underlying density function. | |
| % An assumption made is merely that samples are drawn from the density | |
| % independently each other. | |
| % | |
| % The optimal binwidth D* is obtained as a minimizer of the formula, | |
| % (2K-V) / D^2, | |
| % where K and V are mean and variance of sample counts across bins with width D. | |
| % Optimal number of bins is given as (max(x) - min(x)) / D*. | |
| % | |
| % For more information, visit | |
| % http://2000.jukuin.keio.ac.jp/shimazaki/res/histogram.html | |
| % | |
| % Original paper: | |
| % Hideaki Shimazaki and Shigeru Shinomoto | |
| % A method for selecting the bin size of a time histogram | |
| % Neural Computation 19(6), 1503-1527, 2007 | |
| % http://dx.doi.org/10.1162/neco.2007.19.6.1503 | |
| % | |
| % Example usage: | |
| % optN = sshist(x); hist(x,optN); | |
| % | |
| % Input argument | |
| % x: Sample data vector. | |
| % N (optinal): | |
| % A vector that specifies the number of bins to be examined. | |
| % The optimal number of bins is selected from the elements of N. | |
| % Default value is N = 2:100. | |
| % * Do not search binwidths smaller than a sampling resolution of data. | |
| % | |
| % Output argument | |
| % optN: Optimal number of bins. | |
| % N: Bin numbers examined. | |
| % C: Cost function of N. | |
| % | |
| % See also SSKERNEL | |
| % | |
| % This program is distributed under the terms of | |
| % the Creative Commons Attribution License (CC-BY) | |
| % Hideaki Shimazaki, Ph.D. | |
| % http://2000.jukuin.keio.ac.jp/shimazaki | |
| %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% | |
| % Parameters Setting | |
| x = reshape(x,1,numel(x)); | |
| x_min = min(x); | |
| x_max = max(x); | |
| if nargin < 2 | |
| buf = abs(diff(sort(x))); | |
| dx = min(buf(logical(buf ~= 0))); | |
| N_MIN = 2; % Minimum number of bins (integer) | |
| % N_MIN must be more than 1 (N_MIN > 1). | |
| N_MAX = min(floor((x_max - x_min)/(2*dx)),100); | |
| % Maximum number of bins (integer) | |
| N = N_MIN:N_MAX; % # of Bins | |
| end | |
| SN = 30; % # of partitioning positions for shift average | |
| D = (x_max - x_min) ./ N; % Bin Size Vector | |
| %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% | |
| % Computation of the Cost Function | |
| Cs = zeros(length(N),SN); | |
| for i = 1: length(N) | |
| shift = linspace(0,D(i),SN); | |
| for p = 1 : SN | |
| edges = linspace(x_min+shift(p)-D(i)/2,... | |
| x_max+shift(p)-D(i)/2,N(i)+1); % Bin edges | |
| ki = histc(x,edges); % Count # of events in bins | |
| ki = ki(1:end-1); | |
| k = mean(ki); % Mean of event count | |
| v = sum( (ki-k).^2 )/N(i); % Variance of event count | |
| Cs(i,p) = ( 2*k - v ) / D(i)^2; % The Cost Function | |
| end | |
| end | |
| C = mean(Cs,2); | |
| %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% | |
| % Optimal Bin Size Selectioin | |
| [Cmin idx] = min(C); | |
| optN = N(idx); % Optimal number of bins | |
| optD = D(idx); % *Optimal binwidth | |
| [Cminp idxp] = min(Cs(idx,:)); | |
| shift = linspace(0,D(idx),SN); | |
| edges = linspace(x_min+shift(idxp)-D(idx)/2,... | |
| x_max+shift(idxp)-D(idx)/2,N(idx)+1); | |
| %edges = linspace(x_min,x_max,N(idx)); % Optimal segmentation |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment