Skip to content

Instantly share code, notes, and snippets.

@shimazaki
Created May 28, 2012 09:20
Show Gist options
  • Save shimazaki/2818109 to your computer and use it in GitHub Desktop.
Save shimazaki/2818109 to your computer and use it in GitHub Desktop.
function [y, t, optw, yconf, C, W] = sskernel(x,t,W)
% [optW, C, W] = sskernel(x,W,t)
%
% Function `sskernel' returns an optimal bandwidth (standard deviation)
% of the Gauss density function used in kernel density estimation.
% Optimization principle is to minimize expected L2 loss function between
% the kernel estimate and an unknown underlying density function.
% An assumption made is merely that samples are drawn from the density
% independently each other.
%
% The optimal bandwidth is obtained as a minimizer of the formula,
% sum_{i,j} \int k(x - x_i) k(x - x_j) dx - 2 sum_{i~=j} k(x_i - x_j),
% where k(x) is the kernel function.
%
% For more information, visit
% http://2000.jukuin.keio.ac.jp/shimazaki/res/kernel.html
%
% Original paper:
% Hideaki Shimazaki and Shigeru Shinomoto
% Kernel Bandwidth Optimization in Spike Rate Estimation
% Journal of Computational Neuroscience 2010
% http://dx.doi.org/10.1007/s10827-009-0180-4
%
% Example usage:
% optw = sskernel(x); ksdensity(x,'width',optw);
% use fftkernel.m for much faster density estimate.
%
% Input argument
% x: Sample data vector.
% W (optinal):
% A vector of kernel bandwidths.
% The optimal bandwidth is selected from the elements of W.
% Default value is W = logspace(log10(2*dx),log10((x_max - x_min)),50).
% * Do not search bandwidths smaller than a sampling resolution of data.
% str (optional):
% String that specifies the kernel type.
% This option is reserved for future extention.
% Default str = 'Gauss'.
%
% Output argument
% optW: Optimal kernel bandwidth.
% W: Kernel bandwidths examined.
% C: Cost functions of W.
%
% See also SSHIST
%
% Copyright (c) 2009 2010, Hideaki Shimazaki
% http://2000.jukuin.keio.ac.jp/shimazaki
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Parameters Settings
x = reshape(x,1,numel(x));
if nargin == 1
buf = sort(abs(diff(sort(x))));
dt = min(buf(logical(buf ~= 0)));
t = min(x)-1*(max(x) - min(x)): dt: max(x)+1*(max(x) - min(x));
else
dt = min(diff(t));
end
if nargin < 3 %1 or 2
Wmin = 2*dt; Wmax = 1*(max(x) - min(x));
W = logspace(log10(Wmin),log10(Wmax),50);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Compute a Cost Function
N_total = length(x);
y_hist = histc(x,t-dt/2)/dt/N_total;
%y_hist = histc(x,t-dt/2)/dt;
L = length(y_hist);
idx = find(y_hist ~= 0);
y_ha = y_hist(idx)';
C = zeros(1,length(W));
C_min = Inf;
for k = 1: length(W)
w = W(k);
yh = fftkernel(y_hist,w/dt);
%rate
%C(k) = sum(yh.^2)*dt - 2* yh(idx)*y_ha*dt...
% + 2*1/sqrt(2*pi)/w*N_total;
%density
C(k) = sum(yh.^2)*dt - 2* yh(idx)*y_ha*dt...
+ 2*1/sqrt(2*pi)/w/N_total;
C(k) = C(k) * N_total* N_total;
% Optimal Bin Size Selection
if C(k) < C_min
C_min = C(k);
optw = w;
y = yh;
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Optimal Bin Size Selection
%[optC,nC]=min(C); optw = W(nC);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Bootstrap Confidence Interval
if nargout == 0 || nargout >= 4
nbs = 1*1e3;
yb = zeros(nbs,L);
for i = 1: nbs
xb = poissrnd(y_hist*dt*N_total)/dt/N_total;
yb(i,:) = fftkernel(xb,optw/dt);
end
ybsort = sort(yb);
y95b = ybsort(floor(0.05*nbs),:);
y95u = ybsort(floor(0.95*nbs),:);
yconf = [y95b; y95u];
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Display Results
if nargout == 0
p = line([min(t) max(t)],[0 0]);
set(p,'Color',[.1 .1 .1]);
hold on; line([t; t],[y95b; y95u]...
,'Color',[7 7 7]/8,'LineWidth',1 );
hold on; plot(t,y95b,'Color',[7 7 7]/9,'LineWidth',1);
hold on; plot(t,y95u,'Color',[7 7 7]/9,'LineWidth',1);
plot(t,y,'Color',[0.9 0.2 0.2],'LineWidth',2);
grid on;
ylabel('density');
set(gca,'TickDir','out');
end
function y = fftkernel(x,w)
% y = fftkernel(x,w)
%
% Function `fftkernel' applies the Gauss kernel smoother to
% an input signal using FFT algorithm.
%
% Input argument
% x: Sample signal vector.
% w: Kernel bandwidth (the standard deviation) in unit of
% the sampling resolution of x.
%
% Output argument
% y: Smoothed signal.
%
% MAY 5/23, 2012 Author Hideaki Shimazaki
% RIKEN Brain Science Insitute
% http://2000.jukuin.keio.ac.jp/shimazaki
L = length(x);
n = 2^nextpow2(L);
%n = 2^ceil(log2(L));
X = fft(x,n);
f = (0:n-1)/n;
f = [-f(1:n/2+1) f(n/2:-1:2)];
K = exp(-0.5*(w*2*pi*f).^2);
y = ifft(X.*K,n);
y = y(1:L);
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment