@athlan
Last active August 29, 2015 14:22
RemoveValuesUnderHistogramTreshold
function [ data_filtered ] = RemoveValuesUnderHistogramTreshold( data, percentageTreshold, numberOfHistogramBins )
%RemoveValuesUnderHistogramTreshold Remove values that fall into sparsely
% populated histogram bins. The data range [min, max] is partitioned by
% evenly spaced edges into equal-width bins. Every value whose bin count,
% relative to the count of the fullest bin, is below percentageTreshold
% is removed.
%
% data_filtered = RemoveValuesUnderHistogramTreshold(DATA, percentageTreshold, numberOfHistogramBins)
% (vector) DATA : The vector with data
% (double) percentageTreshold : Minimum bin count as a fraction of the
% fullest bin's count. Values in bins below it are rejected, e.g. 0.5
% rejects values from bins holding fewer than 50% of the maximum count
% (integer) numberOfHistogramBins : Number of bins returned by histc()
% (the last bin holds only values equal to the maximum)
%
% Author: Piotr Pelczar (me@athlan.pl)
%
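% Build numberOfHistogramBins evenly spaced edges spanning the data range.
edges_min = min(data);
edges_max = max(data);
edges_step = abs((edges_max-edges_min)/(numberOfHistogramBins - 1));
edges = edges_min:edges_step:edges_max;
% Count the values per bin and record the bin index of every value.
[bincount, data_bin_idx] = histc(data, edges);
% Flag bins whose count, relative to the fullest bin, is below the threshold.
bincount_max = max(bincount);
edges_under_treshold = ((bincount/bincount_max) < percentageTreshold);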
% Indices of the flagged bins (works for any number of bins, not only 10).
edges_under_treshold_idx = find(edges_under_treshold);
% Drop every value that belongs to a flagged bin.
data_bin_idx_under_treshold = ismember(data_bin_idx, edges_under_treshold_idx);
data_filtered = data;
data_filtered(data_bin_idx_under_treshold) = [];
end
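
A short usage sketch may make the binning and threshold steps easier to follow. The sample vector below is hypothetical and not from the original gist. It assumes the function above is on the MATLAB path and uses 10 bins, so the edges work out to exactly 0:1:9.

```matlab
% Hypothetical sample: two frequent values (1 and 2) plus two rare outliers (0 and 9).
data = [1 1 1 1 2 2 2 2 0 9];

% The same binning the function performs for this input: 10 edges from min to max.
edges = 0:1:9;
[bincount, data_bin_idx] = histc(data, edges);
disp(bincount);      % values: 1 4 4 0 0 0 0 0 0 1
disp(data_bin_idx);  % values: 2 2 2 2 3 3 3 3 1 10

% The fullest bin holds 4 values, so bins with fewer than 0.5 * 4 = 2 values are rejected.
data_filtered = RemoveValuesUnderHistogramTreshold(data, 0.5, 10);
disp(data_filtered); % values: 1 1 1 1 2 2 2 2
```

The outliers 0 and 9 each sit alone in their bin (relative count 0.25), so they fall under the 0.5 threshold and are removed, while the bins holding the values 1 and 2 survive.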