Skip to content

Instantly share code, notes, and snippets.

@biogeo
Created December 8, 2013 01:23
Show Gist options
  • Save biogeo/7852243 to your computer and use it in GitHub Desktop.
Save biogeo/7852243 to your computer and use it in GitHub Desktop.
Under formatted string printing (*printf), Matlab writes NaNs as 'NaN'. What if you want it to be something else? E.g., because you are sending the data to SQL and need NaNs indicating missing values to appear as '\N' or 'NULL'. This does it reasonably quickly.
% Suppose you have a vector x containing a bunch of numeric data, including NaNs.
N = 100000; % The amount of data we have
x = randi(1000,N,1); % Our dummy data, a bunch of integers 0-1000.
x(randperm(N,round(N/10))) = NaN; % Make 10 percent of them NaNs.
% And you want to write a text table with this along with some other columns:
w = cellstr(char(randi(26,N,4)+'a'-1)); % Make a bunch of random 4-letter strings
y = 100*rand(N,1); % And a bunch of random numbers 0-100.
% Normally you might just do something like:
outputCellArray = [w, num2cell(x), num2cell(y)]';
s = sprintf('%s\t%d\t%7.3f\n', outputCellArray{:});
% And this will work fine, but the NaNs will appear as the string 'NaN'.
% If you are, for example, trying to create a file suitable for use with SQL's
% LOAD DATA INFILE command, and your NaNs indicate missing data, then you need
% them to appear as '\N'.
% One solution is to manually convert x from a numeric array to a cell array of
% strings, and then replace the 'NaN' entries with '\N'. This is the fastest way
% that I found to do that:
xStr = cellstr(num2str(x,'%-d'));
xStr(isnan(x)) = {'\N'};
% Note that this does require x to be a column vector. The '%-d' format string
% is required to left-justify the output numbers so that cellstr can trim away
% the trailing spaces. Of course, if x is not a column vector, it's easy to
% handle:
xStr = cellstr(num2str(x(:),'%-d'));
xStr = reshape(xStr, size(x));
xStr(isnan(x)) = {'\N'};
% (Incidentally, cellstr(num2str(x,'%-d')) is much faster than converting x to a
% cell array via num2cell and then iterating with num2str.)
% Now it's just a simple matter of changing the format string to print strings
% instead of numbers:
outputCellArray = [w, x, num2cell(y)]';
s = sprintf('%s\t%s\t%7.3f\n', outputCellArray{:});
% And that's all there is to it. Matlab certainly doesn't make text processing
% easy, but being a little tricky often gets the job done.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment