Last active
September 25, 2017 13:41
t.Properties                      Access table properties
t{i,j}                            Access the data by index
t(2:end,[1 5])                    Make a new table from specific rows/cols of t
t(:,{'var1','var2'})              Columns can also be selected by name
t.newVar = ...                    Add a new column to the table
d = datetime(timestamp)           Create a datetime array
hour(d), day(d), ...              Extract components of a datetime
t{[i j],'colName'} = {'';''}      Modify the values in rows i and j of a column
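The indexing forms above can be tried on a small table. This is only a sketch: the table contents and the variable names (id, label, score, doubled) are made up for illustration.

```matlab
% Hypothetical sample table; names and values are illustrative only.
t = table([1;2;3], {'a';'b';'c'}, [10;20;30], ...
    'VariableNames', {'id','label','score'});

sub  = t(2:end, [1 3]);            % rows 2..end, columns 1 and 3, as a new table
same = t(2:end, {'id','score'});   % the same columns selected by name
t.doubled = t.score * 2;           % dot assignment appends a new variable
t{[1 3], 'label'} = {''; ''};      % overwrite rows 1 and 3 of one column
names = t.Properties.VariableNames;
```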
When sorting a list that is paired with another list,
also capture the sort indices so that the pairing between both lists is kept:
% x and y are paired vectors of equal length
[sorted_x, idx_vals] = sort(x);
sorted_y = y(idx_vals);
plot(sorted_x,sorted_y);
enums -> categorical(list)                    More memory efficient than a cell array of text
categories(list)                              List the categories found in a categorical array
mergecats(list,{'','',''},biggerCategory)     Merge several categories into one bigger category
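A minimal sketch of the three categorical functions above. The color data and the merged category name 'cool' are assumptions chosen for the example.

```matlab
colors = {'red';'blue';'red';'green';'blue'};   % illustrative data
c = categorical(colors);                         % convert text to categorical
cats = categories(c);                            % categories, sorted alphabetically
c2 = mergecats(c, {'blue','green'}, 'cool');     % merge two categories into one
nCool = nnz(c2 == 'cool');                       % count members of the merged category
```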
useful flags:
'omitnan'                         Ignore NaN values (e.g. in mean, sum, max)
reading tables: 'HeaderLines', 'CommentStyle'
set operations:
setdiff(a,b)                      Return the values in a that are not in b
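A short illustration of setdiff with made-up vectors. By default the result is sorted and duplicates are removed; the 'stable' option keeps the original order.

```matlab
a = [5 1 3 3 7];                  % illustrative data
b = [3 9];
d  = setdiff(a, b);               % values of a not in b, sorted and unique
d2 = setdiff(a, b, 'stable');     % same values, in order of first appearance in a
```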
useful functions:
nnz: count the nonzero elements of a matrix
any: determine which rows/cols contain at least one true (nonzero) value
ismissing: find all NaN, NaT and <undefined> entries in an array
isnan: same as ismissing but only for NaN
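The functions above are often combined to locate missing data. The sketch below uses small made-up arrays; it also shows the 'omitnan' flag on mean.

```matlab
v = [1 NaN 3; 4 5 NaN];                 % illustrative numeric data
nMissing    = nnz(isnan(v));            % number of NaN entries
colsWithNaN = any(isnan(v), 1);         % which columns contain a NaN
rowsWithNaN = any(isnan(v), 2);         % which rows contain a NaN
colMeans    = mean(v, 'omitnan');       % column means, ignoring NaN

d = [datetime(2017,9,25); NaT];         % NaT marks a missing datetime
c = categorical({'a'; ''});             % '' becomes <undefined>
missingDates = ismissing(d);
missingCats  = ismissing(c);
```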
When importing the data:
The missing numeric values are replaced by NaN.
The missing datetime values are replaced by NaT (not a time).
The missing categorical values are replaced by <undefined>.
Ex (merge every valid category into 'Land', then rename 'N/A' to 'Sea'):
valid_cats = categories(data.Var);
valid_cats = setdiff(valid_cats, 'N/A');
data.NewVar = mergecats(data.Var, valid_cats, 'Land');
data.NewVar = renamecats(data.NewVar, 'N/A', 'Sea');
Discretize: discretize(data.Var,bins,'categorical',catnames)     Bin numeric data into named categories
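A small sketch of discretize with the 'categorical' option. The mpg values are invented; each bin includes its left edge, and the last bin also includes its right edge.

```matlab
mpg   = [12 25 40 19 30];                 % illustrative data
bins  = [0 20 30 70];                     % edges: [0,20), [20,30), [30,70]
names = {'Low','Medium','High'};          % one name per bin
cls   = discretize(mpg, bins, 'categorical', names);
isLow = cls == 'Low';                     % logical index for one class
```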
function handle = reference to a function. Allows passing a function as an argument -> fct = @fct_name
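A few ways a function handle can be created and passed around. The names sq, h and applyTwice are hypothetical helpers for this sketch.

```matlab
sq = @(x) x.^2;                          % handle to an anonymous function
h  = @max;                               % handle to an existing named function
applyTwice = @(f, x) f(f(x));            % a function that takes a handle as input

r1 = applyTwice(sq, 3);                  % (3^2)^2
r2 = h([1 5 2]);                         % calls max through the handle
r3 = cellfun(@numel, {'ab','cde',''});   % handle passed to a built-in
```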
When indexing into a table: () extracts a table -> keeps variable info but most math operations can't be applied directly
{} extracts the underlying data (array or cell) -> operations can be applied but variable info is lost
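The difference between the two indexing styles is easiest to see by checking the type of the result. The table below is a made-up example.

```matlab
t = table([1;2;3], [10;20;30], 'VariableNames', {'id','score'});

asTable = t(:, 'score');                 % parentheses: still a table
asArray = t{:, 'score'};                 % braces: a plain double column
avg     = mean(asArray);                 % math works on the extracted array
scaled  = t{:, {'id','score'}} * 2;      % several numeric columns come out as one matrix
```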
Main Datatypes:
Struct and cell arrays: Use when heterogeneous datatypes and no need for math ops.
- Struct: data accessible by name. s = struct()
- Cell: data accessible by index. c = {}
  different-sized strings s = {'', '', ''}
  add/modify value -> c{i} = ''
  delete cell -> c(i) = []
Matrix: Use when homogeneous datatypes and lots of math ops. v = []
  same-sized strings (one per row) s = [''; ''; '']
Table: Use when heterogeneous variables with the same number of rows. t = table()
  readtable to import from a spreadsheet or text file
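A compact sketch of the container types listed above. All field names and values are invented for illustration.

```matlab
s = struct('name', 'Ada', 'age', 36);    % struct: access by field name
s.city = 'London';                       % add a new field

c = {'short', 'a longer string', 42};    % cell: mixed types and sizes
c{2} = 'changed';                        % modify the content of a cell
c(3) = [];                               % delete the third cell

m = ['abc'; 'def'];                      % char matrix: rows must have equal length
v = [1 2 3] * 2;                         % numeric matrix: math applies directly
```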
Joining Tables:
[T1;T2]                               Vertical concatenation (both tables need the same variable names)
join(T1,T2)                           Every key value in the first input must exist in the second input. If not, reverse the input order.
innerjoin(T1,T2)                      Inner join -> keeps keys present in both tables (AND)
outerjoin(T1,T2,'MergeKeys',true)     Outer join -> keeps keys present in either table (OR). Unmatched numeric entries become NaN
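The join variants can be compared on two small tables that share a key. The tables and variable names (key, left, right) are assumptions made for this sketch.

```matlab
T1 = table([1;2;3], {'a';'b';'c'}, 'VariableNames', {'key','left'});
T2 = table([2;3;4], [20;30;40],    'VariableNames', {'key','right'});

inner   = innerjoin(T1, T2);                      % only keys found in both tables
outer   = outerjoin(T1, T2, 'MergeKeys', true);   % keys found in either table
stacked = [T1; T1];                               % vertical concatenation
```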
Basic graph properties:
hold              Plot replacement behavior of axes
xlim/ylim/zlim    Limits of the appropriate axis
grid              Axes grid lines
axis              Axis limits, shape, and appearance
colormap          Map used to assign indexed colors in a figure
view              3-D viewpoint of axes
axis(style)       style = square, tight, auto, fill
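A minimal sketch that exercises several of these commands on one plot. The sine and cosine curves are placeholder data.

```matlab
x = linspace(0, 2*pi, 50);        % illustrative data
plot(x, sin(x));
hold on;                          % keep the first line when plotting the second
plot(x, cos(x));
hold off;

xlim([0 2*pi]);                   % set the x-axis limits
ylim([-1.5 1.5]);                 % set the y-axis limits
grid on;                          % show grid lines
axis square;                      % make the axes box square
```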
DATASTORE:
Use when importing data into a table from multiple files
1. Create the datastore
2. Adjust the datastore properties
3. Import the data
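The three steps can be sketched end to end as follows. To stay self-contained, the example first writes two tiny CSV files to a temporary folder; the file names and the variables a and b are invented.

```matlab
% Write two small sample files (illustrative data only)
folder = tempname;
mkdir(folder);
writetable(table([1;2], [3;4], 'VariableNames', {'a','b'}), fullfile(folder, 'part1.csv'));
writetable(table([5;6], [7;8], 'VariableNames', {'a','b'}), fullfile(folder, 'part2.csv'));

ds = datastore(fullfile(folder, '*.csv'));   % 1. create the datastore
ds.SelectedVariableNames = {'a'};            % 2. adjust its properties
data = readall(ds);                          % 3. import everything into one table
```

Use read(ds) instead of readall(ds) to import one chunk at a time when the files are too large for memory.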
GROUPING DATA BY VARIABLES
[grp_nums,grp_vals] = findgroups(v)    -> given a list, map each unique value to a group number 1,2,3,...
histcounts(list)                       -> count the number of occurrences of each group in list
splitapply(@fct,data,groups)           -> split data by the previously created groups and apply the input function to each group
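A short sketch of the grouping workflow with made-up cylinder counts and mileage values.

```matlab
cyl = [4; 6; 4; 8; 6; 4];                         % illustrative grouping variable
mpg = [30; 22; 34; 15; 20; 32];                   % illustrative data

[g, vals] = findgroups(cyl);                      % g holds a group number per row
avg       = splitapply(@mean, mpg, g);            % one mean per group
counts    = histcounts(g, 'BinMethod', 'integers'); % rows per group number
```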
MANIPULATING PLOTS
fig = gcf       Handle to the current figure
ax = gca        Handle to the current axes
obj = gco       Handle to the current (last selected) object
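Once a handle is captured, its properties can be read or set with dot notation. A minimal sketch on a placeholder plot:

```matlab
plot(1:10);               % illustrative plot
fig = gcf;                % current figure
ax  = gca;                % current axes

fig.Color   = 'w';        % white figure background
ax.FontSize = 12;         % larger tick labels
ax.XGrid    = 'on';       % vertical grid lines only
```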
========================================================================
% Read Table
data = readtable('fuelEconomy.txt','HeaderLines', 4);
% Find and delete NaN rows
nan_idx = ismissing(data.CombinedMPG);
data(nan_idx,:) = [];
% Categorize variable
bins = [0 20 30 70];
cat_names = {'Low','Medium','High'};
MPGClass = discretize(data.CombinedMPG,bins,cat_names);
MPGClass = categorical(MPGClass);
low_idx = MPGClass == 'Low';
medium_idx = MPGClass == 'Medium';
high_idx = MPGClass == 'High';
% Plot
scatter(data.CityMPG(low_idx), data.HighwayMPG(low_idx),8,'r','filled');
hold on;
scatter(data.CityMPG(medium_idx), data.HighwayMPG(medium_idx),8,'b','filled');
scatter(data.CityMPG(high_idx), data.HighwayMPG(high_idx),8,'k','filled');
legend('Low','Medium','High');
grid('on');
xlabel('City MPG');
ylabel('Highway MPG');
hold off;
##### FANCY PLOT
% Read data
dat = datastore('fuelEconomy2.txt');
dat.ReadSize = 362;
data = dat.read;
% Group by number of cylinders
[gNum,gVal] = findgroups(data.NumCyl);
% Find average by groups
avgMPG = splitapply(@mean,data.CombinedMPG,gNum);
% Create a bar chart
b = bar(avgMPG);
xlabel('Number of cylinders')
title('Average MPG')
% Customize the chart
f = gcf;
a = gca;
f.Color = [0.81 0.87 0.9];
a.Color = [0.81 0.87 0.9];
a.Box = 'off';
a.YAxisLocation = 'right';
a.YGrid = 'on';
a.GridColor = [1 1 1];
a.GridAlpha = 1;
a.XTickLabel = gVal;
a.YLim = [0 40];
ax = a.XAxis;
ax.TickDirection = 'out';
b.FaceColor = [0,0.31,0.42];
b.BarWidth = 0.5;
##### SURF
%% Create an XY grid from the raw data
xv = min(x):0.01:max(x);
yv = min(y):0.01:max(y);
[X,Y] = meshgrid(xv,yv);
%% Interpolate the raw data and evaluate it on the X,Y grid
Z = griddata(x,y,z,X,Y);
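A self-contained version of the same steps, ending with the surf call itself. The scattered x, y, z samples are random placeholder data, and the grid step of 0.05 is an arbitrary choice for the sketch.

```matlab
rng(0);                                  % reproducible placeholder data
x = rand(200,1);
y = rand(200,1);
z = x.^2 + y.^2;                         % illustrative scattered samples

xv = min(x):0.05:max(x);                 % regular grid vectors
yv = min(y):0.05:max(y);
[X,Y] = meshgrid(xv, yv);
Z = griddata(x, y, z, X, Y);             % interpolate onto the grid

surf(X, Y, Z);                           % draw the interpolated surface
shading interp;
colorbar;
```

Grid points that fall outside the convex hull of the samples come back as NaN and are simply left out of the surface.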
##### PCOLOR
The pcolor function has the same syntax as surf. In fact, it creates a flat surface with ZData all set to 0 and CData set to Z.
>> pcolor(X,Y,Z)
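This can be checked on the returned surface object. The sketch uses the built-in peaks function as placeholder data.

```matlab
[X,Y,Z] = peaks(30);          % illustrative gridded data
h = pcolor(X, Y, Z);          % returns a surface object
shading flat;                 % hide the cell edges
colorbar;

flatZ  = all(h.ZData(:) == 0);     % the surface itself is flat
sameC  = isequal(h.CData, Z);      % the colors carry the Z values
```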
### LOOPING THROUGH FILE
%% Open file
fid = fopen('hurricaneData1960.txt');
%% Loop that reads the data until the end of the file
while ~feof(fid)
    % Read header line
    headerLine = fgetl(fid);
    % Split up the line
    txtCell = strsplit(headerLine);
    % Find locations of M and =
    mWhere = strcmp(txtCell,'M');
    eqWhere = strcmp(txtCell,'=');
    % Find the index value that follows 'M ='
    idx = all([0 0 mWhere(1:end-2); 0 eqWhere(1:end-1)]);
    % Assign that number to nLines
    nLines = txtCell{idx};
    nLines = str2double(nLines);
    % Skip the next line
    fgetl(fid);
    % Read in nLines of data (data is overwritten on every pass)
    data = textscan(fid,'%D%f%f%f%f',nLines,'Delimiter','\t');
    % Read the rest of the line
    fgetl(fid);
end
%% All done. Close the file
fclose(fid);