Setup Hadoop on Windows 10

  1. Download Hadoop 3.2.2 from the Apache Archives and extract it to C:/hadoop/
  2. Download any build of JDK 8. The older, the better.
  3. Any other version of Hadoop can be used if someone has already built winutils for that version. You can also build it yourself using this tutorial, which is easy if you are familiar with using UNIX commands on Windows (Cygwin can be used for this). So, download the winutils for Hadoop 3.2.2 by pasting the link to the GitHub folder into this [website](https://download-directory.githu
Umair444 / normalizeData.m
Last active January 16, 2022 08:36
Normalize Data into Different Tables
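T = readtable('original.csv');
% n = cellfun(@(cIn) strsplit(cIn, ','), T.country, 'UniformOutput', false);
C = {0,0,0,0}; count = 0; U = {0,0,0,0};
for i = [4, 5, 6, 11] % Director, Cast, Countries and Genre
    tab = T{:, i}; % multi-valued column: one comma-separated string per row
    % Pair each show_id with the cell array of values split out of its row
    n = [T.show_id regexp(tab, ', ', 'split')];
    len = cellfun(@numel, n(:,2)); % number of values in each row
    repeat = repelem(n(:,1), len); % repeat each show_id once per value
    % Keep the text after the "s" and convert it to a number (IDs assumed to look like "s12")
    repeat = str2double(extractAfter(repeat, "s"));
    t = [table(repeat, 'VariableNames', {'show_id'}) cell2table([n{:,2}]', ...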
Umair444 / sptialJoin.m
Created January 16, 2022 08:18
Spatial Joining MATLAB - Shift Attributes using spatial data
% FORMAT:
% DATA1 = [ID1 Lat Lon]
% DATA2 = [ID2 data Lat Lon]
%% Remove rows with NaN latitude first
DATA1(isnan(DATA1.Lat), :) = [];
DATA2(isnan(DATA2.Lat), :) = [];
%% Calculate distances
% Preallocate: one row per DATA2 entry, one column per DATA1 entry
dist = zeros(length(DATA2.ID2), length(DATA1.ID1));
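function d = greatCircleDistance(lat1, lon1, lat2, lon2)
% Great-circle distance in meters between two points given in degrees.
R = 6371.1e3; % Mean Earth radius in meters
p1 = lat1*(pi/180); % latitudes in radians
p2 = lat2*(pi/180);
p = (lat2-lat1)*(pi/180); % latitude difference in radians
l = (lon2-lon1)*(pi/180); % longitude difference in radians
% Element-wise operators, so scalars and same-size arrays both work
a = sin(p/2).^2 + cos(p1).*cos(p2).*sin(l/2).^2;
c = 2*atan2(sqrt(a), sqrt(1-a));
d = R.*c;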
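
The arithmetic in greatCircleDistance matches the haversine formula. As a reading aid, here it is written out with the code's intermediate names noted beside each step. The Greek symbols and the worked check at the end are added here for illustration and are not part of the original gist.

```latex
\begin{align*}
\varphi_1 &= \mathrm{lat}_1 \cdot \tfrac{\pi}{180}, \qquad
\varphi_2 = \mathrm{lat}_2 \cdot \tfrac{\pi}{180}
  && \text{(p1, p2)} \\
\Delta\varphi &= (\mathrm{lat}_2 - \mathrm{lat}_1) \cdot \tfrac{\pi}{180}, \qquad
\Delta\lambda = (\mathrm{lon}_2 - \mathrm{lon}_1) \cdot \tfrac{\pi}{180}
  && \text{(p, l)} \\
a &= \sin^2\!\left(\tfrac{\Delta\varphi}{2}\right)
   + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\tfrac{\Delta\lambda}{2}\right)
  && \text{(a)} \\
c &= 2 \, \operatorname{atan2}\!\left(\sqrt{a},\ \sqrt{1 - a}\right)
  && \text{(c)} \\
d &= R \, c, \qquad R = 6371.1 \times 10^{3}\ \text{m}
  && \text{(d)}
\end{align*}

% Worked check (illustrative inputs): from (0^\circ, 0^\circ) to (0^\circ, 90^\circ)
\begin{align*}
\Delta\varphi &= 0, \qquad \Delta\lambda = \tfrac{\pi}{2}, \qquad
a = \sin^2\!\left(\tfrac{\pi}{4}\right) = \tfrac{1}{2} \\
c &= 2 \, \operatorname{atan2}\!\left(\sqrt{\tfrac{1}{2}},\ \sqrt{\tfrac{1}{2}}\right)
   = \tfrac{\pi}{2} \\
d &= R \cdot \tfrac{\pi}{2} \approx 1.0008 \times 10^{7}\ \text{m}
\end{align*}
```

The check corresponds to a quarter of a great circle, so the result should be about one quarter of the Earth's circumference, which gives a quick sanity test for the function's units (meters, with inputs in degrees).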