{{ message }}

Instantly share code, notes, and snippets.

Created Aug 10, 2017
attempt to explain why broadcasting is bae
 A starter example is centering a matrix ```{python} import numpy as np X = np.random.normal(size=(3, 3)) # random 3 by 3 matrix mu = X.mean(axis = 0) # array (vector) of column means print("Original matrix") print(X) print("\nArray of column means") print(mu) # array effectively copied till vector matches matrix in size print("\nBroadcast subtraction operation") print(X - mu) ``` To do this in R you have to loop or use something like map/apply. This lets you create kernel matrices in a fast and fairly readable way, for example. ```{python} import numpy as np X = np.random.normal(size=(10, 5)) Y = np.random.normal(size=(2, 5)) sigma = 0.1 # calculate all pairwise distances between rows in X and Y # (x - y)^2 = a^2 - 2 ab + b^2 dist_sq = np.square(X).sum(axis=1)[:, np.newaxis] - 2.0 * np.dot(X, Y.T) + np.square(Y).sum(axis=1) # turn this into an RBF kernel (math might be off, but hopefully the point is clear) rbf = np.exp(-np.sqrt(dist_sq) / (2 * sigma ** 2)) print(dist_sq) print(rbf) ``` In R you have to hope someone vectorized the components of the operation that you care about, copy an object along an appropriate dimension, or use loops/applys/maps, which I think make the math harder to think about.

### DavisVaughan commented Aug 10, 2017 • edited

 I had way too much fun with this. I'm starting to appreciate broadcasting, so I took a stab at how you might solve these two specific problems with new functions that attempt to mimic basic broadcasting for subtraction and addition (%-% and %+%). The first bit it me exploring what R does natively, then I create the functions, then test a few things! I am completely aware the functions would need to be dramatically improved, but I think its kind of neat. FYI - the most important bit is the Pairwise Distance chunk down at the bottom so you may want to just skip to that first. ```# GENERAL EXPLORATION -------------------------------------------------- # Random 3x3 matrix X <- matrix(c(-1.07230736, -0.13967851, 0.88094962, 0.69717778, -0.11573738, -0.57242203, 0.1390293 , -1.77457788, 1.45084404), nrow = 3, byrow = TRUE) # Col means are returned as a vector, not a matrix mu <- colMeans(X) mu #>  -0.07870009 -0.67666459 0.58645721 # Normal subtraction does this wrong # The first element of mu is subtracted from the entire first row of X # Second element of mu subtracted from the second row of X and so on X - mu #> [,1] [,2] [,3] #> [1,] -0.9936073 -0.06097842 0.9596497 #> [2,] 1.3738424 0.56092721 0.1042426 #> [3,] -0.4474279 -2.36103509 0.8643868 # You could exploit that fact like this, but its kinda ugly t(t(X) - mu) #> [,1] [,2] [,3] #> [1,] -0.9936073 0.5369861 0.2944924 #> [2,] 0.7758779 0.5609272 -1.1588792 #> [3,] 0.2177294 -1.0979133 0.8643868 # Or tell R to make a matrix of mu. Telling it 3x3 will extend it automatically # But this is obviously making copies to extend it :( X - matrix(mu, nrow=3, ncol=3, byrow = TRUE) #> [,1] [,2] [,3] #> [1,] -0.9936073 0.5369861 0.2944924 #> [2,] 0.7758779 0.5609272 -1.1588792 #> [3,] 0.2177294 -1.0979133 0.8643868 # R isnt "smart" enough to know how to broadcast, and just throws an error # when you have non conformable arrays mu <- matrix(mu, nrow = 1) X - mu #> Error in X - mu: non-conformable arrays # Loop to prevent copying # This is interesting. Generalize this? X_mu <- matrix(nrow = 3, ncol = 3) for(i in 1:nrow(X)) { X_mu[i,] <- X[i,] - mu } X_mu #> [,1] [,2] [,3] #> [1,] -0.9936073 0.5369861 0.2944924 #> [2,] 0.7758779 0.5609272 -1.1588792 #> [3,] 0.2177294 -1.0979133 0.8643868 # BROADCAST FUNCTIONS? ------------------------------------------------------------------- # New broadcast subtraction function? `%-%` <- function(e1, e2) { # I know this won't be generalizable # But it sets up the empty matrix to be filled with the result dim_row <- max(nrow(e1), nrow(e2)) dim_col <- max(ncol(e1), ncol(e2)) x <- matrix(nrow = dim_row, ncol = dim_col) # e1 - e2 & 1 has more rows & equal cols if(nrow(e1) >= nrow(e2) & ncol(e1) == ncol(e2)) { case <- "1" # e1 - e2 & 2 has more rows & equal cols } else if(nrow(e1) < nrow(e2) & ncol(e1) == ncol(e2)) { case <- "2" # e1 - e2 & 1 has more cols & equal rows } else if(ncol(e1) >= ncol(e2) & nrow(e1) == nrow(e2)) { case <- "3" # e1 - e2 & 2 has more cols & equal rows } else if(ncol(e1) < ncol(e2) & nrow(e1) == nrow(e2)) { case <- "4" # Fail } else { stop("Incorrect dims") } switch(case, "1" = for(i in 1:dim_row) { x[i,] <- e1[i,] - e2 }, "2" = for(i in 1:dim_row) { x[i,] <- e1 - e2[i,] }, "3" = for(i in 1:dim_col) { x[,i] <- e1[,i] - e2 }, "4" = for(i in 1:dim_col) { x[,i] <- e1 - e2[,i] }) x } # New broadcast addition function? `%+%` <- function(e1, e2) { # I know this won't be generalizable # But it sets up the empty matrix to be filled with the result dim_row <- max(nrow(e1), nrow(e2)) dim_col <- max(ncol(e1), ncol(e2)) x <- matrix(nrow = dim_row, ncol = dim_col) # e1 - e2 & 1 has more rows & equal cols if(nrow(e1) >= nrow(e2) & ncol(e1) == ncol(e2)) { case <- "1" # e1 - e2 & 2 has more rows & equal cols } else if(nrow(e1) < nrow(e2) & ncol(e1) == ncol(e2)) { case <- "2" # e1 - e2 & 1 has more cols & equal rows } else if(ncol(e1) >= ncol(e2) & nrow(e1) == nrow(e2)) { case <- "3" # e1 - e2 & 2 has more cols & equal rows } else if(ncol(e1) < ncol(e2) & nrow(e1) == nrow(e2)) { case <- "4" # Fail } else { stop("Incorrect dims") } switch(case, "1" = for(i in 1:dim_row) { x[i,] <- e1[i,] + e2 }, "2" = for(i in 1:dim_row) { x[i,] <- e1 + e2[i,] }, "3" = for(i in 1:dim_col) { x[,i] <- e1[,i] + e2 }, "4" = for(i in 1:dim_col) { x[,i] <- e1 + e2[,i] }) x } # TESTING ------------------------------------------------------------------- # Now try it X %-% mu #> [,1] [,2] [,3] #> [1,] -0.9936073 0.5369861 0.2944924 #> [2,] 0.7758779 0.5609272 -1.1588792 #> [3,] 0.2177294 -1.0979133 0.8643868 # This is working like we want # Note that mu is a matrix, not a vector when we pass it to %-% # PAIRWISE DISTANCE --------------------------------------------------------- X <- matrix(c(1.2547115 , -0.48876422, -0.72856604, 0.71314144, -0.39458829, 1.24672356, 0.5546855 , -0.85967337, -1.44113557, -1.88939305, 2.3343547 , 0.87549444, -0.84078801, -0.0126508 , 0.65216383, -0.4186715 , 0.28402457, -0.30994525, -0.84642535, 0.67971748, 0.17122195, 0.24889469, 1.06096652, 0.15103784, -0.61650479, -0.55185275, 1.22362894, -1.13346561, -1.78699116, -0.80449837, -1.63571732, 0.96625735, -0.229193 , -1.04050556, 0.53947026, -0.38108634, -1.86069407, -0.65388136, -0.14660911, -1.31124108, -0.29728826, 0.03291303, 1.97648715, 0.70430292, 0.71027875, -2.73208985, 0.54960805, 0.56568121, -1.04819229, 0.33661375), nrow = 10, ncol = 5, byrow = TRUE) Y <- matrix(c(-1.96300547, 1.87600543, -0.45271843, -0.20048695, -0.18477972, -0.18735322, -2.27866683, -1.00575347, -0.05794885, -0.95420218), nrow = 2, ncol = 5, byrow = TRUE) # Broadcast subtraction and addition to do pairwise distance! matrix(rowSums(X^2)) %-% (2 * (X %*% t(Y))) %+% matrix(rowSums(Y^2), nrow = 1) #> [,1] [,2] #> [1,] 16.900666 6.2678828 #> [2,] 16.658775 12.8935885 #> [3,] 20.354682 18.9174214 #> [4,] 6.104346 10.3964331 #> [5,] 9.803614 10.9461899 #> [6,] 5.781411 15.4272445 #> [7,] 2.214892 16.4268127 #> [8,] 17.777676 0.4713851 #> [9,] 13.692418 17.6007708 #> [10,] 4.378414 19.5910181```

### alexpghayes commented Oct 18, 2017 • edited

 Just discovered this, don't know how I hadn't seen your comment yet -- super cool! FYI if you dig broadcasting, the whole point of `xtensor` is to provide the same Numpy interface and broadcasting niceness from within Rcpp. Ugh if only that played better with windows.

### kanishkamisra commented Jun 19, 2019

 So this is how `rray` came to be hmm 🤔