@phabee
Created July 18, 2018 06:12
Compare Growing Object Performance R vs. Java
# Build a random tour: `nrows` random stops framed by a depot at (0, 0).
generate_random_tour <- function(nrows) {
  x <- c(0, runif(nrows, min = 0, max = 100), 0)
  y <- c(0, runif(nrows, min = 0, max = 100), 0)
  loc <- c("Depot", floor(runif(nrows, 1000, 9999)), "Depot")
  return(data.frame(x = x, y = y, loc = loc, stringsAsFactors = FALSE))
}

# Append the rows of `stop` to `tour` and return the grown tour.
append_stop <- function(tour, stop) {
  # return (rbind(tour, stop))
  # use data.table::rbindlist instead of rbind, reason see here:
  # https://stackoverflow.com/questions/15673550/why-is-rbindlist-better-than-rbind
  return(data.table::rbindlist(list(tour, stop)))
}

# Repeat `nloop` times: create a tour with `initsize` stops, then grow it 150 times.
run_test <- function(nloop, initsize) {
  for (j in 1:nloop) {
    tour <- generate_random_tour(initsize)
    for (i in 1:150) {
      # Note: generate_random_tour(1) returns three rows (depot, stop, depot),
      # and all three are appended.
      stop <- generate_random_tour(1)
      tour <- append_stop(tour, stop)
    }
  }
}

system.time(run_test(1000, 1))
package ch.ims.tests;

import java.util.List;

import ch.ims.core.TourGenerator;
import ch.ims.model.TourStop;

public class GrowingTour {

    // Repeat nLoop times: create a tour with initSize stops, then grow it 150 times.
    public static void runTest(int nLoop, int initSize) {
        for (int i = 0; i < nLoop; i++) {
            List<TourStop> tour = TourGenerator.generateRandomTour(initSize);
            for (int j = 0; j < 150; j++) {
                List<TourStop> stop = TourGenerator.generateRandomTour(1);
                // Note: get(0) is the leading depot entry of the generated
                // one-stop tour, so one element is appended per iteration.
                tour.add(stop.get(0));
            }
        }
    }
}
package ch.ims.core;

import ch.ims.tests.GrowingTour;

public class TLOptPerfTestRunner {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        GrowingTour.runTest(10000, 1);
        long endTime = System.nanoTime();
        System.out.println("Total runtime: " + (endTime - startTime) / 1E9);
    }
}

/* Example 1000, 1
 * java: 0.04538065 s
 * r: 51.31 s (rbind)
 * r: 50.03 s (rbindlist)
 * java speedup: 1100 x
 *
 * Example 10000, 1
 * java: 0.316938847 s
 * r: 559.79 s (rbind)
 * r: 535.14 s (rbindlist)
 * java speedup: 1700 x
 */
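The speedup figures in the comment above can be rechecked from the reported runtimes. The following is a minimal sketch, not part of the original gist; the class name `SpeedupCheck` and the helper `speedup` are hypothetical. It assumes the rounded "1100 x" and "1700 x" figures were derived from the rbindlist timings, which the source does not state explicitly.

```java
// Hypothetical helper, not part of the gist: recomputes the speedup
// ratios from the runtimes reported in the results comment.
class SpeedupCheck {

    // Ratio of the R runtime to the Java runtime (both in seconds).
    static double speedup(double rSeconds, double javaSeconds) {
        return rSeconds / javaSeconds;
    }

    public static void main(String[] args) {
        // Timings copied from the results comment (rbindlist variant).
        System.out.printf("nloop = 1000:  %.0f x%n", speedup(50.03, 0.04538065));
        System.out.printf("nloop = 10000: %.0f x%n", speedup(535.14, 0.316938847));
    }
}
```

Both ratios land close to the rounded values quoted in the comment. Using the rbind timings instead gives slightly higher ratios.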
package ch.ims.core;

import java.util.ArrayList;
import java.util.List;

import ch.ims.model.TourStop;

public class TourGenerator {

    // Build a random tour: numStops random stops framed by a depot at (0, 0).
    public static List<TourStop> generateRandomTour(int numStops) {
        List<TourStop> retVal = new ArrayList<TourStop>();
        retVal.add(new TourStop(0, 0, "Depot"));
        for (int i = 0; i < numStops; i++) {
            // Coordinates in [0, 100), location label drawn from 1000..9998.
            retVal.add(new TourStop((float) Math.random() * 100, (float) Math.random() * 100,
                    String.valueOf((int) (1000 + Math.random() * 8999))));
        }
        retVal.add(new TourStop(0, 0, "Depot"));
        return retVal;
    }
}
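The gist imports `ch.ims.model.TourStop` but does not include that class. To compile the Java side, something like the following sketch could stand in for it. This is an assumption, not the author's class: only the constructor shape (two numeric coordinates and a label) is implied by the calls in `TourGenerator`. The field names and getters are hypothetical, chosen to mirror the `x`, `y` and `loc` columns of the R data frame.

```java
// Hypothetical stand-in for the missing ch.ims.model.TourStop class.
// To use it with the gist's imports, declare it public and place it in
// package ch.ims.model; both are omitted here to keep the sketch standalone.
class TourStop {

    private final float x;     // x coordinate of the stop
    private final float y;     // y coordinate of the stop
    private final String loc;  // location label, e.g. "Depot" or "4711"

    // Matches the calls new TourStop(0, 0, "Depot") and
    // new TourStop(float, float, String) made in TourGenerator.
    TourStop(float x, float y, String loc) {
        this.x = x;
        this.y = y;
        this.loc = loc;
    }

    float getX() {
        return x;
    }

    float getY() {
        return y;
    }

    String getLoc() {
        return loc;
    }
}
```

An immutable value class is enough here, because the benchmark only creates stops and appends them to a list.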

phabee commented Jul 18, 2018

These code samples demonstrate the memory-management issue the R programming language has when dealing with growing objects. As we all know, rbind performs badly, but even the high-performance rbindlist from data.table (see https://stackoverflow.com/questions/15673550/why-is-rbindlist-better-than-rbind for more details) doesn't change the fact that R cannot handle growing data structures efficiently: in the measurements in the results comment, rbindlist was only a few percent faster than rbind (50.03 s vs. 51.31 s for 1000 outer loops), while Java was more than 1000 times faster. All 'solutions' we've come across so far suggest initializing the data structure once, ideally by constructing the columns separately. That is not the task we need to solve and therefore not an option for us. :(
