\newpage
It is not enough for an algorithm to be exact (i.e. demonstrably give the correct result). For an algorithm to be useful in practice, it must give the result in reasonable amount of time. The reasonable amount of time is the one that users will be willing to wait. It varies according to tasks, for instance :
- the time between to graphic updates on the screen (cf. frame rate)
- the time before getting bored/annoyed
- a coffee break
- a night / week-end
- …
The time will presumably but shorter than human lifespan, and most probably shorter than the expected time before the predicted end of the solar system.
Of course, the actual running time depends on the actual hardware used to run the program (e.g. CPU clock speed, number of cores,…). So we will have to come up with an abstract measure based on the number of “steps” (cf. instructions) of the algorithms.
While it is possible to just implement a program and time it, an abstract model allows us to avoid implementing an algorithm if it is possible to understand beforehand that it would be unacceptably slow. Furthermore, an abstract model allows us to predict/estimate the running time for various inputs and compare various implementation strategies.
Of course, the running time will often depend on the input.
The most obvious way in which a running time depends on input is the amount of data to be processed. Let us see a few examples : First, we will use a utility class to generate random data.
import java.util.Random;
public class RandomData {
public static int[] generate1d(int nbVals, int min, int max){
int[] res= new int[nbVals];
Random generator = new Random();
for(int i=0; i != nbVals; ++i){
res[i]= (int)((generator.nextLong()%((long)max-min))+min);
}
return res;
}
}
The random data generator in listing random-data-generator, takes
min
and max
arguments to restrict the interval of possible
values. When studying the algorithms, try to consider the impact (if
any !) of this interval on :
- the running time of the implementations
- the possibility for other, more specific, implementations
We’ll study the former in section quick-bad-sec and the later in section values-interval-sec.
Another way in which data can impact the running time is the order of values. We will study this in sections bubble-sorted-sec and quick-bad-sec.
public class DummySort {
public static void main(String[] args){}
}
echo -n "Testing ${classname} "
echo $(javac Tests.java RandomData.java ${classname}.java && seq 1 1000 |parallel "java ${classname}" 2>&1 1>/dev/null |wc -l) "errors found"
javac Tests.java RandomData.java ${classname}.java && (seq 1000 1000 500000 | parallel "java -Xss512M ${classname}" | sort -n)
1 | 2 |
return input
import numpy as np
from sklearn import linear_model
data=np.log(np.array(data))
regr= linear_model.LinearRegression()
regr.fit(data[:,0:1], data[:,1:2])
return regr.coef_[0]
The first algorithm that we will consider is simply finding the minimum element of an array of numerical values. How long does it take to find the minimum of such a sequence ?
public class Minimum {
public static void main(String [] args){
int nbVals= Integer.parseInt(args[0]);
int [] data= RandomData.generate1d(nbVals, Integer.MIN_VALUE, Integer.MAX_VALUE);
long start = System.nanoTime();
int min=minimum(data);
System.out.println(""+nbVals+"\t"+(System.nanoTime()-start));
}
public static int minimum(int[] data){
int res=data[0];
for(int i=1; i != data.length; ++i){
if(data[i] < res){
res= data[i];
}
}
return res;
}
}
We can see the experimental results of the running time measurements :
Can you estimate what is the relationship between the running time and the number of elements ?
Could you know it beforehand be reading the code ? You should focus first on the function minimumIndex()
and then consider :
- the number of times it is called from
selectionSort()
- the number of elements it has to process each time
Would the running time depend on the range of values in the data ?
What if the data were sorted ?
Of course, the type of data has no implication on the algorithmic
complexity, as long as the elementary operations can still be
considered … elementary.(One notes public static float
mean(float[] data)
?
We will now study more a more interesting and challenging task : sorting an array. In order to “move” values in a (fixed-size) array, one has to swap elements. This can be achieved by the following code :
public static void swap(int[] data, int i, int j){
int tmp= data[i];
data[i]= data[j];
data[j]= tmp;
}
What is the algorithmic complexity of this function ?
We will first benchmark our sorting implementations on random data :
public static void main(String [] args){
int nbVals= Integer.parseInt(args[0]);
int [] data= RandomData.generate1d(nbVals, Integer.MIN_VALUE, Integer.MAX_VALUE);
long start = System.nanoTime();
sort(data);
System.out.println(nbVals+"\t"+(System.nanoTime()-start));
if(!Tests.isSorted(data)){System.err.println("BUG!");
for(int i=0; i != data.length; ++i){
System.err.println(data[i]);
}
}
}
A check is performed so as to ascertain that our implementations are working as intended:
public class Tests {
public static boolean isSorted(int[] data){
if(data.length<2){return true;}
boolean res=true;
for(int i=1; (i != data.length) && res; ++i){
res= data[i-1] <= data[i];
}
return res;
}
}
Selection sort is a sorting algorithm based on the fact that when an
array is sorted (in ascending order), each element lesser or equal
than all the elements after it :
Running times are compounded (multiplied) in nested loops, as we can see in the following implementation selection sort functions. Could you guess the relationship between the running time and the number of elements to sort ?
public static int minimumIndex(int[] data, int begin, int end){
int res= begin;
for(int i=begin+1; i != end; ++i){
if(data[i] < data[res]){
res= i;
}
}
return res;
}
public static void sort(int[] data){
if(data.length < 2){return;}
for(int i=0; i != data.length-1; ++i){
swap(data, i, minimumIndex(data, i, data.length));
}
}
public class SelectionSort {
<<sort-main>>
<<swap>>
<<selection-sort-functions>>
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: selection-algo-test
Testing SelectionSort, 0 errors found
The running time of the selection sort implementation is shown below :
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: selection-exponent-estimate
1.98663308 |
Can you estimate what is the relationship between the running time and the number of elements ?
Could you know it beforehand be reading the code ?
Another way to approach sorting is to consider that in a sorted array, any two adjacent elements are in relative order. Bubble sort takes (all) adjacent pairs of elements and orders them. This process is repeated until all adjacent pairs are ordered.
Try to understand the relationship between the running time and the number on elements to sort with bubble sort, by reading the implementation.
public static void sort(int[] data){
if(data.length < 2){return;}
boolean hadToSwap= false;
do{
hadToSwap=false;
for(int i= 0; i != data.length-1; ++i){
if(data[i] > data[i+1]){
swap(data, i, i+1);
hadToSwap= true;
}
}
}while(hadToSwap);
}
public class BubbleSort {
<<sort-main>>
<<swap>>
<<bubble-sort-functions>>
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]:
Testing BubbleSort 0 errors found
The running time of the bubble sort implementation is shown below :
Can you estimate what is the relationship between the running time and the number of elements ?
Could you know it beforehand be reading the code ?
An interesting variation of the bubble sort is the comb sort. Compare its following implementation with the bubble sort in Listing bubble-sort-functions.
public class CombSort{
<<sort-main>>
<<swap>>
public static void sort(int[] data){
if(data.length < 2) {return;}
boolean hadToSwap= false;
float gapFactor=1.3f;
int gap=data.length;
do{
hadToSwap= false;
gap= (gap > gapFactor) ? (int)(gap / gapFactor) : 1;
for(int i=0; (i+gap) < data.length; ++i){
if(data[i] > data[i+gap]){
swap(data, i, i+gap);
hadToSwap= true;
}
}
}while((gap > 1) || hadToSwap);
}
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]:
Testing CombSort 0 errors found
Can you estimate what is the relationship between the running time and the number of elements ?
Could you know it beforehand be reading the code ?
Yet another way to sort elements is to sort recursively halve and merge the sorted halves. Of course, array intervals of 0 or 1 element are trivially “sorted” by not doing anything.
Let us first focus on the merging algorithm.
Given the following implementation to merge to sorted consecutive
intervals $\left[ begin, middle \right[$ and
public static void mergeSorted(int data[], int begin, int middle, int end){
int[] tmp= new int[middle-begin];
System.arraycopy(data, begin, tmp, 0, tmp.length);
int i=0, j=middle, dest=begin;
while((i< tmp.length) && (j<end)){
data[dest++]= (tmp[i] < data[j]) ? tmp[i++] : data[j++] ;
}
while(i < tmp.length){
data[dest++]= tmp[i++];
}
}
What is the algorithmic complexity of this algorithm ? How much memory does it use ?
Given the following implementation of a merge sort, can you tell the relationship between the running time and the number of elements to sort ?
public class MergeSort {
<<sort-main>>
<<merge-sorted>>
public static void sort(int[] data){
sort(data, 0, data.length);
}
public static void sort(int[] data, int begin, int end){
if((end-begin) < 2){return;}
int middle= (end+begin)/2;
sort(data, begin, middle);
sort(data, middle, end);
mergeSorted(data, begin, middle, end);
}
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: merge-algo-test
Testing MergeSort 0 errors found
Can you find out the relationship between the running time of the merge sort algorithm and the number of elements to sort ? Try by studying the data and by studying the code.
Another way to “divide and conquer” for a sorting algorithm is to pre-process the data before dividing so as to make the merging trivial. Quicksort does so by partitioning the data according to a pivot element. This enables in-place sorting without any external memory.
- Convergence
- we want to be sure to reduce the intervals when doing the recursive calls after the partition around the pivot. To ensure that, we make sure that the pivot is processed and we will recurse excluding this value. To enable that, we have to move the pivot value at the beginning of the second interval (so just after the first one). We will then be able to skip this value in the recursive call (cf. line (quick-recur) in listing quick-sort-functions)
public static int partition(int[] data, int begin, int end, int pivotIdx){
swap(data, pivotIdx, --end);
pivotIdx= end;
int pivot= data[pivotIdx];
//invariant is that everything before begin is known to be < pivot
// and everything after end is known to be >= pivot
while(begin != end){
if(data[begin] >= pivot){
swap(data, begin, --end);
}else{
++begin;
}
}
swap(data, pivotIdx, begin);
return begin;
}
<<pivoting>>
public static void sort(int[] data){
sort(data, 0, data.length);
}
public static void sort(int[] data, int begin, int end){
if((end-begin) < 2){ return; }
int m= partition(data, begin, end, (end+begin)/2);
sort(data, begin, m);
sort(data, m+1, end); // +1 for convergence (ref:quick-recur)
}
public class QuickSort {
<<sort-main>>
<<swap>>
<<quick-sort-functions>>
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: quick-algo-test
Testing QuickSort 0 errors found
The running time of the quicksort implementation is shown below :
Can you estimate what is the relationship between the running time and the number of elements ?
Could you know it beforehand be reading the code ?
One can rewrite implementations to eliminate recursions. It is trivial in the case of a tail call as there is no local execution context to save (on a stack). We can study if such a rewrite changes anything with regards to algorithmic complexity.
public static void sort(int[] data){
sort(data, 0, data.length);
}
public static void sort(int[] data, int begin, int end){
while((end-begin) > 1){
int m= partition(data, begin, end, (begin+end)/2);
sort(data, begin, m);
begin= m+1;
}
}
public class QuickSortTCO {
<<sort-main>>
<<swap>>
<<pivoting>>
<<quick-tco>>
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: quicktco-algo-test
Testing QuickSortTCO 0 errors found
The running times of the quicksort implementations, with and without tail call optimization, are shown in Figure quick-vs-tco-plot-label.
We can compare various sorting algorithms by plotting their running time on the same figure :
Could this be expected ? What can you conclude from those comparisons.
We focused of the amount of data to process, but the actual data to process can also have an impact on the running times.
First, we will study how bubble sort behaves on sorted data. As an
aside, note the use of the standard library function
java.util.Arrays.sort
. Understanding sorting algorithms does not
mean that we should ignore the available implementations [fn:: quite the opposite, you should understand them!].
public static void main(String [] args){
int nbVals= Integer.parseInt(args[0]);
int [] data= RandomData.generate1d(nbVals, Integer.MIN_VALUE, Integer.MAX_VALUE);
java.util.Arrays.sort(data);
long start = System.nanoTime();
sort(data);
System.out.println(nbVals+"\t"+(System.nanoTime()-start));
}
public class BubbleSorted {
<<sorted-main>>
<<swap>>
<<bubble-sort-functions>>
}
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]:
Testing BubbleSorted 0 errors found
The running time of the bubble sort of already sorted data is shown in figure bubble-algo-sorted-timing-plot-label :
We compare the running times of the bubble sort of random and already sorted data in figure bubble-algo-random-vs-sorted-timing-plot-label :
What is the algorithmic complexity when the data was already sorted ? Could this be expected ? What are the consequences (if any)?
We now study the behavior of quicksort algorithm on “bad” data, an
example being all the values in the array left to the default value (0
for int
).
public static void main(String [] args){
int nbVals= Integer.parseInt(args[0]);
int [] data= bad(nbVals);
long start = System.nanoTime();
sort(data);
System.out.println(nbVals+"\t"+(System.nanoTime()-start));
}
public static int[] bad(int nb){
return new int[nb];
}
public class QuickSortBad {
<<quicksort-bad-data>>
<<swap>>
<<pivoting>>
<<quick-tco>>
}
What about the memory consumption[fn:: Watch out for stack overflow, even with tco ! -Xss512M
option to set stack size to 512 Mo.] ?
#+RESULTS[f2b650eb5296f72a1f7237c2a65b7fb3443acf5f]: quick-algo-bad-test
Testing QuickSortBad 0 errors found
What is the relationship between the running time of the quicksort algorithm and the number of elements to sort ? What is the memory consumption of quicksort, for random and for “bad” data ?
Why did the “bad” data make quicksort behave that way ? Is it only related to the values or could “bad” data be created with random values ?
Could there be some data making quicksort have an even worst performance ? (Otherwise, one calls is the worst case algorithmic complexity).
What if the data was made of a potentially very large number of
elements, but those elements could only take a limited number of
values (
In order to anticipate the running time of an algorithm, one can (try to) figure out its algorithmic complexity. It is the dominant factor of the number of elementary steps performed to process data, as a function of the relevant parameters, often the number of elements to process.
This algorithmic complexity might depend on the actual data to be processed. One can be interested in average performance, or worst case performance. While one cannot be so optimistic as to rely on best case performance, some specific data arrangement (e.g. sorted) can enable more efficient algorithms.
Which sorting algorithm would you choose if you had to sort dozens of elements ?
Searching if some specific value is part of a collection is easy but slow on unsorted data, as implemented in listing linear-search.
int indexOf(int[] data, int v){
int res= -1; // not found
for(int i=0; (i != data.length) && (res == -1); ++i){
if(data[i] == v){
res= i;
}
}
return res;
}
What is the algorithmic complexity of this search ?
It is possible to search efficiently a sorted array by dichotomy. One searches a collection by comparing the searched value with the median. Whether the median is lower or greater than the searched value, this one comparison allows to discard half (respectively the upper or lower half) of the collection, by transitivity of the ordering relationship. As seen in section sec-tco, it is easy to remove the recursive call, as it is in tail position. Does the TCO affect :
- the algorithmic complexity ?
- the performance ?
- the memory complexity ?
public class Dichotomy {
public static void main(String[] args){
int nbVals= Integer.parseInt(args[0]);
int[] data= new int[nbVals];
for(int i=0; i != data.length; ++i){
data[i]= 2*i;
}
System.out.println(lowerBound(data, Integer.parseInt(args[1])));
}
public static int indexOfOrdered(int[] data, int v){
int res=lowerBound(data, v);
if((res==data.length) || (data[res] != v)){
res= -1;
}
return res;
}
// index of first element >= v
public static int lowerBound(int[] data, int v){
return lowerBound(data, v, 0, data.length);
}
public static int lowerBound(int[] data, int v, int begin, int end){
if(begin == end){ return begin;}
int m= (begin + end)/2;
return data[m] < v ? lowerBound(data, v, m+1, end) : lowerBound(data, v, begin, m);
}
public static int lowerBoundTCO(int[] data, int v){
int begin=0, end=data.length;
while(begin != end){
int m= (begin + end)/2;
if(data[m] < v){
begin= m+1;
}else{
end= m;
}
}
return begin;
}
}
What is the algorithmic complexity of the dichotomic search ? (Hint : how many more elements can one process by having one more comparison ?)
It is also possible to index the data in order to improve access time. Explain the trade-offs according to the index size.
If the index structure is independent of the actual data (i.e. based on digits, most significant or least significant), it can be built without pre-scanning and could be easily updated. However, it could handle skewed very poorly : imagine indexing numbers according to the most significant byte or prefix sharing strings (e.g. “http://”) according to the first byte. This will be overcome by hashing (cf. infra).
As we have seen, it is very interesting to be able to work on sorted data. However, data is often inserted and/or removed during the execution of programs. Consider inserting and removing data in/from a sorted array implemented in listing insert-array.
int[] insert(int[] array, int v){
int [] res= new int[array.length+1];
int src=0, dest=0;
while(array[src] < v){
res[dest++]= array[src++];
}
res[dest++]= v;
while(src != array.length){
res[dest++]= array[src++];
}
return res;
}
In practice, one can avoid reallocating each time an element is added (cf. infra vectors), but one cannot avoid moving all the elements that are greater than the inserted value, if one is to keep contiguous memory layout. What is the algorithmic complexity of an insertion into an array ?
In order to overcome this restriction and more, we’ll explore other ways to store data.
We saw in section mean-algo an function public static float mean(float[] data)
to compute the mean of a whole array. We would now like to compute the
mean of a sliding window of w
elements (e.g. for a mean filter)
public static float[] meanFilter(float[] data, int w)
. Give an
implementation of meanFilter
and explain the algorithmic
complexity. Can you do better than data.length
?
In fact, when processing data, one is usually interested not only in a
model, but also it its accuracy. For each windows, we not only want
the mean, but also the Sum of Squares Errors, $SSE= \frac{1}{2}
∑i+wj=i( data[j]-\overline{datai … i+w})^2$ where
$\overline{datai … i+w}$ is the mean of
Give an implementation focusing on algorithmic complexity (i.e. not naive).
Given
As we will see later, it is sometimes possible to have sub-optimal (or just an estimate of the) answer with a much more efficient algorithm. Consider the task of just finding the bounding box (i.e. rectangle) of the points, and using it to estimate the desired distance (the exact answer would have been given by the bounding circle).