While doing research work with Prof. Kishore Kothapalli, and Prof. Dip Sankar Banerjee.
Abstract — The effect on PageRank computation time for different datatypes of the rank vector and the CSR representation of graphs is studied here. This is done on both a CPU, as well as a GPU. Results indicate that using a wider datatype for the CSR representation has a smaller impact to performance compared to using a wider datatype for the rank vector. This is despite the fact that the CSR representation has a much larger memory footprint, and thus higher memory access requirement, than the rank vector. This could be explained from the fact that the PageRank algorithm accesses the CSR representation in a linear-like fashion, which improves the cache hit ratio for a CPU, and improves memory coalescing for a GPU. On the other hand, it accesses the rank vector in random order. When a wider datatype is used, a higher memory bandwidth is required, and if memory accesses are random, cache memory can get occupied faster and require faster evictions, slowing down performance. In all cases however, the impact on performance for a wider datatype is higher for GPU, compared to CPU. This is likely because of architectural differences between the two devices, with the GPU having a wider memory bus than the CPU. (🎞️ slides)
Index terms — PageRank algorithm, Datatype adjustment, Rank vector, CSR representation, Memory access pattern.
Rank vector | CSR representation |
---|---|
Sequential | Sequential |
CUDA | CUDA |
nvGraph |
- Performance of PageRank using 32-bit floats vs 64-bit floats for the rank vector.
- Performance of CUDA-based PageRank using 32-bit floats vs 64-bit floats for the rank vector.
- Performance of nvGraph PageRank using 32-bit floats vs 64-bit floats for the rank vector.
- Performance of PageRank using 32-bit ints vs 64-bit ints for the CSR representation.
- Performance of CUDA-based PageRank using 32-bit ints vs 64-bit ints for the CSR representation.