In the poster "Efficient Top-K SimRank-based Similarity Join", Wenbo Tao presents an algorithm to find k
pairs of nodes with the largest SimRank values, using a random-walk based method. Tao limits random walks to five steps without loos of too much accuracy, and computes summation of one-way paths sim(a, x, l)
using dynamic programming, where a
is the source node, x
is the target node, and l
is the length of the path. Top-k SimRank-based similarity join problem can then be solved by identifying k
pairs of vectors with the largest inner product values. This can be solved with the matrix compression method by Low et al. in O(nD^ξ)
, where ξ
is the number of iterations. Their results show good performance improvement over SRJ Query by Zheng et al., which requires index building and selection of a threshold value.
While Tao explores the problem of finding top-k SimRank values across the entire graph, Tao does not consider the problem of finding top-k SimRank values within a subset of nodes, say within a community or a set of communities. Further, Tao's solution only applies to static graphs, and utilizes the original SimRank formulation which can be unintuitive in a number of network structures.