Subhajit Sahu wolfram77

## thesis-accelerating-pagerank-with-a-heterogeneous-two-phase-cpu-fpga-algorithm.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                wolfram77
                / thesis-accelerating-pagerank-with-a-heterogeneous-two-phase-cpu-fpga-algorithm.md
            
            
              Created
              March 31, 2024 14:20
            
              
                Accelerating Pagerank With a Heterogeneous two Phase CPU-FPGA Algorithm : THESIS
              
          
    I think I wrote a summary of this thesis, but I cannot find it anymore.

  
## script-graph-mtx-add-self-loop.js
const fs = require('fs');
const path = require('path');
const readline = require('readline');


// Read the header of a graph file.
async function readGraphHeader(pth) {
  var fstream = fs.createReadStream(pth);

## script-create-gist.js
const cp   = require('child_process');
const fs   = require('fs');
const path = require('path');
const _ = require('lodash');


// Main function.
function main() {
  for (var file of fs.readdirSync('./')) {

## config-htoprc-3.0.5.ini
# Beware! This file is rewritten by htop when settings are changed in the interface.
# The parser is also very primitive, and not human-friendly.
fields=0 48 17 18 38 39 40 2 46 47 49 1
sort_key=46
sort_direction=-1
tree_sort_key=0
tree_sort_direction=1
hide_kernel_threads=1
hide_userland_threads=0
shadow_other_users=0

## script-convert-latex-to-markdown.js
const os = require('os');
const fs = require('fs');


// Read a file.
function readFile(pth) {
  var txt = fs.readFileSync(pth, 'utf8');
  return txt.replace(/\r\n?/g, '\n');

## output-pagerank-barrierfree-openmp-dynamic--strong-scaling-sharanga.log
2023-07-30 08:51:17 OMP_NUM_THREADS=40
2023-07-30 08:51:17 Loading graph /home/hema/Data/indochina-2004.mtx ...
2023-07-30 08:51:37 order: 7414866 size: 194109311 [directed] {}
2023-07-30 08:51:42 order: 7414866 size: 199021693 [directed] {} (selfLoopAllVertices)
2023-07-30 08:51:48 order: 7414866 size: 199021693 [directed] {} (transposeWithDegree)
{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0051604.6/0051604.6ms, 074 iter, 4.73e-10 err, 000 crashed] pagerankBasicOmp
{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0048078.0/0048078.0ms, 070 iter, 5.30e-10 err, 000 crashed] pagerankBasicNaiveDynamicOmp
{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0053318.4/0053318.4ms, 070 iter, 5.29e-10 err, 000 crashed] pagerankBasicDynamicFrontierOmp
{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0066079.3/0066079.3ms, 093 iter, 5.15e-10 err, 000 crashed] pagerankBarrierfreeOmp

## notes-sling-a-near-optimal-index-structure-for-simrank.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                wolfram77
                / notes-sling-a-near-optimal-index-structure-for-simrank.md
            
            
              Last active
              September 27, 2023 14:19
            
              
                SLING: A Near-Optimal Index Structure for SimRank : NOTES
              
          
    In the paper "SLING: A Near-Optimal Index Structure for SimRank", Tian and Xiao present SimRank via Local updates and sampling (SLING), an efficient index structure which guarentees at most ε additive error, and answers single pair and single source SimRank queries in O(1/ε) and O(n/ε) time - which is near optimal. Further, SLING requires only O(n/ε) space and O(m/ε + n log ...) pre-computation time.
Tian and Xiao use √c reverse random walks, where at each step, we inspect the in-neighbors of current node and select one of them uniformly at random with a probability of √c, or stop with a probability of 1 - √c. Each √c walk has an expected length of 1/(1-√c). They then use hitting probabilities hˡ(i,k) and hˡ(j,k) to compute SimRank score s(i,j) with dₖ - a correction factor which represents the probability that two √c walks do not meet each other after the 0th step. They propose to pre-compute approximate versions of dₖ and hˡ(i,k) as the index, and use them upon query to es

  
## notes-efficient-top-k-simrank-based-similarity-join.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                wolfram77
                / notes-efficient-top-k-simrank-based-similarity-join.md
            
            
              Created
              September 26, 2023 18:10
            
              
                Efficient Top-K SimRank-based Similarity Join : NOTES
              
          
    In the poster "Efficient Top-K SimRank-based Similarity Join", Wenbo Tao presents an algorithm to find k pairs of nodes with the largest SimRank values, using a random-walk based method. Tao limits random walks to five steps without loos of too much accuracy, and computes summation of one-way paths sim(a, x, l) using dynamic programming, where a is the source node, x is the target node, and l is the length of the path. Top-k SimRank-based similarity join problem can then be solved by identifying k pairs of vectors with the largest inner product values. This can be solved with the matrix compression method by Low et al. in O(nD^ξ), where ξ is the number of iterations. Their results show good performance improvement over SRJ Query by Zheng et al., which requires index building and selection of a threshold value.
Critical review

While Tao explores the problem of finding top-k SimRank values across the entire graph, Tao does not consider the problem of finding top-k SimRank values within

  
## notes-accelerating-pairwise-simrank-estimation-over-static-and-dynamic-graphs.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                wolfram77
                / notes-accelerating-pairwise-simrank-estimation-over-static-and-dynamic-graphs.md
            
            
              Created
              September 26, 2023 13:29
            
              
                Accelerating pairwise SimRank estimation over static and dynamic graphs : NOTES
              
          
    In the paper "Accelerating pairwise SimRank estimation over static and dynamic graphs", Wang et al. motivate that sometimes people may issue a set of similarity queries, which may not be restricted to a particular source. For example, people may want to know "Which two scientists in the DB area, who have not collaborated with each other before, would likely collaborate in the future?", or "Which scientists in DB, who have collaborated with AI  researchers before, would keep on collaborating with  other AI scientists?"
Wang et al. note that iterative deterministic methods have two drwabacks. One is its BFS fixed-order expansion, which is not optimized for the local structure of query nodes. It may also contain unnecessary computation, which contributes little to the final SimRank score. Second, as k increases, the number of k-hop neighbors grows exponentially.
Most recent solutions for dynamic SimRank query are based on its probabilistic interpretation: s(a,b) is the expected f-meeting time of two revers

  
## notes-a-novel-and-fast-simrank-algorithm.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                wolfram77
                / notes-a-novel-and-fast-simrank-algorithm.md
            
            
              Last active
              September 25, 2023 16:51
            
              
                A Novel and Fast SimRank Algorithm : NOTES
              
          
    In the paper "A Novel and Fast SimRank Algorithm", Lu et al. avoid the approximation of SimRank with (1-c)I, or use of diagonal matrix to compute exact SimRank (which is costly to compute). Instead they use conjugate gradient algorithm along with jacobi preconditioner - after splitting matrices into "same" (S) and "disjoint" (D) components:

Preconditioned conjugate gradient method for undirected graph, and
Preconditioned bi-conjugate gradient stabilized method for directed graph

These methods speed up the convergence rate by finding the optimal search direction in each iteration. Further, they use common-neighbor sharing technique to optimize computation.
Lu et al. compare the algorithms with Optimal In-neighbors Partitioning (OIP), Successive Over-Relaxation (SOR), and Scalable SimRank Join (SSJ, stochastic) algorithms. They use a decay factor of c = 0.8 (although 0.6 is more common in literature), and an error bound of ε = 0.001. In SOR, they select different optimal w for each dat
	const fs = require('fs');
	const path = require('path');
	const readline = require('readline');




	// Read the header of a graph file.
	async function readGraphHeader(pth) {
	var fstream = fs.createReadStream(pth);
	const cp = require('child_process');
	const fs = require('fs');
	const path = require('path');
	const _ = require('lodash');



	// Main function.
	function main() {
	for (var file of fs.readdirSync('./')) {
	# Beware! This file is rewritten by htop when settings are changed in the interface.
	# The parser is also very primitive, and not human-friendly.
	fields=0 48 17 18 38 39 40 2 46 47 49 1
	sort_key=46
	sort_direction=-1
	tree_sort_key=0
	tree_sort_direction=1
	hide_kernel_threads=1
	hide_userland_threads=0
	shadow_other_users=0
	const os = require('os');
	const fs = require('fs');




	// Read a file.
	function readFile(pth) {
	var txt = fs.readFileSync(pth, 'utf8');
	return txt.replace(/\r\n?/g, '\n');
	2023-07-30 08:51:17 OMP_NUM_THREADS=40
	2023-07-30 08:51:17 Loading graph /home/hema/Data/indochina-2004.mtx ...
	2023-07-30 08:51:37 order: 7414866 size: 194109311 [directed] {}
	2023-07-30 08:51:42 order: 7414866 size: 199021693 [directed] {} (selfLoopAllVertices)
	2023-07-30 08:51:48 order: 7414866 size: 199021693 [directed] {} (transposeWithDegree)
	{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0051604.6/0051604.6ms, 074 iter, 4.73e-10 err, 000 crashed] pagerankBasicOmp
	{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0048078.0/0048078.0ms, 070 iter, 5.30e-10 err, 000 crashed] pagerankBasicNaiveDynamicOmp
	{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0053318.4/0053318.4ms, 070 iter, 5.29e-10 err, 000 crashed] pagerankBasicDynamicFrontierOmp
	{-5.000e-05/+5.000e-05 batchf, 000/001 threads 0000ms @ 0.00e+00 none failure} -> {0066079.3/0066079.3ms, 093 iter, 5.15e-10 err, 000 crashed] pagerankBarrierfreeOmp