
wolfram77 / notes-large-scale-graph-label-propagation-on-gpus.md
Last active March 9, 2025 09:13
Large-Scale Graph Label Propagation on GPUs; Ye et al. (2023) : NOTES

My highlighted notes for the following paper:

Ye, C., Li, Y., He, B., Li, Z., & Sun, J. (2023). Large-scale graph label propagation on gpus. IEEE Transactions on Knowledge and Data Engineering, 36(10), 5234-5248.

wolfram77 / notes-gpu-accelerated-graph-clustering-via-parallel-label-propagation.md
Created March 9, 2025 07:35
GPU-Accelerated Graph Clustering via Parallel Label Propagation; Kozawa et al. (2017) : NOTES

My highlighted notes for the following paper:

Kozawa, Y., Amagasa, T., & Kitagawa, H. (2017, November). GPU-accelerated graph clustering via parallel label propagation. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 567-576).

wolfram77 / notes-fast-community-detection-algorithm-with-gpus-and-multicore-architectures.md
Created March 9, 2025 05:04
Fast Community Detection Algorithm With GPUs and Multicore Architectures; Soman and Narang (2011) : NOTES

My highlighted notes for the following paper:

Soman, J., & Narang, A. (2011, May). Fast community detection algorithm with gpus and multicore architectures. In 2011 IEEE International Parallel & Distributed Processing Symposium (pp. 568-579). IEEE.

wolfram77 / notes-advances-in-inverse-lithography.md
Last active March 5, 2025 12:48
Advances in Inverse Lithography (2022) : NOTES

My highlighted notes for the following paper:

Cecil, T., Peng, D., Abrams, D., Osher, S. J., & Yablonovitch, E. (2022). Advances in inverse lithography. ACS Photonics, 10(4), 910-918.

wolfram77 / notes-low-latency-graph-streaming-using-compressed-purely-functional-trees.md
Last active February 25, 2025 03:44
Low-Latency Graph Streaming using Compressed Purely-Functional Trees : NOTES

My highlighted notes are below.

wolfram77 / notes-interface-for-sparse-linear-algebra-operations.md
Last active February 21, 2025 09:02
Interface for Sparse Linear Algebra Operations : NOTES

My highlighted notes for the following paper:

Abdelfattah A, Ahrens W, Anzt H, Armstrong C, Brock B, Buluc A, Busato F, Cojean T, Davis T, Demmel J, Dinh G. Interface for Sparse Linear Algebra Operations. arXiv preprint arXiv:2411.13259. 2024 Nov 20.

wolfram77 / output-pagerank-levelwise-multi-dynamic--8020.log
Last active January 26, 2025 06:34
Comparison of OpenMP- and CUDA-based, Monolithic and Levelwise Dynamic PageRank algorithms : OUTPUT
Loading graph /home/subhajit.sahu/Data/indochina-2004.mtx ...
order: 7414866 size: 194109311 {}
# Batch size 1e-07
- batch update size: 20
- components: 1749035
- blockgraph-levels: 524
- affected-vertices: 7220621
- affected-components: 1721204
order: 7414866 size: 195418449 {} [27803.355 ms; 000 iters.] [0.0000e+00 err.] pagerankMonolithicOmpSplit (static)
wolfram77 / notes-hydetect-a-hybrid-cpu-gpu-algorithm-for-community-detection.md
Last active January 16, 2025 09:15
HyDetect: A Hybrid CPU-GPU Algorithm for Community Detection; Bhowmick and Vadhiyar (2019) : NOTES

HyDetect: A Hybrid CPU-GPU Algorithm for Community Detection; Bhowmick and Vadhiyar (2019)

  1. The graph is partitioned between the CPU and the GPU.
  2. Louvain is performed independently on each device to obtain pseudo-communities.
  3. Doubtful vertices are identified, i.e., vertices that may not belong to the communities formed on their own device.
  4. Doubtful vertices are exchanged between the devices.
  5. Louvain is executed again on each device's subgraph, which includes the communities formed earlier plus the received doubtful vertices.
  6. This yields new communities and a new set of doubtful vertices.
  7. Doubtful vertices are exchanged again.
  8. The graph is coarsened to form a reduced graph of new vertices (a rough sketch of this workflow follows the list).
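Below is a minimal, runnable toy sketch of my reading of steps 1-4, using networkx's Louvain as a stand-in for the per-device community detection; the naive 50/50 partition and the neighbour-majority test for "doubtful" vertices are my own illustrative assumptions, not the authors' actual criteria or implementation.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

G = nx.karate_club_graph()
nodes = list(G.nodes())
half = len(nodes) // 2
# step 1: partition the graph across the two "devices" (naive 50/50 split here)
parts = [set(nodes[:half]), set(nodes[half:])]

for _ in range(2):  # a couple of refinement passes
    # step 2: run Louvain independently on each partition's induced subgraph
    comms = [louvain_communities(G.subgraph(p), seed=1) for p in parts]
    # step 3: mark a vertex "doubtful" if most of its neighbours live on the
    # other device (a stand-in heuristic, not the paper's actual test)
    doubtful = [{v for v in p if sum(u not in p for u in G[v]) > G.degree(v) / 2}
                for p in parts]
    # step 4: exchange doubtful vertices between the two devices
    parts = [(parts[0] - doubtful[0]) | doubtful[1],
             (parts[1] - doubtful[1]) | doubtful[0]]
    # steps 5-8 (rerun Louvain, exchange again, coarsen into a reduced graph)
    # would follow here in the actual algorithm.

print("communities per device:", [len(c) for c in comms])
```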
wolfram77 / notes-flexgen-high-throughput-generative-inference-of-large-language-models-with-a-single-gpu.md
Last active January 16, 2025 09:34
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU : NOTES

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU; Sheng et al. (2023)

  1. Motivated by latency-insensitive tasks and the heavy dependence of existing inference systems on high-end accelerators.
  2. FlexGen can be run on a single commodity system with a CPU, a GPU, and disk.
  3. Solves a linear programming (LP) problem, searching for efficient patterns to store and access tensors.
  4. Compresses weights and the attention cache to 4 bits with minimal accuracy loss, using fine-grained group-wise quantization (see the sketch after this list).
  5. These techniques give FlexGen a larger space of batch size choices and significantly improve its throughput.
  6. Running OPT-175B on a 16GB GPU, FlexGen achieves 1 token/s throughput for the first time.
  7. Runs HELM benchmark with a 30B model in 21 hours.
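Below is a small numpy sketch of what fine-grained group-wise 4-bit quantization (point 4) generally looks like: each group of values gets its own min/max scale, and values are rounded to 4-bit integers. The group size of 64 and the min/max scheme are illustrative assumptions, not FlexGen's exact kernels.

```python
import numpy as np

def quantize_groupwise(x, bits=4, group=64):
    """Quantize a 1-D float array to `bits` bits, with one scale/offset per group."""
    x = x.reshape(-1, group)
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    scale = (hi - lo) / (2**bits - 1)
    scale = np.where(scale == 0, 1.0, scale)          # guard against constant groups
    q = np.round((x - lo) / scale).astype(np.uint8)   # integers in [0, 15] for 4 bits
    return q, scale, lo

def dequantize_groupwise(q, scale, lo):
    return (q * scale + lo).reshape(-1)

w = np.random.randn(1024).astype(np.float32)          # stand-in for a weight tensor
q, s, z = quantize_groupwise(w)
w_hat = dequantize_groupwise(q, s, z)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```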
wolfram77 / notes-introducing-drift-search-combining-global-and-local-search-methods-to-improve-quality-and-efficiency.md
Last active December 23, 2024 04:31
Introducing DRIFT Search: Combining global and local search methods to improve quality and efficiency : NOTES

It seems DRIFT search uses vector similarity search to pick the top-k communities at the top-most level of the community hierarchy, and then drills down into the lower hierarchy levels; they seem to use follow-up questions for this purpose. The answers are ranked by relevance to the query. Need to check the paper for more details. A rough sketch of the first step is below.
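Below is a rough numpy sketch of that first step as I understand it: embed the query, score the top-level community summaries by cosine similarity, and keep the top-k to drill into. The embeddings here are random placeholders; the real system presumably embeds GraphRAG community reports with an LLM embedding model.

```python
import numpy as np

def top_k_communities(query_vec, community_vecs, k=3):
    """Return indices of the k community summaries most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    c = community_vecs / np.linalg.norm(community_vecs, axis=1, keepdims=True)
    sims = c @ q                          # cosine similarity against each summary
    return np.argsort(-sims)[:k], sims

rng = np.random.default_rng(0)
communities = rng.normal(size=(5, 8))     # 5 top-level community summary embeddings (toy)
query = rng.normal(size=8)                # embedded user query (toy)
idx, sims = top_k_communities(query, communities)
print("drill into communities:", idx.tolist(), "scores:", sims[idx].round(3).tolist())
```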