@wolfram77
wolfram77 / notes-introducing-drift-search-combining-global-and-local-search-methods-to-improve-quality-and-efficiency.md
Last active December 23, 2024 04:31
Introducing DRIFT Search: Combining global and local search methods to improve quality and efficiency : NOTES

It seems DRIFT search uses vector similarity search to select the top-k communities at the top-most level of the hierarchy, and then drills down to the lower levels. It seems to use follow-up questions for this purpose. The answers are ranked by relevance to the query. Need to check the paper for more details.

@wolfram77
wolfram77 / notes-graphrag-unlocking-llm-discovery-on-narrative-private-data.md
Last active December 23, 2024 04:01
GraphRAG: Unlocking LLM discovery on narrative private data : NOTES

Unlike Baseline RAG, which uses embedding search from a vector database to find matching query points in the source text, GraphRAG builds a knowledge graph from the text, which is summarized hierarchically based on community clusters.

From what I understand, the knowledge graph is built by extracting entities and relations from the text, and the community clusters are formed based on the similarity of the entities and relations. The hierarchical summarization is done by summarizing the clusters at different levels of abstraction.
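A rough sketch of how I picture the graph-building step, under loud assumptions: the `(source, relation, target)` triples are taken as already extracted (the LLM extraction is not modeled), the edge weight is simply the number of times a pair was extracted (my guess, not something the post confirms), and connected components stand in for real community detection such as Leiden. The summarization step is omitted, since it needs an LLM.

```python
from collections import defaultdict


def build_graph(triples):
    """Build an undirected weighted graph from (source, relation, target) triples.

    ASSUMPTION: weight = number of times the entity pair was extracted.
    """
    weights = defaultdict(float)
    for src, _relation, dst in triples:
        key = tuple(sorted((src, dst)))  # undirected: normalize the pair order
        weights[key] += 1.0
    return dict(weights)


def connected_components(edges):
    """Group entities into connected components using union-find.

    This is only a stand-in for community detection; a real pipeline would
    use a proper algorithm (e.g. Leiden) and produce a hierarchy.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical extracted triples, for illustration only.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "located_in", "Paris"),
    ("Alice", "works_at", "Acme"),
    ("Bob", "knows", "Carol"),
]
```

Each resulting group would then be summarized, and the groups themselves clustered again to form the next level up.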

GraphRAG seems to use GPT-4-turbo to build the knowledge graph. However, how are the edge weights calculated? Do the summaries generated affect how the weights are calculated in the next hierarchical level?

@wolfram77
wolfram77 / menu-indian-corner-restaurant-gotland-sweden.md
Created November 6, 2024 18:13
Indian Corner Restaurant Menu @ Gotland, Sweden : MENU

See below.

@wolfram77
wolfram77 / code-misra-gries-algorithm-using-vector-instructions.md
Created October 27, 2024 16:51
Misra-Gries algorithm for finding heavy hitters in a list of numbers, using vector instructions : CODE

I was trying out the Misra-Gries algorithm for finding heavy hitters in a list of numbers, using vector instructions. But let me fill in some context first. I am trying to minimize the memory needed by the Louvain/Leiden algorithms for community detection, and hopefully improve performance a bit too. Currently, the algorithms use full-size per-thread hashtables, with each thread using |V| space to store the associated weights for its hashtable. However, we could instead keep a small hashtable in the cache, maintained with the Misra-Gries algorithm. This might, of course, affect the quality of the returned communities.
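For reference, a plain scalar sketch of the textbook Misra-Gries algorithm (no vector instructions, and not the code from this gist). It keeps at most `k - 1` counters, and any item that occurs more than `n / k` times in a stream of length `n` is guaranteed to survive as a key. The stored counts are lower bounds, not exact frequencies.

```python
def misra_gries(stream, k):
    """Return candidate heavy hitters using at most k - 1 counters.

    Any item occurring more than len(stream) / k times is guaranteed to be
    among the returned keys; the counts are underestimates.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Example: item 1 occurs 6 times out of 10, which is more than 10 / 3.
EXAMPLE_STREAM = [1, 1, 1, 2, 1, 3, 1, 4, 1, 5]
```

In the Louvain/Leiden setting, the stream would be the communities of a vertex's neighbors, and the increments would be edge weights rather than 1. That weighted variant is not shown here.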

It was cool to go through step-by-step and be able to minimize the number of lines of generated machine code (with minimal conditional jumps). When you understand the vector instructions, you can write them in a higher-level logic without resorting to writing the instructions yourself, as these can be quite complicated. We let the compiler do its thing, but guide it with a short f

@wolfram77
wolfram77 / notes-vertex-reordering-for-real-world-graphs-and-applications-an-empirical-study-2020.md
Last active October 25, 2024 21:48
Vertex Reordering for Real-World Graphs and Applications: An Empirical Evaluation : NOTES
@wolfram77
wolfram77 / notes-joint-partitioning-and-sampling-algorithm-for-scaling-graph-neural-network.md
Last active October 25, 2024 21:46
Joint Partitioning and Sampling Algorithm for Scaling Graph Neural Network : NOTES
@wolfram77
wolfram77 / handwritten-partitioning-with-community-detection-idea.md
Created October 25, 2024 21:42
Partitioning with community detection idea : HANDWRITTEN NOTES

See below.

@wolfram77
wolfram77 / drawing-weather-engineering-architecture.md
Created July 31, 2024 18:19
Weather engineering architecture : DRAWING

Weather engineering architecture; Sahu (2024)

@wolfram77
wolfram77 / notes-fast-and-efficient-end-to-end-graph-processing-with-shared-memory-accelerators.md
Last active July 31, 2024 18:12
Fast and Efficient End-to-End Graph Processing with Shared Memory Accelerators : NOTES

Fast and Efficient End-to-End Graph Processing with Shared Memory Accelerators; Mughrabi (2021)

Graph algorithms often require fine-grained, random access across substantially large data structures. Previous work on FPGA-based acceleration has required significant preprocessing and restructuring to transform the memory access patterns into a streaming format that is more friendly to off-chip hardware. However, the emergence of cache-coherent shared memory interfaces, such as Coherent Accelerator Processor Interface (CAPI), allows designers to more easily work with the natural in-memory organization of the data. This thesis introduces a vertex-centric shared-memory accelerator (AccelGraph) for graph algorithms optimized for high performance while effectively using coherent caching on the Field Programmable Gate Arrays (FPGA) hardware. The proposed design achieves speedups by selectively caching graph data for the accelerator while considering locality and reuse, compared to using the shared address space

@wolfram77
wolfram77 / notes-dothash-estimating-set-similarity-metrics-for-link-prediction-and-document-deduplication.md
Last active July 31, 2024 18:09
DotHash: Estimating Set Similarity Metrics for Link Prediction and Document Deduplication : NOTES