btbytes/README.md

## README.md

      
A personal desktop PDF/documents search interface


- Walk through the disk or directories that contain the PDFs.
- Store the following data in a "document store" of some kind that supports text search for later retrieval:

{
  "sha256": "<sha256 hash of the file>",
  "filename": "<filename>",
  "path": "<path>",
  "contents": "<pdftotext (or similar) of the first 100 pages>"
}
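
As a rough illustration, the walk-and-extract step could be sketched as below. This is only a sketch under assumptions: the function names (`sha256_of_file`, `extract_text`, `build_records`) are made up for this example, `path` is taken to mean the full file path, and text extraction assumes poppler's `pdftotext` is installed and on `PATH` (its `-l` option sets the last page to convert, and `-` sends the text to stdout).

```python
import hashlib
import os
import subprocess


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large PDFs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_text(path, max_pages=100):
    """Assumption: poppler's pdftotext is available; returns text of the first pages."""
    result = subprocess.run(
        ["pdftotext", "-l", str(max_pages), path, "-"],
        capture_output=True,
        text=True,
        check=False,  # a failed extraction simply yields empty text
    )
    return result.stdout


def build_records(root, extractor=extract_text):
    """Yield one record per PDF under root, in the shape of the JSON above."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue
            full_path = os.path.join(dirpath, name)
            yield {
                "sha256": sha256_of_file(full_path),
                "filename": name,
                "path": full_path,  # assumption: full path, not just the directory
                "contents": extractor(full_path),
            }
```

The `extractor` parameter is there so that a different tool (the "or similar" above) can be swapped in without touching the directory walk.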

- A local web server interface with a search box to search through this data.
- Optionally, a way to attach notes and tags to each unique document.
- Optionally, flag duplicate files so that they are not returned in search results.
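
One possible shape for the store, the search box, and the two optional features is sketched below using only the Python standard library. Everything here is an assumption rather than a decided design: SQLite with its FTS5 extension stands in for the "document store" (FTS5 is present in most Python builds, but not guaranteed), the table and function names are invented for this example, each path is assumed to be added only once, notes and tags are keyed by `sha256`, and a file is flagged as a duplicate when another path with the same hash is already stored.

```python
import html
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def open_store(db_path=":memory:"):
    """Create the (hypothetical) tables: metadata, full-text index, annotations."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (path TEXT PRIMARY KEY, sha256 TEXT,"
        " filename TEXT, is_duplicate INTEGER DEFAULT 0)"
    )
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS doc_text"
        " USING fts5(path UNINDEXED, contents)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotations"
        " (sha256 TEXT PRIMARY KEY, notes TEXT, tags TEXT)"
    )
    return conn


def add_record(conn, rec):
    """Store one record; flag it if the same hash was already stored."""
    seen = conn.execute(
        "SELECT 1 FROM docs WHERE sha256 = ?", (rec["sha256"],)
    ).fetchone()
    conn.execute(
        "INSERT INTO docs (path, sha256, filename, is_duplicate) VALUES (?, ?, ?, ?)",
        (rec["path"], rec["sha256"], rec["filename"], 1 if seen else 0),
    )
    conn.execute(
        "INSERT INTO doc_text (path, contents) VALUES (?, ?)",
        (rec["path"], rec["contents"]),
    )
    conn.commit()


def annotate(conn, sha256, notes="", tags=""):
    """Attach notes and tags to a document, keyed by its hash."""
    conn.execute(
        "INSERT OR REPLACE INTO annotations (sha256, notes, tags) VALUES (?, ?, ?)",
        (sha256, notes, tags),
    )
    conn.commit()


def search(conn, query, limit=20):
    """Return (filename, path) rows matching an FTS5 query, hiding duplicates."""
    return conn.execute(
        "SELECT docs.filename, docs.path FROM doc_text"
        " JOIN docs ON docs.path = doc_text.path"
        " WHERE doc_text MATCH ? AND docs.is_duplicate = 0"
        " ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


def make_handler(conn):
    """Build a request handler that renders a search box and the results."""
    class SearchHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
            try:
                rows = search(conn, query) if query else []
            except sqlite3.OperationalError:
                rows = []  # the text was not valid FTS5 query syntax
            items = "".join(
                f"<li>{html.escape(name)} <small>{html.escape(path)}</small></li>"
                for name, path in rows
            )
            body = (
                '<form><input name="q" value="%s"><button>Search</button></form>'
                "<ul>%s</ul>" % (html.escape(query, quote=True), items)
            ).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return SearchHandler


def serve(conn, port=8000):
    """Serve the search page on localhost only; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(conn)).serve_forever()
```

Calling `serve(open_store("pdfsearch.db"))` after loading some records would expose the search box at `http://127.0.0.1:8000/`; the database file name and port are arbitrary choices. Keying annotations by hash means a note follows the document even if the file is moved or copied, which is one reading of "uniquely" above.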