@eddyb
Last active April 15, 2023 06:42
Matrix offline regex search (using JSON exports from Element)

eddyb commented Apr 15, 2023

Rough usage guidelines:

  1. export the entire Matrix room history as JSON from Element and place it in e.g. $HOME/logs
    • this may require the "N most recent messages" mode, with N increased until the JSON dump starts with the creation of the room, specifically the "type":"m.room.member" event for whoever created the room
      • in my case N was around 500k, which worked in the end, whereas "full history" seemed to get stuck (maybe due to new messages arriving while the export was happening?)
  2. convert to .jsonl.zst (one JSON object per line + zstd compression):
cat ~/logs/'matrix - My Room - Chat Export - yyyy-mm-ddThh-mm-ss.xxxZ'.json |
  jq '.messages[]' -c |
  zstd -19 -o ~/logs/'matrix - My Room - Chat Export - yyyy-mm-ddThh-mm-ss.xxxZ'.messages.jsonl.zst
  3. use the above script with that .messages.jsonl.zst file:
mx-search --messages_jsonl_zst=$HOME/logs/'matrix - My Room - Chat Export - yyyy-mm-ddThh-mm-ss.xxxZ'.messages.jsonl.zst --context=5 'my\.regex'
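
A rough way to check step 1 before bothering with the conversion: print the type of the first entry in the export plus the total message count. This is only a sketch. `export_summary` is a hypothetical helper name (not part of the script above), and it assumes the export keeps its events in a top-level `messages` array, as the `jq '.messages[]'` step implies.

```shell
# Hypothetical helper, not part of the gist: prints "<type of first message> <message count>"
# for an Element JSON export. Assumes a top-level "messages" array.
export_summary() {
  jq -r '"\(.messages[0].type) \(.messages | length)"' "$1"
}
```

If the first field is not m.room.member, the export probably does not reach back to the creation of the room, so N likely needs to be increased further. The count should also match the number of lines in the .messages.jsonl.zst file after step 2.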
