Skip to content

Instantly share code, notes, and snippets.

View mikeatm's full-sized avatar

Michael mikeatm

  • Technical University of Kenya
  • Nairobi
View GitHub Profile
@simonw
simonw / recover_source_code.md
Last active January 16, 2024 08:13
How to recover lost Python source code if it's still resident in-memory

How to recover lost Python source code if it's still resident in-memory

I screwed up using git ("git checkout --" on the wrong file) and managed to delete the code I had just written... but it was still running in a process in a docker container. Here's how I got it back, using https://pypi.python.org/pypi/pyrasite/ and https://pypi.python.org/pypi/uncompyle6

Attach a shell to the docker container

Install GDB (needed by pyrasite)

apt-get update && apt-get install gdb
@juris
juris / redis-cluster-backup.sh
Last active March 21, 2023 14:27
Redis Cluster backup script
#!/bin/sh
readonly cluster_topology=$(redis-cli -h redis-cluster cluster nodes)
readonly slaves=$(echo "${cluster_topology}" | grep slave | cut -d' ' -f2,4 | tr ' ' ',')
readonly backup_dir="/opt/redis-backup"
mkdir -p ${backup_dir}
for slave in ${slaves}; do
master_id=$(echo "${slave}" | cut -d',' -f2)
@tamuhey
tamuhey / tokenizations_post.md
Last active March 30, 2024 19:00
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

How to calculate the alignment between BERT and spaCy tokens effectively and robustly

image

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years because of neural networks, which allows us to solve various tasks with end-to-end architecture. However, many NLP systems still require language-specific pre- and post-processing, especially in tokenizations. In this article, I describe an algorithm that simplifies calculating correspondence between tokens (e.g. BERT vs. spaCy), one such process. And I introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links: