Michael mikeatm

## recover_source_code.md

      
simonw / recover_source_code.md (last active January 16, 2024 08:13)

How to recover lost Python source code if it's still resident in-memory

I screwed up using git ("git checkout --" on the wrong file) and managed to delete the code I had just written... but it was still running in a process in a Docker container. Here's how I got it back, using https://pypi.python.org/pypi/pyrasite/ and https://pypi.python.org/pypi/uncompyle6

Attach a shell to the Docker container

Install GDB (needed by pyrasite)

apt-get update && apt-get install gdb
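
The whole approach rests on one fact: once Python has imported a module, the compiled code objects live in the process even if the file on disk disappears. The sketch below only demonstrates that premise inside a single process; it is not the pyrasite/uncompyle6 procedure itself, and the names `lost_module` and `greet` are made up for illustration.

```python
import dis
import importlib.util
import io
import os
import tempfile

# Write a throwaway module to disk (a stand-in for the file that was lost).
source = "def greet(name):\n    return 'hello ' + name\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(source)
    module_path = handle.name

# Import it so the compiled code lives in this process.
spec = importlib.util.spec_from_file_location("lost_module", module_path)
lost_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(lost_module)

# Simulate the accident: the source file is gone.
os.remove(module_path)

# The function still runs, and its code object is still in memory.
result = lost_module.greet("world")
code = lost_module.greet.__code__

# Disassemble the bytecode; a decompiler works from this same code object.
buffer = io.StringIO()
dis.dis(code, file=buffer)
listing = buffer.getvalue()
print(result)
print(listing)
```

A decompiler goes one step further than `dis` and turns the code object back into readable source, which is why attaching to the still-running process is worth the effort.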


## redis-cluster-backup.sh
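#!/bin/sh

# Capture the cluster topology once; each line of `cluster nodes` describes one node.
readonly cluster_topology=$(redis-cli -h redis-cluster cluster nodes)
# For every replica, keep "<address>,<master-id>" (fields 2 and 4 of its line).
readonly slaves=$(echo "${cluster_topology}" | grep slave | cut -d' ' -f2,4 | tr ' ' ',')

readonly backup_dir="/opt/redis-backup"
mkdir -p "${backup_dir}"

for slave in ${slaves}; do
    master_id=$(echo "${slave}" | cut -d',' -f2)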
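
To make the `grep | cut | tr` pipeline above easier to follow, here is a rough Python equivalent run against made-up sample output. The node IDs and addresses are hypothetical and shortened (real IDs are 40 hex characters), and `replica_pairs` is an illustrative helper, not part of the script.

```python
# Hypothetical `redis-cli cluster nodes` output. The leading fields are:
# <id> <ip:port@cport> <flags> <master-id> ...
SAMPLE = """\
aaa111 10.0.0.1:6379@16379 myself,master - 0 0 1 connected 0-5460
bbb222 10.0.0.2:6379@16379 slave aaa111 0 0 1 connected
ccc333 10.0.0.3:6379@16379 master - 0 0 2 connected 5461-10922
ddd444 10.0.0.4:6379@16379 slave ccc333 0 0 2 connected
"""


def replica_pairs(topology):
    """Return (address, master_id) for every replica line."""
    pairs = []
    for line in topology.splitlines():
        fields = line.split(" ")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        # Check the flags field only, which is stricter than grepping the whole line.
        if "slave" in fields[2].split(","):
            pairs.append((fields[1], fields[3]))
    return pairs


pairs = replica_pairs(SAMPLE)
for address, master_id in pairs:
    print(address, master_id)
```

Each pair corresponds to one `slave` value in the shell loop, with the master ID being what `cut -d',' -f2` extracts.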

## tokenizations_post.md

      
tamuhey / tokenizations_post.md (last active March 30, 2024 19:00)

How to calculate the alignment between BERT and spaCy tokens effectively and robustly


Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially around tokenization. In this article, I describe an algorithm that simplifies one such step: calculating the correspondence between tokens from different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm.
Here are the library and the demo site links:

repo: https://github.com/tamuhey/tokenizations
site: https://tamuhey.github.io/tokenizations/
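
To make "correspondence between tokens" concrete, here is a minimal sketch that aligns two tokenizations by comparing their characters. It uses the standard library's `difflib` as a stand-in and is not the algorithm or API of the tokenizations library; the `normalize` rule (lowercasing and stripping a leading `##`) is an assumption chosen for this example.

```python
from difflib import SequenceMatcher


def normalize(token):
    """Assumed normalization: drop a WordPiece '##' prefix and lowercase."""
    if token.startswith("##"):
        token = token[2:]
    return token.lower()


def char_owners(tokens):
    """Return the joined normalized text and, per character, its token index."""
    pieces = [normalize(tok) for tok in tokens]
    text = "".join(pieces)
    owners = [i for i, piece in enumerate(pieces) for _ in piece]
    return text, owners


def align(tokens_a, tokens_b):
    """For each token in tokens_a, list the indices of overlapping tokens in tokens_b."""
    text_a, owners_a = char_owners(tokens_a)
    text_b, owners_b = char_owners(tokens_b)
    a2b = [set() for _ in tokens_a]
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            a2b[owners_a[block.a + offset]].add(owners_b[block.b + offset])
    return [sorted(indices) for indices in a2b]


word_tokens = ["Playing", "football"]
wordpiece_tokens = ["play", "##ing", "foot", "##ball"]
alignment = align(word_tokens, wordpiece_tokens)
print(alignment)  # [[0, 1], [2, 3]]
```

Working at the character level means the two tokenizers never have to agree on token boundaries; they only need to produce roughly the same text after normalization.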