Skip to content

Instantly share code, notes, and snippets.

View mikeatm's full-sized avatar

Michael mikeatm

  • Technical University of Kenya
  • Nairobi
View GitHub Profile
@tamuhey
tamuhey / tokenizations_post.md
Last active March 30, 2024 19:00
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

How to calculate the alignment between BERT and spaCy tokens effectively and robustly

image

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years because of neural networks, which allows us to solve various tasks with end-to-end architecture. However, many NLP systems still require language-specific pre- and post-processing, especially in tokenizations. In this article, I describe an algorithm that simplifies calculating correspondence between tokens (e.g. BERT vs. spaCy), one such process. And I introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links:

Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@simonw
simonw / recover_source_code.md
Last active January 16, 2024 08:13
How to recover lost Python source code if it's still resident in-memory

How to recover lost Python source code if it's still resident in-memory

I screwed up using git ("git checkout --" on the wrong file) and managed to delete the code I had just written... but it was still running in a process in a docker container. Here's how I got it back, using https://pypi.python.org/pypi/pyrasite/ and https://pypi.python.org/pypi/uncompyle6

Attach a shell to the docker container

Install GDB (needed by pyrasite)

apt-get update && apt-get install gdb
@batok
batok / paramiko_example.py
Created April 10, 2012 16:11
Paramiko example using private key
import paramiko
k = paramiko.RSAKey.from_private_key_file("/Users/whatever/Downloads/mykey.pem")
c = paramiko.SSHClient()
c.set_missing_host_key_policy(paramiko.AutoAddPolicy())
print "connecting"
c.connect( hostname = "www.acme.com", username = "ubuntu", pkey = k )
print "connected"
commands = [ "/home/ubuntu/firstscript.sh", "/home/ubuntu/secondscript.sh" ]
for command in commands:
print "Executing {}".format( command )
@juris
juris / redis-cluster-backup.sh
Last active March 21, 2023 14:27
Redis Cluster backup script
#!/bin/sh
readonly cluster_topology=$(redis-cli -h redis-cluster cluster nodes)
readonly slaves=$(echo "${cluster_topology}" | grep slave | cut -d' ' -f2,4 | tr ' ' ',')
readonly backup_dir="/opt/redis-backup"
mkdir -p ${backup_dir}
for slave in ${slaves}; do
master_id=$(echo "${slave}" | cut -d',' -f2)
@rsvp
rsvp / freqy.sh
Created February 18, 2012 15:43
freqy.sh : sort lines by their frequency count -- a very useful idiom as Linux bash script.
#!/usr/bin/env bash
# bash 3.2.25(1) Ubuntu 7.10 Date : 2012-02-15
#
# _______________| freqy : sort and order lines by frequency of appearance.
#
# Usage: freqy [optional: file(s)]
#
# Notes: Will work in a pipe.
# Does not alter the file(s) themselves.
#
@Snegovikufa
Snegovikufa / m2crypto_example.py
Last active December 8, 2021 14:33
m2crypto examle
# Since I ask this of people before using their code samples, anyone can
# use this under BSD.
import os
import M2Crypto
def empty_callback():
return
# Seed the random number generator with 1024 random bytes (8192 bits)

The issue:

..mobile browsers will wait approximately 300ms from the time that you tap the button to fire the click event. The reason for this is that the browser is waiting to see if you are actually performing a double tap.

(from a new defunct https://developers.google.com/mobile/articles/fast_buttons article)

touch-action CSS property can be used to disable this behaviour.

touch-action: manipulation The user agent may consider touches that begin on the element only for the purposes of scrolling and continuous zooming. Any additional behaviors supported by auto are out of scope for this specification.

@d3noob
d3noob / .block
Last active September 16, 2020 09:18
Map using leaflet.js and d3,js combined
license: mit