Philippe Erechtheus

  • German Research Center for Artificial Intelligence
@tamuhey
tamuhey / tokenizations_post.md
Last active March 30, 2024 19:00
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially for tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links:
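
To make "correspondence between tokens" concrete, here is a minimal sketch of a character-based alignment between two tokenizations. It is not the article's algorithm and not the library's API: the names `char_to_token` and `align_by_chars` are hypothetical, and the sketch assumes both token lists spell exactly the same string when concatenated (no lowercasing, no accent stripping, no `##` prefixes), so it does not attempt the robustness the article is aiming for.

```python
from typing import List, Tuple


def char_to_token(tokens: List[str]) -> List[int]:
    """Map each character position of "".join(tokens) to the index of its token."""
    return [i for i, tok in enumerate(tokens) for _ in tok]


def align_by_chars(a: List[str], b: List[str]) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (a2b, b2a): for each token, the indices of the overlapping tokens
    in the other tokenization. Assumption: both lists spell the same string."""
    if "".join(a) != "".join(b):
        raise ValueError("this sketch requires both tokenizations to spell the same string")
    ca, cb = char_to_token(a), char_to_token(b)
    a2b: List[List[int]] = [[] for _ in a]
    b2a: List[List[int]] = [[] for _ in b]
    # Walk the shared character sequence; token indices only ever increase,
    # so comparing against the last stored index is enough to avoid duplicates.
    for i, j in zip(ca, cb):
        if not a2b[i] or a2b[i][-1] != j:
            a2b[i].append(j)
        if not b2a[j] or b2a[j][-1] != i:
            b2a[j].append(i)
    return a2b, b2a


# Hypothetical example: word-level tokens vs. subword pieces (prefixes already removed).
words = ["Tokenization", "matters"]
pieces = ["Token", "ization", "matter", "s"]
a2b, b2a = align_by_chars(words, pieces)
print(a2b)  # [[0, 1], [2, 3]]
print(b2a)  # [[0], [0], [1], [1]]
```

The exact-match requirement is what makes this version fragile: as soon as one tokenizer normalizes the text, the two character sequences differ and the sketch raises an error instead of aligning.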

@ceres-c
ceres-c / CR95HF_ICODE_psw_dump.py
Created December 31, 2019 12:54
CR95HF Python script to read NXP ICODE tags in privacy mode
#!/usr/bin/python3
# Author: ceres-c 2019-12-29
# Authenticate to ICODE SLI tags
import hid
# Global defines & commands
password = [0x00, 0x00, 0x00, 0x00] # You have to find it yourself; try searching online in German ;-)