Philippe Erechtheus

## tokenizations_post.md

      
tamuhey / tokenizations_post.md (last active July 27, 2024 14:46)

How to calculate the alignment between BERT and spaCy tokens effectively and robustly


Natural Language Processing (NLP) has made great progress in recent years thanks to neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially around tokenization. In this article, I describe an algorithm that simplifies one such step: calculating the correspondence between tokens produced by different tokenizers (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm.

Here are the library and demo site links:

repo: https://github.com/tamuhey/tokenizations

site: https://tamuhey.github.io/tokenizations/
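
To make "correspondence between tokens" concrete, here is a minimal sketch of one way to align two tokenizations of the same text. It is not taken from the library: it assumes that lowercasing is enough normalization, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever diff algorithm the library actually uses. The names `char_owner` and `align` are hypothetical helpers introduced only for this illustration.

```python
import difflib


def char_owner(tokens):
    """Concatenate lowercased tokens; record which token owns each character."""
    chars, owner = [], []
    for index, token in enumerate(tokens):
        for ch in token.lower():  # assumption: lowercasing is the only normalization
            chars.append(ch)
            owner.append(index)
    return "".join(chars), owner


def align(tokens_a, tokens_b):
    """Return (a2b, b2a): for each token, the indices of overlapping tokens on the other side."""
    text_a, owner_a = char_owner(tokens_a)
    text_b, owner_b = char_owner(tokens_b)
    a2b = [[] for _ in tokens_a]
    b2a = [[] for _ in tokens_b]
    # Character-level diff; autojunk=False keeps long inputs from being pruned heuristically.
    matcher = difflib.SequenceMatcher(None, text_a, text_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            i = owner_a[block.a + offset]
            j = owner_b[block.b + offset]
            if j not in a2b[i]:
                a2b[i].append(j)
            if i not in b2a[j]:
                b2a[j].append(i)
    return a2b, b2a


spacy_like = ["Tokenization", "is", "fun"]           # illustrative word-level tokens
bert_like = ["token", "##ization", "is", "fun"]      # illustrative wordpiece tokens
a2b, b2a = align(spacy_like, bert_like)
print(a2b)  # [[0, 1], [2], [3]]
print(b2a)  # [[0], [0], [1], [2]]
```

Characters that appear on only one side, such as the `##` wordpiece prefix, simply go unmatched, so the alignment does not depend on knowing either tokenizer's conventions.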


## CR95HF_ICODE_psw_dump.py
#!/usr/bin/python3

# Author: ceres-c 2019-12-29

# Authenticate to ICODE SLI tags

import hid

# Global defines & commands
password = [0x00, 0x00, 0x00, 0x00] # Placeholder: you have to find it yourself, try searching online in German ;-)