Andj andjc

## marc8_eacc.md

      
andjc / marc8_eacc.md
Created May 22, 2024 07:05
MARC-8 and EACC

MARC-8 Code Tables:

codetables.xml – MARC-8 to Unicode XML mapping
eacc2uni.txt – MARC-8 to Unicode comma-delimited mapping file (EACC characters only)


## bidi_isolate.py
####################################################################################################
#
# Bidi isolation: bidiIsolate()
#    Enabling Languages Python port of unicodeBidi.ts: https://github.com/signalapp/Signal-Desktop/blob/ce0fb220411b97722e1e080c14faa65d23165784/ts/util/unicodeBidi.ts
#    Original code by Signal Messenger, LLC
#    Released under AGPL 3.0 license
#
####################################################################################################

import regex
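
The preview stops before the body of the function, so the following is only a minimal sketch of the general idea of bidi isolation, not the port referenced in the header. The name `isolate_text` and its handling of unbalanced controls are assumptions made for illustration; it uses only the standard library rather than `regex`.

```python
# Unicode bidi isolate controls (UAX #9).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

ISOLATE_INITIATORS = {LRI, RLI, FSI}


def isolate_text(text: str) -> str:
    """Wrap text in FSI ... PDI so it cannot reorder the surrounding text.

    Hypothetical helper: isolates opened inside the text are closed before
    the wrapper ends, and stray PDI characters are dropped so they cannot
    terminate the wrapper early.
    """
    kept = []
    open_isolates = 0
    for ch in text:
        if ch in ISOLATE_INITIATORS:
            open_isolates += 1
        elif ch == PDI:
            if open_isolates == 0:
                continue  # unmatched PDI: skip it
            open_isolates -= 1
        kept.append(ch)
    # Close anything still open inside, then close the outer FSI.
    return FSI + "".join(kept) + PDI * open_isolates + PDI
```

Embedding and override controls (LRE, RLE, LRO, RLO, PDF) are deliberately left out of this sketch; a fuller treatment would balance those as well.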

## installation_commands.txt
sudo apt-get update
sudo apt-get install -y unzip git cmake python3-pip python3.11-venv libfreetype6-dev libharfbuzz-dev libfribidi-dev meson gtk-doc-tools libcairo2-dev libfontconfig-dev libjpeg-dev zlib1g-dev libpng-dev libtiff5-dev liblcms2-dev libwebp-dev libxcb1-dev

mkdir -p ~/tmp
cd ~/tmp
git clone https://github.com/HOST-Oman/libraqm.git
git clone https://github.com/ninja-build/ninja.git

cd ninja
./configure.py --bootstrap

## localised_dataframe_persian.ipynb

      
andjc / localised_dataframe_persian.ipynb
Created April 14, 2024 20:52

## localised_dataframe_persian.ipynb

      
andjc / localised_dataframe_persian.ipynb
Created April 13, 2024 12:46

## rbbi.json
{
  "din": "!!quoted_literals_only; $CR = [\\p{Grapheme_Cluster_Break = CR}]; $LF = [\\p{Grapheme_Cluster_Break = LF}]; $Control = [[\\p{Grapheme_Cluster_Break = Control}]]; $Extend  = [[\\p{Grapheme_Cluster_Break = Extend}]]; $ZWJ = [\\p{Grapheme_Cluster_Break = ZWJ}]; $Regional_Indicator = [\\p{Grapheme_Cluster_Break = Regional_Indicator}]; $Prepend = [\\p{Grapheme_Cluster_Break = Prepend}]; $SpacingMark = [\\p{Grapheme_Cluster_Break = SpacingMark}]; $Virama  = [\\p{Gujr}\\p{sc=Telu}\\p{sc=Mlym}\\p{sc=Orya}\\p{sc=Beng}\\p{sc=Deva}&\\p{Indic_Syllabic_Category=Virama}]; $LinkingConsonant = [\\p{Gujr}\\p{sc=Telu}\\p{sc=Mlym}\\p{sc=Orya}\\p{sc=Beng}\\p{sc=Deva}&\\p{Indic_Syllabic_Category=Consonant}]; $ExtCccZwj   = [[\\p{gcb=Extend}-\\p{ccc=0}] \\p{gcb=ZWJ}]; $L = [\\p{Grapheme_Cluster_Break = L}]; $V = [\\p{Grapheme_Cluster_Break = V}]; $T = [\\p{Grapheme_Cluster_Break = T}]; $LV = [\\p{Grapheme_Cluster_Break = LV}]; $LVT = [\\p{Grapheme_Cluster_Break = LVT}]; $Extended_Pict = [:ExtPict:]; !!chain; 'AA'|'Aa'|

## UAX_29.py
# We start by loading up PyICU.
import icu
# Let's create a test text. Notice it contains some punctuation.
test = u"This is (\"a\") test!"


# We create a wordbreak iterator. All break iterators in ICU are really RuleBasedBreakIterators, and we need to tell it which locale to take the word break rules from. Most locales have the same rules for UAX#29 so we will use English.
wb = icu.BreakIterator.createWordInstance(icu.Locale.getEnglish())

# An iterator is just that. It contains state and then we iterate over it. The state in this case is the text we want to break. So we set that.
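
The preview ends at this point. As a hedged sketch of how setting the text and walking the boundaries might look with PyICU (the helper name `word_segments` and its `locale_name` parameter are illustrative assumptions, not part of the original file):

```python
def word_segments(text, locale_name="en"):
    """Split text at UAX #29 word boundaries using PyICU (hypothetical helper)."""
    # Imported inside the function so this sketch can be loaded even on a
    # machine where PyICU is not installed.
    import icu

    wb = icu.BreakIterator.createWordInstance(icu.Locale(locale_name))
    wb.setText(text)

    segments = []
    start = wb.first()       # boundary before the first character
    end = wb.nextBoundary()  # next boundary, or -1 when the text is exhausted
    while end != -1:
        # ICU reports offsets in UTF-16 code units; they match Python string
        # indices only while the text stays within the Basic Multilingual Plane.
        segments.append(text[start:end])
        start, end = end, wb.nextBoundary()
    return segments
```

Calling `word_segments('This is ("a") test!')` returns the pieces between consecutive boundaries, so words, spaces, and punctuation marks each come back as separate segments; joining them reproduces the input.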

## african_script_fonts.md

      
andjc / african_script_fonts.md
Created December 6, 2023 05:29
List of fonts supporting African scripts.

African Script fonts

Adlam


ADLaM Display – OFL 1.1; 1 file (Regular)
Ebrima – Commercial; 2 files (Regular, Bold)
Kigelia – Commercial; 6 files (Light, Light Italic, Regular, Italic, Bold, Bold Italic)
Noto Sans Adlam – OFL 1.1; 4 files (Regular, Medium, SemiBold, Bold); 1 variable font.
Noto Sans Adlam Unjoined – OFL 1.1; 4 files (Regular, Medium, SemiBold, Bold); 1 variable font.


## is_confusable.md

      
andjc / is_confusable.md
Last active December 5, 2023 04:14
Get skeleton for confusable characters

Exemplars for confusable characters (normalising confusable data)

When preprocessing text, we normally want to normalise our data. Unicode Normalisation Forms KC and KD can be used to
convert compatibility characters during normalisation. This will handle some confusable characters, but not all.
The function below attempts to normalise confusable characters.
In is_confusable() we parse a string using icu.SpoofChecker, which is based on
Unicode Technical Report #36 and Unicode Technical Standard #39.
UTS 39 defines two strings to be confusable if they map to the same skeleton.
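
To make the two points above concrete, here is a small standard-library sketch. It is not the `is_confusable()` function from this gist and does not use `icu.SpoofChecker`; the mapping table `TOY_PROTOTYPES` and the helpers `toy_skeleton()` and `toy_is_confusable()` are hypothetical and cover only five Cyrillic letters, whereas the real data is the much larger confusables.txt published with UTS 39.

```python
import unicodedata

# Hypothetical, hand-picked subset of confusable mappings (Cyrillic to Latin).
TOY_PROTOTYPES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}


def toy_skeleton(text: str) -> str:
    """Approximate a UTS 39 skeleton: NFD, map to prototypes, NFD again."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(TOY_PROTOTYPES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def toy_is_confusable(first: str, second: str) -> bool:
    """Two strings count as confusable here if their toy skeletons match."""
    return toy_skeleton(first) == toy_skeleton(second)


latin = "scope"
mixed = "s\u0441\u043e\u0440\u0435"            # Latin "s" + Cyrillic es, o, er, ie
fullwidth = "\uff53\uff43\uff4f\uff50\uff45"   # fullwidth "scope"

# NFKC folds the fullwidth compatibility letters to ASCII.
print(unicodedata.normalize("NFKC", fullwidth) == latin)
# NFKC leaves the Cyrillic look-alikes untouched.
print(unicodedata.normalize("NFKC", mixed) == latin)
# The toy skeleton maps both spellings to the same string.
print(toy_is_confusable(latin, mixed))
```

The design point is that compatibility normalisation and confusable detection are separate steps: NFKC only folds characters that have compatibility decompositions, while a skeleton relies on a dedicated mapping table.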

  
## work-with-multiple-github-accounts.md

      
andjc / work-with-multiple-github-accounts.md
Created August 7, 2023 15:33, forked from rahularity/work-with-multiple-github-accounts.md
How To Work With Multiple Github Accounts on your PC

How To Work With Multiple Github Accounts on a single Machine

Suppose I have two GitHub accounts, https://github.com/rahul-office and https://github.com/rahul-personal, and I want to set up my Mac to talk to both of them easily.

NOTE: This logic can be extended to more than two accounts also. :)

The setup can be done in 5 easy steps:


Step 1 : Create SSH keys for all accounts
Step 2 : Add SSH keys to SSH Agent