@ddimtirov
Created November 2, 2013 06:54
Another script that translates text from one non-standard Cyrillic code page to another by adding an offset. This time in Python. Circa 2004.
#!/usr/bin/python
# -*- coding: windows-1251 -*-
r"""
:Authors: Dimitar A. Dimitrov
:Contact: dimiter[at]blue[dash]edge[dot]bg
:Copyright: This work is licensed under the X license.
For the full text of the license see http://www.opensource.org/licenses/xnet.php
:Version: 0.1
:Date: 2004-08-22
:Abstract: This script tries to extract Cyrillic text from a custom code page.
"""
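import sys

# Codes of the first uppercase and lowercase Cyrillic letters in the custom code page.
alpha_upper = 0x80
alpha_lower = 0xA0

# Read the whole input file named on the command line.
input = open(sys.argv[1])
text = input.read()
input.close()

# Write the converted text next to the input, with a ".cyr" suffix.
output = open(sys.argv[1] + ".cyr", "w")
for c in text:
    code = ord(c)
    if c == "�": c = "-"
    elif c in "��": c = "|"
    elif c == "�": c = "�"
    # Shift letters by a fixed offset so they start at "А" / "а" in windows-1251.
    elif code in range(alpha_upper, alpha_upper + 31): c = chr(ord("А") + code - alpha_upper)
    elif code in range(alpha_lower, alpha_lower + 31): c = chr(ord("а") + code - alpha_lower)
    output.write(c)
output.close()
print "done."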
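
The offset trick in the script can also be written as a byte translation table. What follows is only a rough Python 3 sketch, not the author's code. It assumes the target encoding is windows-1251 (suggested by the script's coding declaration), so the target bases 0xC0 and 0xE0 are assumptions. The names `build_table` and `convert` are hypothetical. The script's three special-case character substitutions are left out, and the range length of 31 is copied from the script as-is.

```python
# Sketch (Python 3): remap Cyrillic letters between two single-byte code pages
# by adding a fixed offset, using a 256-entry translation table.

SRC_UPPER = 0x80  # from the script: first uppercase letter in the source code page
SRC_LOWER = 0xA0  # from the script: first lowercase letter in the source code page
DST_UPPER = 0xC0  # ASSUMPTION: code of Cyrillic capital A in windows-1251
DST_LOWER = 0xE0  # ASSUMPTION: code of Cyrillic small a in windows-1251
LETTERS = 31      # same range length as the script uses


def build_table(letters=LETTERS):
    """Return a 256-byte table suitable for bytes.translate()."""
    table = bytearray(range(256))  # start from the identity mapping
    for i in range(letters):
        table[SRC_UPPER + i] = DST_UPPER + i
        table[SRC_LOWER + i] = DST_LOWER + i
    return bytes(table)


def convert(data, table=None):
    """Translate raw bytes from the source code page to the target one."""
    if table is None:
        table = build_table()
    return data.translate(table)


if __name__ == "__main__":
    # Two letters from the source code page, a space, and an ASCII letter.
    sample = bytes([0x80, 0xA1, 0x20, 0x41])
    print(convert(sample).decode("windows-1251"))
```

Working on bytes rather than text avoids any implicit decoding of the input, and bytes outside the two letter ranges pass through unchanged.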