Dingyuan Wang gumblex

@gumblex
gumblex / num2chinese.py
Created February 8, 2015 02:46
Numbers to Chinese representations converter in Python. Chinese numeral conversion.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Licensed under WTFPL or the Unlicense or CC0.
# This uses Python 3, but it's easy to port to Python 2 by changing
# strings to u'xx'.
import itertools
def num2chinese(num, big=False, simp=True, o=False, twoalt=False):
@gumblex
gumblex / figcaptcha.py
Created March 28, 2015 11:45
Use FIGlet (ASCII art) as CAPTCHA, with a noise generator
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright 2015 Gumble
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
@gumblex
gumblex / PathFitter.py
Created April 4, 2015 12:01
Path fitter in Python - An Algorithm for Automatically Fitting Digitized Curves
"""
Ported from Paper.js - The Swiss Army Knife of Vector Graphics Scripting.
http://paperjs.org/
Copyright (c) 2011 - 2014, Juerg Lehni & Jonathan Puckey
http://scratchdisk.com/ & http://jonathanpuckey.com/
Distributed under the MIT license. See LICENSE file for details.
All rights reserved.
@gumblex
gumblex / 词性标记.md
Last active July 12, 2022 07:05 — forked from luw2007/词性标记.md
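Part-of-speech tags: covers the ICTPOS3.0 part-of-speech tag set, the ICTCLAS Chinese part-of-speech tag set, the parts of speech that appear in the jieba dictionary, and some parts of speech that can be ignored in simhash

Word classes

  • Content words: nouns, verbs, adjectives, stative words, distinguishing words, numerals, measure words, pronouns
  • Function words: adverbs, prepositions, conjunctions, particles, onomatopoeia, interjections.

ICTPOS3.0 part-of-speech tag set

n noun

nr personal name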

@gumblex
gumblex / getparallel.py
Last active June 4, 2022 21:58
Convert Tatoeba dumps into a SQLite database.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
'''
Get parallel corpus in Moses-style text from converted Tatoeba SQLite database.
Copyright (c) 2016 gumblex
This work is free. You can redistribute it and/or modify it under the
terms of the Do What The Fuck You Want To Public License, Version 2,
@gumblex
gumblex / ip.sh
Last active June 4, 2022 21:58
CGI script: Show visitor's IP and User Agent
#!/bin/bash
echo 'Status: 200 OK'
echo 'Content-Type: text/html; charset=utf-8'
echo
echo '<!DOCTYPE html>'
echo '<html><head>'
echo '<meta http-equiv="content-type" content="text/html; charset=UTF-8">'
echo '<meta name="description" content="'"$HTTP_USER_AGENT"'">'
# we use CloudFlare
@gumblex
gumblex / progress-all.sh
Created May 3, 2017 03:28
Find out each file position opened by a process
#!/bin/bash
for fd in /proc/$1/fd/*; do
if [ ! -f "$fd" ]; then continue; fi
fdnum=$(basename "$fd")
fdinfo=/proc/$1/fdinfo/$fdnum
name=$(readlink "$fd")
size=$(stat -c "%s" "$name" 2>/dev/null || stat -c "%s" "$fd")
progress=$(grep ^pos "$fdinfo" | awk '{print $2}')
if [ "$size" -eq 0 ]; then
echo '['$fdnum']' "$name"':' $progress'/'$size
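
The same idea, reading the `pos` field of `/proc/<pid>/fdinfo/<fd>` for every regular file a process has open, can be sketched in Python. This is a rough, Linux-only illustration and not part of the gist: the function names `parse_fdinfo_pos` and `file_positions` are made up for the example.

```python
import os


def parse_fdinfo_pos(fdinfo_text):
    """Return the integer offset from the 'pos:' line of an fdinfo file, or None."""
    for line in fdinfo_text.splitlines():
        if line.startswith("pos:"):
            return int(line.split()[1])
    return None


def file_positions(pid):
    """Yield (fd, path, pos, size) for each regular file opened by pid (Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    for fdnum in sorted(os.listdir(fd_dir), key=int):
        fd_path = os.path.join(fd_dir, fdnum)
        try:
            # Skip sockets, pipes and other non-regular files.
            if not os.path.isfile(fd_path):
                continue
            name = os.readlink(fd_path)
            size = os.stat(fd_path).st_size
            with open("/proc/%d/fdinfo/%s" % (pid, fdnum)) as f:
                pos = parse_fdinfo_pos(f.read())
        except OSError:
            # The descriptor may have been closed in the meantime.
            continue
        yield int(fdnum), name, pos, size
```

Calling `list(file_positions(os.getpid()))` on a Linux machine lists the current interpreter's own open files; inspecting another user's process generally needs matching privileges.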
@gumblex
gumblex / ansi_mandelbrot.py
Created May 5, 2016 05:57
Mandelbrot ASCII art from PyPy (independent version)
import os
import sys
import colorsys
"""
Black 0;30 Dark Gray 1;30
Blue 0;34 Light Blue 1;34
Green 0;32 Light Green 1;32
Cyan 0;36 Light Cyan 1;36
Red 0;31 Light Red 1;31
@gumblex
gumblex / zip64.py
Created November 14, 2015 11:16
Simple Python command line utility to create Zip64 files.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Simple command line utility to create Zip64 files.
For Python 3.3+
Most of the code is from the standard library `zipfile` and `shutil`.
"""
@gumblex
gumblex / 69-language-selector-zh-cn.conf
Created February 16, 2014 05:14
fontconfig configuration for Chinese UI fonts on Ubuntu: WenQuanYi Micro Hei
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<match target="pattern">
<test name="lang">
<string>zh-cn</string>
</test>
<test qual="any" name="family">
<string>serif</string>