Haruki Kirigaya zxteloiv

@zxteloiv
zxteloiv / merge_sort_parallel.py
Last active July 24, 2023 02:40
merge sort in parallel with the multiprocessing pool. But it seems slower than the single-core version.
from typing import Callable, Optional, Any, Union
import math
import multiprocessing
import random
import sys
import time
from functools import partial
KeyFuncType = Callable[[Any], Union[float, int]]
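The preview above is cut off before the sorting logic, so the following is only a minimal sketch of the general idea in the description (sort chunks in a multiprocessing pool, then merge), not the gist's actual code. The names `split_chunks`, `merge_chunks`, `parallel_merge_sort` and the `workers` parameter are hypothetical.

```python
import heapq
import multiprocessing
from typing import Any, List


def split_chunks(data: List[Any], n: int) -> List[List[Any]]:
    """Split data into at most n contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // n))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def _sort_chunk(chunk: List[Any]) -> List[Any]:
    # Runs in a worker process; kept at module level so it can be pickled.
    return sorted(chunk)


def merge_chunks(sorted_chunks: List[List[Any]]) -> List[Any]:
    """k-way merge of already-sorted chunks."""
    return list(heapq.merge(*sorted_chunks))


def parallel_merge_sort(data: List[Any], workers: int = 2) -> List[Any]:
    """Sort chunks in worker processes, then merge them in the parent."""
    if len(data) < 2 or workers < 2:
        return sorted(data)  # nothing to parallelize
    chunks = split_chunks(data, workers)
    with multiprocessing.Pool(processes=workers) as pool:
        sorted_chunks = pool.map(_sort_chunk, chunks)
    return merge_chunks(sorted_chunks)


if __name__ == "__main__":
    print(parallel_merge_sort([5, 2, 9, 1, 7], workers=2))
```

Every chunk is pickled on its way to a worker and again on its way back, which is one plausible reason a pool-based version can end up slower than a single-core sort on in-memory lists.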
@zxteloiv
zxteloiv / nltk_utils.sh
Last active March 25, 2022 06:12
download the nltk data in bulk instead of using the nltk.downloader module
#!/bin/bash
# refer to https://www.nltk.org/data.html for more information;
# this gist belongs to the "manual installation" section.
function nltk_fetch {
    curl https://www.nltk.org/nltk_data/ | ggrep -Po 'url=[^ ]*' | awk -F '"' '/githubuser/ {print $2}' | awk -F 'packages/' '{print $2" "$0}' > copora-name-url.txt;
}
function nltk_download_item {
    mkdir -p nltk_data/$(dirname $1);
@zxteloiv
zxteloiv / nested_number_list_to_tensors.py
Created January 12, 2021 10:28
Convert nested number lists into PyTorch tensors, which is useful for batching tensors during text data processing.
from collections import defaultdict
import torch
def _nested_number_list_to_tensors(nested: list, padding=0, example=None):
    """Turn a list of list of list of list ... of integers to a tensor with the given padding"""
    ndim_max = defaultdict(lambda: 0)

    def _count_nested_max(nested, depth):
        if not isinstance(nested, list):
            return
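The preview stops before the padding step. As a rough illustration of the approach the visible lines suggest (record the longest list at each depth, then pad everything to that shape), here is a torch-free sketch. `pad_nested` is a hypothetical name, and the sketch assumes all numbers sit at the same nesting depth; it is not the gist's implementation.

```python
from collections import defaultdict


def pad_nested(nested, padding=0):
    """Pad a ragged nested list of numbers into a rectangular nested list."""
    ndim_max = defaultdict(int)  # depth -> longest list seen at that depth

    def count(node, depth):
        if not isinstance(node, list):
            return
        ndim_max[depth] = max(ndim_max[depth], len(node))
        for child in node:
            count(child, depth + 1)

    def pad(node, depth):
        if depth == len(ndim_max):
            # Leaf level: keep the number, or fill in padding where nothing existed.
            return padding if node is None else node
        items = node if isinstance(node, list) else []
        items = items + [None] * (ndim_max[depth] - len(items))
        return [pad(child, depth + 1) for child in items]

    count(nested, 0)
    return pad(nested, 0)


if __name__ == "__main__":
    print(pad_nested([[1, 2, 3], [4]]))
```

Because the result is rectangular, it can be handed to `torch.tensor(...)` directly when PyTorch is available.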
@zxteloiv
zxteloiv / crossfold.sh
Created June 2, 2016 09:15
do 10-fold cross-validation using libsvm and the iris dataset.
#!/bin/bash
libsvmpath="./libsvm-3.21"
# 1. build the libsvm
cd "$libsvmpath"
make
if [ $? -ne 0 ]; then
    echo "failed to build, exit";
@zxteloiv
zxteloiv / Caps2Ctrl.map
Created April 12, 2016 02:09
swap caps lock and ctrl
keymaps 0-2,4-5,8,12
keycode 58 = Control #This makes Caps act as Ctrl
keycode 29 = Caps_Lock #This makes Ctrl act as Caps
# alt_is_meta #This fixes the Alt key
@zxteloiv
zxteloiv / debugheader.h
Created March 3, 2016 01:31
C++ memory leak detection header. Got from Kingsoft many years ago.
/* -------------------------------------------------------------------------
// File name : debugheader.h
// Author :
// Created : 2008-3-21 12:06:48 pm
// Description :
//
// $Id: $
// -----------------------------------------------------------------------*/
#ifndef __DEBUGHEADER_H__
#define __DEBUGHEADER_H__
@zxteloiv
zxteloiv / GFW_ADBlock.conf
Created February 23, 2016 03:03 — forked from bao3/GFW_ADBlock.conf
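Surge for iOS cannot handle too many rule entries without exhausting the phone's memory, so I merged the GFW and ad-filtering lists, hand-picked the rules, and compressed them to about 10,000 entries. You can still add roughly 900 rules of your own, which should in theory be enough for long-term use. The configuration defaults to direct connection, which suits ordinary users in mainland China. Entries that I adjusted myself and that may cause false positives are placed at the top of the configuration file so you can edit them easily; in my own daily use from 2016/01 to 2016/02 I did not run into problems. Don't worry about false positives: you can open the app to debug and add rules yourself.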
[General]
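# All my comments start with #, so if you use vim, :g/^#/d removes every comment at once
# The skip list at the top serves two purposes: 1. skip private IP ranges to improve LAN performance; 2. skip some Apple services, e.g. public Wi-Fi hotspots first probe captive.apple.com. This fixes many LAN TCP problems, such as the Kodi remote app being unable to control the player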
skip-proxy = 10.0.0.0/8,169.254/16,172.16.0.0/12,192.168.0.0/16,224.0.0.0/4, localhost, *.local,api.smoot.apple.com,configuration.apple.com,xp.apple.com,smp-device-content.apple.com,guzzoni.apple.com,captive.apple.com,*.ess.apple.com,*.push.apple.com,*.push-apple.com.akadns.net
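# The following section bypasses Surge entirely; most importantly it lets UDP packets through, which fixes many LAN problems with multicast/UDP applications such as DLNA, NFS, or btsync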
bypass-tun = 10.0.0.0/8, 169.254.0.0/16, 172.16.0.0/12, 192.168.0.0/16, 224.0.0.0/4, 0.0.0.0/8, 1.0.0.0/9, 1.160.0.0/11, 1.192.0.0/11, 10.0.0.0/8, 14.0.0.0/11, 14.96.0.0/11, 14.128.0.0/11, 14.192.0.0/11, 27.0.0.0/10, 27.96.0.0/11, 27.128.0.0/9, 36.0.0.0/10, 36.96.0.0/11, 36.128.0.0/9, 39.0.0.0/11, 39.64.0.0/10, 39.128.0.0/10, 42.0.0.0/8, 43.224.0.0/11, 45.64.0.0/10, 47.64.0.0/10, 49.0.0.0/9, 49.128.0.0/11, 49.192.0.0/10, 54.192.0.0/11, 58.0.0.0/9, 58.128.0.0/11, 58.192.0.0/10, 59.32.0.0/11, 5
@zxteloiv
zxteloiv / unzip_chn.py
Created December 21, 2015 10:25
unzip in another encoding
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
import os
import sys
import zipfile
if len(sys.argv) == 1:
    sys.exit(0)
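The rest of this Python 2 script is not shown. For illustration only, a Python 3 sketch of the usual technique for archives whose file names were stored in a legacy encoding: `zipfile` decodes names as cp437 unless the UTF-8 flag (bit 0x800) is set, so the original bytes can be recovered and decoded again. The helper names `fix_name` and `extract_all`, and the default `encoding="gbk"`, are assumptions rather than anything taken from the gist.

```python
import os
import shutil
import zipfile


def fix_name(info, encoding="gbk"):
    """Return the entry name decoded with the given legacy encoding."""
    if info.flag_bits & 0x800:
        return info.filename  # name was stored as UTF-8, already correct
    # zipfile decoded the raw bytes as cp437; undo that and decode again.
    return info.filename.encode("cp437").decode(encoding)


def extract_all(zip_path, dest, encoding="gbk"):
    """Extract every entry under dest using the re-decoded file names."""
    dest = os.path.abspath(dest)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = fix_name(info, encoding)
            target = os.path.normpath(os.path.join(dest, name))
            # Refuse entries that would escape the destination directory.
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError("unsafe path in archive: " + name)
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
```

On Python 3.11 and later, `zipfile.ZipFile` also accepts a `metadata_encoding` argument when reading, which may make the manual re-decoding unnecessary.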
@zxteloiv
zxteloiv / HugeInt.cpp
Created December 3, 2015 09:53
An integer class with high precision support using strings
// test2.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string> // represent HugeInt as a string for convenient output
using namespace std;
@zxteloiv
zxteloiv / extractpptx.py
Created November 27, 2015 11:55
extract text from pptx files using pptx library
#!/usr/bin/env python2
# coding: utf-8
from pptx import Presentation
import chardet
import sys
def main(filename):
    prs = Presentation(filename)