Making use of the Stanford NLP Socket Server

The Stanford NER tagger can easily be started as a server listening on a socket, as documented in its README file.

java -mx1000m -cp $HOME/resources/stanford/tagger/stanford-ner.jar edu.stanford.nlp.ie.NERServer -loadClassifier $HOME/resources/stanford/tagger/classifiers/english.all.3class.distsim.crf.ser.gz -port 1234
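
Once the server is listening, a client only needs a plain TCP socket. The following is a minimal sketch, not the official client. It assumes the server is reachable on `localhost:1234` (the port from the command above), that it handles one line of text per connection and then closes it, and that it replies in the slash-tagged `token/TAG` format. The function names `query_ner_server` and `parse_slash_tags` are made up for this illustration.

```python
import socket


def query_ner_server(text, host="localhost", port=1234, timeout=10.0):
    """Send one line of text to the NER server and return its raw reply."""
    # Assumption: the server reads a single line per connection,
    # so collapse any internal newlines before sending.
    line = " ".join(text.split()) + "\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("utf-8"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # peer closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8").strip()


def parse_slash_tags(reply):
    """Split a 'token/TAG token/TAG ...' reply into (token, tag) tuples."""
    pairs = []
    for item in reply.split():
        # Split on the last slash so tokens such as "1/2" stay intact.
        token, sep, tag = item.rpartition("/")
        if not sep:  # no slash at all: keep the token, leave the tag empty
            token, tag = item, ""
        pairs.append((token, tag))
    return pairs
```

With the server running, `parse_slash_tags(query_ner_server("Barack Obama visited Hong Kong"))` would yield a list of `(token, tag)` pairs. If the server is configured with a different output format, the parsing step has to change accordingly.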

The POS tagger also has a built-in MaxentTaggerServer; however, we cannot use it directly.

@cllu
cllu / aync-http-requests.py
Created December 11, 2014 10:38
Python 3 asynchronous HTTP request with aiohttp and asyncio
import asyncio
import aiohttp
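def _crawl_url(url):
    try:
        # Allow at most 10 seconds for the request to return a response
        resp = yield from asyncio.wait_for(aiohttp.request('GET', url, allow_redirects=True), 10)
        """:type resp: aiohttp.client.ClientResponse"""
        # Allow another 10 seconds to read and decode the body
        resp.text = yield from asyncio.wait_for(resp.text(), 10)
        return resp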
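
The preview above is written in the generator-based `yield from` style of Python 3.4. As a rough sketch of the same timeout pattern in `async`/`await` syntax (Python 3.7+), the example below wraps an arbitrary coroutine in `asyncio.wait_for`. To stay self-contained it does not call aiohttp at all: `fake_fetch` is a hypothetical stand-in for a real HTTP call, and `fetch_with_timeout` is a name chosen for this illustration.

```python
import asyncio


async def fetch_with_timeout(fetch, url, timeout=10.0):
    """Await fetch(url); return None if it takes longer than `timeout` seconds."""
    try:
        return await asyncio.wait_for(fetch(url), timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising, so nothing is left running.
        return None


async def fake_fetch(url):
    """Hypothetical stand-in for an HTTP client call; not part of aiohttp."""
    await asyncio.sleep(0.05)  # simulate network latency
    return "body of " + url
```

It can be driven with `asyncio.run(fetch_with_timeout(fake_fetch, "http://example.com"))`. Returning `None` on timeout is one design choice among several; re-raising or logging the error may suit a crawler better.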
cllu / cllu_pinyin.dict.yaml
Last active January 11, 2016 18:47
Rime Custom Schema
# Rime dictionary
# encoding: utf-8
---
name: cllu_pinyin
version: "2014.12.24"
sort: by_weight
use_preset_vocabulary: true
# import dict from luna_pinyin.dict.yaml
import_tables:
cllu / wechat2txt.py
Last active August 29, 2015 14:20 — forked from scturtle/wechat2txt.py
import os
import sys
import re
import hashlib
import csv
import time
import locale
import getopt
cllu / README.md
Last active November 25, 2015 07:18
LeetCode OJ Tampermonkey script

LeetCode OJ modifier

  • hide the LeetCode Premium Subscription and Books links in the top navigation bar, since I have already subscribed
  • hide the chat link and the footer
  • hide the FAQ block in the right column of the discussion page
  • display the question id on the problem page
  • display the number of solved/total problems on the /company/ page
cllu / README.md
Last active September 25, 2016 10:21
GitHub Wiki TamperMonkey script

  • add a word count
  • remove unused stuff
  • add a table of contents to the right column
  • add keyboard shortcuts: double-click the content area to edit

The script matches only wikis on my own GitHub repos; change it according to your preferences:

cllu / auto-connect-cuhk.py
Last active September 30, 2016 09:21
CUHK Network auto connection Python script
#!/usr/local/bin/python3
import requests
USER = "USER"
PASSWORD = "PASSWORD"
def login():
    """Post the login info to the CUHK authentication server"""
    url = "https://securelogin.net.cuhk.edu.hk/cgi-bin/login"
cllu / hn_seach.js
Last active December 9, 2015 08:43 — forked from meiamsome/hn_search.js
HackerNews Who is Hiring TamperMonkey Script
// ==UserScript==
// @name HackerNews WhosHiring
// @namespace http://tampermonkey.net/
// @version 0.1
// @description try to take over the world!
// @author You
// @match https://news.ycombinator.com/item?id=*
// @grant none
// ==/UserScript==
/* jshint -W097 */
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>Draft • Decorators</title>
<link rel="stylesheet" href="../../dist/Draft.css" />
</head>
<body>
<div id="target"></div>
<script src="../../node_modules/react/dist/react.js"></script>
cllu / CustomParser.php
Last active March 31, 2016 06:21
MediaWiki custom parser
<?php
# Confirm MediaWiki environment
if (!defined('MEDIAWIKI')) die();
# Credits
$wgExtensionCredits['other'][] = array(
    'name' => 'CustomParser',
    'author' => 'Chunliang Lyu',
    'url' => 'https://www.mediawiki.org/wiki/Extension:CustomParser',