Takumi Yoshida yoshi0309
@yoshi0309
yoshi0309 / App.java
Created May 19, 2014 01:26
Apache Tika 1.5 - AutoDetectParser example
package yoshida.tika_sample;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
yoshi0309 / 0_reuse_code.js
Created November 14, 2016 01:10
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
yoshi0309 / ExceptionHandler.java
Created July 17, 2018 06:38
CuratorFramework experiment: handling a RuntimeException thrown in a Listener.
import java.util.concurrent.TimeUnit;
public class ExceptionHandler
implements Thread.UncaughtExceptionHandler {
private final long WAIT_TIME = 60L;
@Override public void uncaughtException(Thread thread, Throwable e) {
// -------------------------------------------------
// A RuntimeException thrown in TestListener should be handled here, but it is not.
yoshi0309 / deleteByQuery.py
Last active October 26, 2018 13:27
Delete the documents returned by a query from Amazon CloudSearch.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import urllib
import urllib2
import json
# you need to set your domain endpoints.
SEARCH_ENDPOINT = "XXXXX.us-east-1.cloudsearch.amazonaws.com"
DOCUMENT_ENDPOINT = "XXXXX.us-east-1.cloudsearch.amazonaws.com"
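The rest of the script is not shown in this preview. As a rough, offline sketch of the idea in the description (turn the hits of a search result into delete operations), the following assumes the CloudSearch 2013-01-01 API, where a search response carries hits under `hits.hit` and a delete operation needs only `type` and `id`. The function name `build_delete_batch` and the sample document IDs are hypothetical and are not taken from the gist; no network call is made.

```python
import json


def build_delete_batch(search_response):
    """Turn a CloudSearch-style search response into a list of delete operations."""
    hits = search_response.get("hits", {}).get("hit", [])
    # One delete operation per matching document ID.
    return [{"type": "delete", "id": hit["id"]} for hit in hits]


# Hypothetical response, shaped like the JSON a search endpoint returns.
sample_response = {
    "hits": {"found": 2, "start": 0, "hit": [{"id": "doc-1"}, {"id": "doc-2"}]}
}

batch = build_delete_batch(sample_response)
payload = json.dumps(batch)  # JSON body that would be posted to the document endpoint
print(payload)
```

In a real run the payload would be sent to the document endpoint's batch upload path with a JSON content type, and the search would be repeated until no hits remain, since a single response returns only one page of results.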
yoshi0309 / td-agent.conf
Created October 27, 2014 01:59
td-agent.conf for parsing Solr query logs.
<source>
type tail
path /opt/solr/solr-4.9.0/example/logs/solr.log
pos_file /var/log/td-agent/solr.log.pos
tag raw.solr.log
format /^(?<loglevel>[^ ]*) (?<hyp>[^ ]*) (?<time>[^ ]* [^ ]*) (?<class>[^ ]*) \[(?<core>[^ ]*)\] webapp=\/(?<webapp>[^ ]*) path=\/(?<path>[^ ]*) params={(?<params>[^ ]*)} hits=(?<hits>[^ ]*) status=(?<status>[^ ]*) QTime=(?<Qtime>[^ ]*) /
time_format %Y-%m-%d %H:%M:%S.%L;
</source>
<match raw.solr.log>
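To check what the `format` pattern above captures, it can be tried outside td-agent. The sketch below ports the pattern to Python (`(?<name>...)` becomes `(?P<name>...)`, and the braces are escaped) and applies it to a made-up log line. The sample line is an assumption built to fit the pattern, not a line from a real Solr log, and `%f` stands in for Fluentd's `%L` when parsing the fractional seconds.

```python
import re
from datetime import datetime

# Python port of the Fluentd `format` regex; group names are kept as in the config.
SOLR_LOG = re.compile(
    r"^(?P<loglevel>[^ ]*) (?P<hyp>[^ ]*) (?P<time>[^ ]* [^ ]*) (?P<class>[^ ]*) "
    r"\[(?P<core>[^ ]*)\] webapp=/(?P<webapp>[^ ]*) path=/(?P<path>[^ ]*) "
    r"params=\{(?P<params>[^ ]*)\} hits=(?P<hits>[^ ]*) status=(?P<status>[^ ]*) "
    r"QTime=(?P<Qtime>[^ ]*) "
)

# Hypothetical log line; note the trailing space that the pattern requires.
sample_line = (
    "INFO - 2014-10-27 01:59:00.123; org.apache.solr.core.SolrCore; "
    "[collection1] webapp=/solr path=/select params={q=*:*&wt=json} "
    "hits=42 status=0 QTime=3 "
)

match = SOLR_LOG.match(sample_line)
fields = match.groupdict() if match else {}

# The captured time keeps its trailing semicolon, which is why the
# time_format in the config ends with ";" as well.
parsed_time = datetime.strptime(fields["time"], "%Y-%m-%d %H:%M:%S.%f;")

print(fields["params"], fields["hits"], fields["Qtime"], parsed_time)
```

Because every field is matched with `[^ ]*`, a query whose `params` value contains a literal space would not match this pattern.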
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import itertools
import csv
import datetime
import time
from math import sqrt
from operator import add