@naoa
naoa / category_jpfw5.php
Last active December 30, 2015 12:19
Extracts Japanese Wikipedia categories. Usage: % php category_jpfw5.php <database> <output_file>
<?php
$db = "127.0.0.1";
$db_name = $argv[1];
$table = "category";
$username = "mysql";
$password = "";
$output_file = $argv[2];
naoa / index_dump.rb
Last active December 29, 2015 08:49
Groonga index dump
require "groonga"
database_path = ARGV[0]
ft_index = ARGV[1]
output_file = ARGV[2]
Groonga::Database.open(database_path)
index_column_name = "#{ft_index}.index"
index = Groonga[index_column_name]
index.table.open_cursor do |table_cursor|
  index.open_cursor(table_cursor) do |cursor|
    current_term_id = nil
naoa / wiki_fts_1.php
Last active December 29, 2015 05:09
Runs a full-text search for each line of text in <text_file>. Usage: % php wiki_fts_1.php <database> <text_file> <output_file>
<?php
$db = "127.0.0.1";
$db_name = $argv[1];
$table = "text";
$index = "title,text";
$username = "mysql";
$password = "";
$category_file = $argv[2];
naoa / wiki_page_import.php
Last active December 29, 2015 04:49
Imports Wikipedia page XML. Usage: % php wiki_page_import.php <database> <wiki_articles_xml_dir> <output_file>
<?php
$db = "localhost";
$db_name = $argv[1];
$table = "text";
$username = "mysql";
$password = "";
$article = $argv[2];
$output_file = $argv[3];
#include <stdio.h>

int main(void) {
    printf("hello world!\n");
    return 0;
}
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <groonga.h>
int
main(int argc, char **argv)
{
  grn_ctx ctx;
  grn_obj *db, *table, *column, *key_type, *value_type;
table_create Tags TABLE_PAT_KEY ShortText
[[0,0.0,0.0],true]
table_create Memos TABLE_HASH_KEY ShortText
[[0,0.0,0.0],true]
column_create Memos tags COLUMN_VECTOR Tags
[[0,0.0,0.0],true]
load --table Memos
[
{"_key": "Rroonga", "tags": ["Groonga", "Ruby"]},
{"_key": "Groonga", "tags": ["Groonga"]},
naoa / size.sh
Last active August 29, 2015 14:14
% du -hc /share/all/post/base/36915/ # PGroonga: size actually in use
518M /share/all/post/base/36915/
518M total
% ls -sl /share/all/post/base/36915/ |awk '{i+=$1} END{print i/1024}' # size actually in use, accounting for sparse files
517.461
% ls -sl /share/all/post/base/36915/pgrn* |awk '{i+=$1} END{print i/1024}' # size actually used by Groonga alone
425.984
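The `ls -s | awk` pipelines above sum allocated blocks rather than file lengths, which is how they account for sparse files. A minimal sketch of that comparison follows. The function names `allocated_mib` and `apparent_mib` and the temporary demo directory are hypothetical and not part of the original gist, and `-k` is added here as an assumption so that the block unit is always 1 KiB.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not from the original gist.

# Sum the allocated-size column of `ls -s` for the entries in directory $1
# and print the total in MiB. -k forces 1 KiB units; the "total" line is
# skipped because its first field is not a number.
allocated_mib() {
    ls -sk "$1" | awk '$1 ~ /^[0-9]+$/ { kib += $1 } END { printf "%.3f\n", kib / 1024 }'
}

# Sum the byte-length column of `ls -l` (field 5) and print the total in MiB.
# The "total" line has fewer than five fields and is skipped.
apparent_mib() {
    ls -l "$1" | awk 'NF >= 5 { bytes += $5 } END { printf "%.3f\n", bytes / 1048576 }'
}

# Demo: a 10 MiB file created by seeking past the end, with no data written.
demo_dir=$(mktemp -d)
dd if=/dev/null of="$demo_dir/sparse.bin" bs=1024 seek=10240 count=0 2>/dev/null
echo "apparent:  $(apparent_mib "$demo_dir") MiB"
echo "allocated: $(allocated_mib "$demo_dir") MiB"
rm -rf "$demo_dir"
```

On a filesystem that supports sparse files, the allocated figure should come out well below the apparent one; on one that does not, the two will be close.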
<?php
$db = "localhost";
$db_name = $argv[1];
$table = "text";
$article = $argv[2];
if ($handle = opendir($article)) {
  while (false !== ($file = readdir($handle))) {
    echo "-------$file------\n";
naoa / groonga-token-counter.c
Last active August 29, 2015 14:08
gcc src/index_sample.c -o index_sample -Wall -O2 -lgroonga -I/usr/include/groonga
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <groonga.h>
#include <groonga/nfkc.h>
/*
Japanese Wikipedia, 300,000 records, 3.8G
real 1m12.745s
user 0m58.432s