Takuya Kitazawa takuti
@xerial
xerial / td-spark-usage.md
Last active September 20, 2022 11:19
td-spark usage notes

td-spark usage notes

What You Can Do With td-spark

  • Reading and writing TD tables through Spark DataFrames.
  • Running Spark SQL queries against DataFrames.
  • Submitting Presto SQL queries to TD and reading the query results as a DataFrame.
  • If you use PySpark, you can use Spark DataFrames and pandas DataFrames interchangeably.
@zyfnhct
zyfnhct / draft.md
Created September 27, 2018 06:28 — forked from kumarbhrgv/draft.md
Diversity in Recommendation Systems

Improving the diversity of personalized recommendation systems

Recent research papers:

  • Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques:

    107 Citations : IEEE Transactions on Knowledge and Data Engineering
    we introduce and explore a number of item ranking techniques that can generate substantially more diverse recommendations across all users while maintaining comparable levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed techniques using several real-world rating data sets and different rating prediction algorithms

  • Recommendation Diversification Using Explanations: (Data Engineering, 2009. ICDE '09. IEEE 25th International Conference)

Traditionally, the problem is addressed through attribute-based diversification grouping items in the result set that share many common attributes (e.g., genre for movies) and selecting only a limited number of items from each group. It is, however,
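
The attribute-based diversification described above can be sketched roughly as follows. This is only a minimal illustration, not the method of either paper: the function name `diversify_by_attribute`, the `per_group_limit` parameter, and the sample movies are all assumptions made for the example.

```python
from collections import defaultdict


def diversify_by_attribute(ranked_items, attribute_of, per_group_limit, k):
    """Pick up to k items from a ranked list, capping how many share one attribute.

    ranked_items    -- items ordered from most to least relevant
    attribute_of    -- function mapping an item to its grouping attribute (e.g. genre)
    per_group_limit -- hypothetical cap on items taken from any single group
    """
    taken = defaultdict(int)  # how many items each group has contributed so far
    result = []
    for item in ranked_items:
        group = attribute_of(item)
        if taken[group] < per_group_limit:
            result.append(item)
            taken[group] += 1
            if len(result) == k:
                break
    return result


# Illustrative data only: (title, genre) pairs already sorted by predicted rating.
movies = [("A", "action"), ("B", "action"), ("C", "action"),
          ("D", "drama"), ("E", "comedy")]
top4 = diversify_by_attribute(movies, lambda m: m[1], per_group_limit=2, k=4)
```

With a cap of two per genre, the third action title is skipped in favor of lower-ranked items from other genres, which is the accuracy-for-diversity trade the passage describes.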

@kumarbhrgv
kumarbhrgv / draft.md
Last active February 1, 2022 14:47
Diversity in Recommendation Systems

@myui
myui / auc.py
Last active June 8, 2018 08:08
def auc(num_positives, num_negatives, predicted):
    l_sorted = sorted(range(len(predicted)), key=lambda i: predicted[i],
                      reverse=True)
    fp_cur = 0.0
    tp_cur = 0.0
    fp_prev = 0.0
    tp_prev = 0.0
    fp_sum = 0.0
    auc_tmp = 0.0
    last_score = float("nan")
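
The preview above stops after the variable setup. As a rough, self-contained illustration of the kind of computation those variables suggest (trapezoidal integration of the ROC curve over descending scores), here is a sketch. It is not the gist's actual code: the name `auc_from_scores` and the explicit `labels` argument are assumptions made for the example.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve, using the trapezoid rule and handling tied scores.

    labels -- 1 for positive, 0 for negative (assumed encoding)
    scores -- predicted scores, higher meaning "more likely positive"
    """
    num_pos = sum(1 for y in labels if y == 1)
    num_neg = len(labels) - num_pos
    if num_pos == 0 or num_neg == 0:
        raise ValueError("AUC is undefined without both positive and negative labels")

    # Visit examples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0.0
    tp_prev = fp_prev = 0.0
    area = 0.0
    last_score = float("nan")  # NaN never equals a score, so the first item opens a group
    for i in order:
        if scores[i] != last_score:
            # Close the trapezoid for the previous group of tied scores.
            area += (fp - fp_prev) * (tp + tp_prev) / 2.0
            tp_prev, fp_prev = tp, fp
            last_score = scores[i]
        if labels[i] == 1:
            tp += 1.0
        else:
            fp += 1.0
    # Close the final trapezoid, then normalize the unit counts to [0, 1].
    area += (fp - fp_prev) * (tp + tp_prev) / 2.0
    return area / (num_pos * num_neg)
```

Grouping by score before adding a trapezoid means tied predictions contribute half credit, so a classifier that scores everything equally gets 0.5.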
@faulker
faulker / update_json.sql
Created October 27, 2016 18:40
Example of how to update a Postgresql JSON field using jsonb_set
update tests_summary_data set data = (jsonb_set(to_jsonb(data), '{misc,gap,pa}', '-1', false))::json where data->'misc'->'gap'->>'pa' = '0';
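
To make the effect of that statement easier to see, the sketch below mimics it on plain Python dictionaries. It is an approximation for illustration only: `jsonb_set_like` is a hypothetical helper, it handles nested objects but not arrays, and the sample row is invented. The final `false` argument in the SQL corresponds to `create_missing=False` here, meaning the value is replaced only if the full path already exists.

```python
import copy


def jsonb_set_like(target, path, new_value, create_missing=True):
    """Return a copy of `target` with the value at `path` replaced (objects only)."""
    result = copy.deepcopy(target)
    node = result
    for key in path[:-1]:
        if not isinstance(node, dict) or key not in node:
            return result  # an intermediate key is missing: leave the data unchanged
        node = node[key]
    last = path[-1]
    if isinstance(node, dict) and (last in node or create_missing):
        node[last] = new_value
    return result


# Invented sample row; the WHERE clause in the SQL selects rows whose
# misc.gap.pa renders as the text '0'.
row = {"misc": {"gap": {"pa": 0, "pb": 3}}}
if str(row["misc"]["gap"]["pa"]) == "0":
    updated = jsonb_set_like(row, ["misc", "gap", "pa"], -1, create_missing=False)
```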
@veselosky
veselosky / s3gzip.py
Last active May 8, 2023 21:42
How to store and retrieve gzip-compressed objects in AWS S3
# vim: set fileencoding=utf-8 :
#
# How to store and retrieve gzip-compressed objects in AWS S3
###########################################################################
#
# Copyright 2015 Vince Veselosky and contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
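
The preview above ends inside the license header, before any of the gist's code. As a hedged sketch of the general idea (compress in memory, upload with a `Content-Encoding` of `gzip`, then download and decompress), something like the following could work. It is not the gist's code: the helper names are assumptions, and `s3_client` is assumed to be a boto3 S3 client such as `boto3.client("s3")`.

```python
import gzip
import io


def gzip_bytes(data: bytes) -> bytes:
    """Compress bytes in memory and return the gzip stream."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(data)
    return buf.getvalue()


def gunzip_bytes(blob: bytes) -> bytes:
    """Decompress a gzip stream held in memory."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as gz:
        return gz.read()


def store_gzipped(s3_client, bucket, key, text):
    """Upload text as a gzip-compressed object (s3_client: assumed boto3 S3 client)."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=gzip_bytes(text.encode("utf-8")),
        ContentType="text/plain; charset=utf-8",
        ContentEncoding="gzip",
    )


def retrieve_gzipped(s3_client, bucket, key):
    """Download a gzip-compressed object and return its decoded text."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return gunzip_bytes(response["Body"].read()).decode("utf-8")
```

Passing the client in as an argument keeps the compression helpers usable without AWS credentials and makes the upload path easy to exercise with a stub.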
@ganmacs
ganmacs / gist:a1c81c37fa023516d23e
Last active August 29, 2015 14:23
cloud create lxc an git repository
#!/bin/bash
# const
USAGE="[USAGE]
$ ./setup <username> <reponame> <ip>
[Requirement]
exec by root
[Example]

First of all, make sure that your Treasure Data cluster is HDP2, not CDH4. Matrix Factorization is supported only on the up-to-date HDP2 cluster. HDP2 is allocated to users who signed up for Treasure Data after February 2015; CDH4 is allocated to everyone else.

NOTE: If you get an error, please ask our customer support to move you to HDP2.

Data preparation

Download ml-20m.zip and unzip it.
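
If you prefer to script the unzip step, a minimal sketch could look like this. It assumes the archive has already been downloaded to the local disk; the helper name `extract_archive` and the paths in the usage note are illustrative, not part of the original instructions.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a zip archive into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())
```

For example, `extract_archive("ml-20m.zip", ".")` would unpack the archive into the current directory and list the files it contained.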

@martindemello
martindemello / chain-of-responsibility.rb
Created February 20, 2015 21:30
chain of responsibility example in ruby
class PurchaseApprover
  # Implements the chain of responsibility pattern. Does not know anything
  # about the approval process, merely whether the current handler can approve
  # the request, or must pass it to a successor.
  attr_reader :successor

  def initialize(successor)
    @successor = successor
  end
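
The Ruby preview is cut off after the constructor. To show how the pattern described in its comment plays out end to end, here is a small Python sketch. It is an illustration under assumptions, not a translation of the gist: the `Approver` class, the spending limits, and the job titles are all invented for the example.

```python
class Approver:
    """One link in the chain: approves what it can, otherwise defers to its successor."""

    def __init__(self, title, limit, successor=None):
        self.title = title          # hypothetical role name
        self.limit = limit          # hypothetical maximum amount this role may approve
        self.successor = successor  # next handler in the chain, or None at the end

    def process(self, amount):
        """Return the title of the first handler able to approve the amount."""
        if amount <= self.limit:
            return self.title
        if self.successor is None:
            raise ValueError(f"no approver can authorize {amount}")
        return self.successor.process(amount)


# Build the chain from the least to the most senior approver.
chain = Approver("manager", 1_000,
                 Approver("director", 10_000,
                          Approver("president", 100_000)))
```

Calling `chain.process(5_000)` walks past the manager and stops at the director; the caller never needs to know which handler ends up approving the request.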
@sonots
sonots / fluentd_hacking_guide.md
Last active August 30, 2021 05:57
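Fluentd Source Code: A Complete Walkthrough (for v0.10)

Fluentd Source Code: A Complete Walkthrough

English title: Fluentd Hacking Guide

Table of Contents

Because we have only 30 minutes, the struck-through parts are skipped this time.

  • Fluentd's startup sequence and plugin loading
  • Parsing Fluentd's configuration file
  • How data flows from an Input Plugin to an Output Plugin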