Takuya Kitazawa takuti

## td-spark-usage.md

      
xerial / td-spark-usage.md (last active September 20, 2022 11:19; 1 file, 0 forks, 0 comments, 4 stars)

td-spark usage notes


Official Doc

What You Can Do With td-spark


Reading and writing TD tables through Spark DataFrames.
Running Spark SQL queries against DataFrames.
Submitting Presto SQL queries to TD and reading the query results as a DataFrame.
If you use PySpark, you can use Spark DataFrames and pandas DataFrames interchangeably.


## draft.md

      
zyfnhct / draft.md (created September 27, 2018 06:28, forked from kumarbhrgv/draft.md; 1 file, 1 fork, 0 comments, 2 stars)

Diversity in Recommendation Systems
              
          
Improving diversity of personalized recommendation systems

Recent research papers:


Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques:

107 citations: IEEE Transactions on Knowledge and Data Engineering

We introduce and explore a number of item ranking techniques that can generate substantially more diverse recommendations across all users while maintaining comparable levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed techniques using several real-world rating data sets and different rating prediction algorithms.
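
To make the idea of re-ranking for diversity concrete, here is a minimal sketch of one possible heuristic: among items predicted above a rating threshold, prefer the less popular ones, and fall back to the standard rating order for the rest. The function names, the `popularity` input, and the `threshold` value are all assumptions for illustration and are not taken from the paper.

```python
def rerank_for_diversity(predictions, popularity, top_n=2, threshold=4.0):
    """Return top-N item ids, preferring less popular items among confident ones.

    predictions: dict item_id -> predicted rating for one user
    popularity:  dict item_id -> number of known ratings (hypothetical input)
    threshold:   items predicted at or above this are re-ranked by popularity
    """
    confident = [i for i, r in predictions.items() if r >= threshold]
    rest = [i for i, r in predictions.items() if r < threshold]
    # Among confident items: least popular first, ties broken by higher rating.
    confident.sort(key=lambda i: (popularity.get(i, 0), -predictions[i]))
    # Remaining items keep the standard order: highest predicted rating first.
    rest.sort(key=lambda i: -predictions[i])
    return (confident + rest)[:top_n]


def aggregate_diversity(recommendation_lists):
    """Count the distinct items recommended across all users."""
    return len({item for recs in recommendation_lists for item in recs})
```

Raising `threshold` keeps the list closer to the standard accuracy-oriented ranking; lowering it lets more long-tail items in, which is the accuracy/diversity trade-off the abstract refers to.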


Recommendation Diversification Using Explanations: (Data Engineering, 2009. ICDE '09. IEEE 25th International Conference)


Traditionally, the problem is addressed through attribute-based diversification, grouping items in the result set that share many common attributes (e.g., genre for movies) and selecting only a limited number of items from each group. It is, however,

  
## draft.md

      
kumarbhrgv / draft.md (last active February 1, 2022 14:47; 1 file, 2 forks, 0 comments, 10 stars)

Diversity in Recommendation Systems
              
          
Improving diversity of personalized recommendation systems

Recent research papers:


Improving Aggregate Recommendation Diversity Using Ranking-Based Techniques:

107 citations: IEEE Transactions on Knowledge and Data Engineering

We introduce and explore a number of item ranking techniques that can generate substantially more diverse recommendations across all users while maintaining comparable levels of recommendation accuracy. Comprehensive empirical evaluation consistently shows the diversity gains of the proposed techniques using several real-world rating data sets and different rating prediction algorithms.


Recommendation Diversification Using Explanations: (Data Engineering, 2009. ICDE '09. IEEE 25th International Conference)


Traditionally, the problem is addressed through attribute-based diversification, grouping items in the result set that share many common attributes (e.g., genre for movies) and selecting only a limited number of items from each group. It is, however,

  
## auc.py
def auc(num_positives, num_negatives, predicted):
    l_sorted = sorted(range(len(predicted)),key=lambda i: predicted[i],
                      reverse=True)
    fp_cur = 0.0
    tp_cur = 0.0
    fp_prev = 0.0
    tp_prev = 0.0
    fp_sum = 0.0
    auc_tmp = 0.0
    last_score = float("nan")
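
The preview above stops before the main loop. Its variable names (`tp_prev`, `fp_prev`, `last_score`) suggest a trapezoidal sweep over the ROC curve, so here is a self-contained sketch of that technique. It is not the gist's actual continuation: the `labels` argument and the function name are assumptions, since the preview does not show how positives are identified.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via the trapezoidal rule.

    labels: list of 0/1 ground-truth values (assumed input, not in the preview)
    scores: list of predicted scores, higher meaning "more likely positive"
    """
    num_pos = sum(1 for y in labels if y == 1)
    num_neg = len(labels) - num_pos
    if num_pos == 0 or num_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0.0
    tp_prev = fp_prev = 0.0
    area = 0.0
    last_score = float("nan")  # NaN never equals a score, so the first item opens a segment
    for i in order:
        # Close a trapezoid whenever the score changes, so ties share one ROC point.
        if scores[i] != last_score:
            area += (fp - fp_prev) * (tp + tp_prev) / 2.0
            fp_prev, tp_prev = fp, tp
            last_score = scores[i]
        if labels[i] == 1:
            tp += 1.0
        else:
            fp += 1.0
    # Close the final segment, then normalise by the number of (pos, neg) pairs.
    area += (fp - fp_prev) * (tp + tp_prev) / 2.0
    return area / (num_pos * num_neg)
```

Handling ties by closing the trapezoid only on a score change means two items with the same score contribute half a correctly ordered pair, which matches the usual rank-based definition of AUC.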

## update_json.sql
UPDATE tests_summary_data
SET data = (jsonb_set(to_jsonb(data), '{misc,gap,pa}', '-1', false))::json
WHERE data->'misc'->'gap'->>'pa' = '0';
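
For readers less familiar with `jsonb_set`, the effect of the statement on a single row can be sketched in plain Python, treating the `data` column as a dict. The helper names are hypothetical; the sketch only mirrors the path `misc -> gap -> pa`, the text comparison against `'0'`, and the replacement value `-1`.

```python
import copy


def _dig(doc, *keys):
    """Follow nested keys, returning None if any level is missing or not a dict."""
    for key in keys:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def set_pa_to_minus_one(data):
    """Return a copy of `data` with misc.gap.pa set to -1 when it currently reads as '0'."""
    pa = _dig(data, "misc", "gap", "pa")
    # `->>` returns text, so the number 0 and the string "0" both match '0'.
    if pa is None or str(pa) != "0":
        return data
    updated = copy.deepcopy(data)
    updated["misc"]["gap"]["pa"] = -1
    return updated
```

Rows where the path is missing or holds another value come back unchanged, which corresponds to the `WHERE` clause filtering them out.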

## s3gzip.py
# vim: set fileencoding=utf-8 :
#
# How to store and retrieve gzip-compressed objects in AWS S3
###########################################################################
#
#   Copyright 2015 Vince Veselosky and contributors
#
#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at

## gist:a1c81c37fa023516d23e
#!/bin/bash

# const
USAGE="[USAGE]
$ ./setup <username> <reponame> <ip>

[Requirement]
exec by root

[Example]

## movielens_20m.md

      
myui / movielens_20m.md (last active August 29, 2015 14:22; 1 file, 0 forks, 0 comments, 2 stars)
            
          
First of all, make sure that your Treasure Data cluster is HDP2, not CDH4.
Matrix Factorization is only supported on the up-to-date HDP2 cluster.
HDP2 is allocated to users who signed up for Treasure Data after February 2015; CDH4 is allocated to everyone else.
NOTE: if you get an error, please ask our customer support to switch you to HDP2.

Data preparation

Download ml-20m.zip and unzip it.

  
## chain-of-responsibility.rb
class PurchaseApprover
  # Implements the chain of responsibility pattern. Does not know anything
  # about the approval process, merely whether the current handler can approve
  # the request, or must pass it to a successor.
  attr_reader :successor

  def initialize successor
    @successor = successor
  end
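
The Ruby preview is cut off after the constructor. As a rough illustration of the pattern its comment describes, here is a minimal Python sketch in which each handler either approves a request or passes it to its successor. The class name, the `limit` attribute, and the role names are assumptions made for the example; the original class may decide approval differently.

```python
class Approver:
    """One link in the chain: approve if within limit, otherwise defer."""

    def __init__(self, name, limit, successor=None):
        self.name = name
        self.limit = limit          # hypothetical approval ceiling for this handler
        self.successor = successor  # next handler in the chain, or None at the end

    def approve(self, amount):
        """Return the name of the handler that approves `amount`, or None."""
        if amount <= self.limit:
            return self.name
        if self.successor is None:
            return None  # end of chain: nobody can approve
        return self.successor.approve(amount)


# Build the chain from the last handler back to the first.
director = Approver("director", 10_000)
manager = Approver("manager", 1_000, successor=director)
clerk = Approver("clerk", 100, successor=manager)
```

Calling `clerk.approve(500)` walks the chain until a handler with a sufficient limit is found; the caller never needs to know which handler that is.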

## fluentd_hacking_guide.md

      
sonots / fluentd_hacking_guide.md (last active August 30, 2021 05:57; 1 file, 7 forks, 0 comments, 160 stars)
            
              
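Complete Guide to the Fluentd Source Code (for v0.10)

Complete Guide to the Fluentd Source Code

English title: Fluentd Hacking Guide

Table of contents

Because there are only 30 minutes, the struck-through parts are skipped this time.

Fluentd startup sequence and plugin loading
Parsing the Fluentd configuration file
How data flows from an Input Plugin to an Output Plugin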