package com.example

import org.apache.spark.sql.functions.avg
import org.apache.spark.sql.SparkSession

object IsolationLevelExperiment {
  def main(args: Array[String]): Unit = {
    // Prepare SparkSession
    val spark = SparkSession
      .builder()
@soonraah
soonraah / query_execution_listener_example.scala
Created June 14, 2021 01:29
Sample code for QueryExecutionListener from Spark 3.0.0
// Register listener
spark
  .listenerManager
  .register(new QueryExecutionListener {
    override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
      // Read the observed metric; fall back to a Long sentinel so that num stays a Long
      val num = qe.observedMetrics
        .get("my_metrics")
        .map(_.getAs[Long]("num"))
        .getOrElse(-100L)
soonraah / edit_360_degree_video_by_ffmpeg.md
Last active April 6, 2024 10:31
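Notes on editing 360-degree video with ffmpeg

Editing 360-degree video with ffmpeg

I wanted to color-correct and otherwise edit, on a Mac, 360-degree video of a dive shot with a RICOH THETA V. I could not find suitable free software, so I researched as I went and did the editing with ffmpeg. These are my notes.

Conditions

  • input
  • mp4 format (360-degree video shot with the THETA V, used as-is)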
soonraah / LinearClassifier.scala
Created June 5, 2016 11:07
A base class for online learning of binary linear classifiers
package mlp.onlineml.classification.binary
import breeze.linalg.{DenseMatrix, DenseVector}
/**
 * A base class for binary linear classification
 *
 * @param w weight vector
 * @param sigma covariance matrix
 */
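
The preview above stops at the doc comment, so the class body is not visible here. As a rough illustration of what a base class holding a weight vector `w` and a covariance matrix `sigma` might look like, here is a minimal Python sketch (the gist itself is Scala with Breeze). The class and method names, the `Perceptron` subclass, and the `identity` helper are all hypothetical and are not taken from the gist; the perceptron rule is used only because it is the simplest online update, and it ignores `sigma`.

```python
from typing import List, Sequence


def identity(n: int) -> List[List[float]]:
    """Hypothetical helper: n x n identity matrix as an initial covariance."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


class LinearClassifier:
    """Hypothetical base class holding a weight vector w and a covariance matrix sigma."""

    def __init__(self, w: Sequence[float], sigma: Sequence[Sequence[float]]):
        self.w = list(w)
        self.sigma = [list(row) for row in sigma]

    def margin(self, x: Sequence[float]) -> float:
        # Signed score w . x; its sign gives the predicted label.
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, x: Sequence[float]) -> int:
        # Labels are assumed to be +1 / -1.
        return 1 if self.margin(x) >= 0.0 else -1

    def update(self, x: Sequence[float], y: int) -> None:
        # Each online algorithm supplies its own update rule.
        raise NotImplementedError


class Perceptron(LinearClassifier):
    """Simplest online learner; it leaves sigma untouched."""

    def update(self, x: Sequence[float], y: int) -> None:
        # Mistake-driven: change w only when the example is misclassified.
        if y * self.margin(x) <= 0.0:
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
```

Algorithms that do use the covariance (for example confidence-weighted style learners) would override `update` to adjust both `w` and `sigma`; that part is left out of this sketch.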
soonraah / em_vs_mcmc.py
Created October 5, 2014 18:11
Compares the EM algorithm and MCMC for GMM training.
import numpy as np
from sklearn import cross_validation, mixture
import pickle
import os
import pystan
import time
import matplotlib.pyplot as plt
def dump_stan_model(stan_model, compiled_file_name):
soonraah / multi_dimension_gmm_diagonal.stan
Created October 5, 2014 16:22
Stan code to train a multi-dimensional GMM (Gaussian Mixture Model) with diagonal covariance.
data {
  int<lower=1> D; // number of dimensions
  int<lower=1> N; // number of samples
  int<lower=1> M; // number of mixture components
  vector[D] X[N]; // data to train
}
parameters {
  simplex[M] weights; // mixture weights
  vector[D] mu[M]; // means
  vector<lower=0.0>[D] sigma[M]; // standard deviation
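
The model block is cut off in this preview, so the following is only a sketch of the quantity such a model typically evaluates: the log-likelihood of the data under a mixture of Gaussians with per-dimension standard deviations, summed over samples with a log-sum-exp over components. It is written in plain Python rather than Stan, and the function names are illustrative, not taken from the gist.

```python
import math
from typing import Sequence


def log_normal_diag(x: Sequence[float], mu: Sequence[float], sigma: Sequence[float]) -> float:
    """Log density of a Gaussian with independent dimensions (diagonal covariance)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((xi - m) / s) ** 2
        for xi, m, s in zip(x, mu, sigma)
    )


def log_sum_exp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) by factoring out the largest term."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_log_likelihood(X, weights, mu, sigma) -> float:
    """Sum over samples of log sum_m weights[m] * Normal(x | mu[m], diag(sigma[m]^2))."""
    total = 0.0
    for x in X:
        per_component = [
            math.log(w) + log_normal_diag(x, m, s)
            for w, m, s in zip(weights, mu, sigma)
        ]
        total += log_sum_exp(per_component)
    return total
```

Working in log space avoids underflow when a sample lies far from every component, which is the usual reason mixture models are written with a log-sum-exp.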
soonraah / train_gmm.py
Created August 30, 2014 11:59
Python code to train a GMM with PyStan.
# -*- coding: utf-8 -*-
from sklearn.datasets import make_classification
import numpy as np
import matplotlib.pyplot as plt
import pystan
NUM_MIXTURE_COMPONENTS = 4
NUM_DIMENSIONS = 2
soonraah / multi_dimensional_gmm.stan
Last active August 29, 2015 14:05
Stan code for a multi-dimensional GMM with full covariance
data {
  int<lower=1> D; // number of dimensions
  int<lower=1> N; // number of samples
  int<lower=1> M; // number of mixture components
  vector[D] X[N]; // data to train
}
parameters {
  simplex[M] weights; // mixture weights
  vector[D] mu[M]; // means
  cov_matrix[D] sigma[M]; // covariance matrix
soonraah / do_skmeans.R
Last active August 29, 2015 14:00
Running cosine-distance-based k-means with the skmeans package
library(slam)
library(skmeans)
# Load data from CSV
x <- read.csv("data.csv")
# Convert to a data structure skmeans can handle (simple triplet matrix)
s <- as.simple_triplet_matrix(x)
# Run k-means
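
The preview ends before the clustering call. To make "cosine-distance-based k-means" concrete, here is a small Python sketch of the basic spherical k-means iteration: rows are scaled to unit length, each point is assigned to the centroid with the highest cosine similarity, and each centroid becomes the normalized sum of its members. This is not the skmeans package's implementation; the function names and the caller-supplied initial centroids are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def spherical_kmeans(
    rows: Sequence[Sequence[float]],
    init_centroids: Sequence[Sequence[float]],
    iterations: int = 10,
) -> Tuple[List[int], List[List[float]]]:
    points = [normalize(r) for r in rows]
    centroids = [normalize(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: for unit vectors, cosine similarity is just the dot product.
        labels = [
            max(range(len(centroids)), key=lambda j: dot(p, centroids[j]))
            for p in points
        ]
        # Update step: normalized sum of the members; empty clusters keep their centroid.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids
```

For example, `spherical_kmeans([[1, 0], [2, 0.1], [0, 3], [0.1, 1]], [[1, 0], [0, 1]])` groups the first two rows together and the last two together, because only direction matters, not vector length.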
soonraah / DrawColoredPointDiagram
Created January 5, 2014 06:13
How to specify the color of each point in a scatter plot
library(ggplot2)
# Load CSV
color_plot_data <- read.csv("points.csv")
# Draw a scatter plot using columns x, y as coordinates and columns r, g, b as colors
ggplot(data=color_plot_data, aes(x=x, y=y, col=rgb(r, g, b))) +
  geom_point() +
  scale_color_identity()