okumin okumin

How PlanMapper works

Purpose

PlanMapper helps Hive regenerate better query plans using runtime stats. It groups entities that are semantically the same. For example, a Calcite RelNode that expresses WHERE id = 1 could be equivalent to a Hive FilterOperator. A CommonMergeJoinOperator could be linked to the MapJoinOperator converted from that CommonMergeJoinOperator.

The groups generated by PlanMapper express these relationships, so that Hive can propagate the final runtime stats to the RelNodes or Operators at each step. See https://cwiki.apache.org/confluence/display/Hive/Query+ReExecution
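The grouping idea can be sketched as follows. This is a minimal illustration, not Hive's actual implementation: `PlanNode`, `CalciteRelNode`, `HiveOperator`, `RuntimeStats`, `EquivalenceGroups`, and the `FIL_1` operator name are all hypothetical stand-ins. It only shows how linking equivalent entities into one group lets stats observed on one member reach the others.

```scala
// Hypothetical stand-ins for illustration; these are NOT Hive's real classes.
sealed trait PlanNode
final case class CalciteRelNode(digest: String) extends PlanNode
final case class HiveOperator(name: String) extends PlanNode
final case class RuntimeStats(rowCount: Long)

final class EquivalenceGroups {
  import scala.collection.mutable

  // Each node maps to a group id; each group id maps to its members.
  private val groupOf = mutable.Map.empty[PlanNode, Int]
  private val members = mutable.Map.empty[Int, mutable.Set[PlanNode]]
  private var nextId = 0

  // Look up the group of a node, creating a singleton group if it is new.
  private def groupId(node: PlanNode): Int =
    groupOf.get(node) match {
      case Some(id) => id
      case None =>
        val id = nextId
        nextId += 1
        groupOf.update(node, id)
        members.update(id, mutable.Set(node))
        id
    }

  // Declare two nodes semantically equivalent by merging their groups.
  def link(a: PlanNode, b: PlanNode): Unit = {
    val ga = groupId(a)
    val gb = groupId(b)
    if (ga != gb) {
      val moved = members.remove(gb).get
      moved.foreach(n => groupOf.update(n, ga))
      members(ga) ++= moved
    }
  }

  // Every node that shares a group with `node`, including itself.
  def equivalents(node: PlanNode): Set[PlanNode] =
    members(groupId(node)).toSet

  // Stats observed on one node are attributed to its whole group.
  def propagate(observed: PlanNode, stats: RuntimeStats): Map[PlanNode, RuntimeStats] =
    equivalents(observed).map(n => n -> stats).toMap
}

object Demo {
  val groups = new EquivalenceGroups
  val filterRel = CalciteRelNode("Filter(id = 1)")
  val filterOp = HiveOperator("FIL_1")
  groups.link(filterRel, filterOp)

  // Stats measured on the operator also become visible on the RelNode.
  val stats = groups.propagate(filterOp, RuntimeStats(rowCount = 42L))
}
```

In this sketch a group is just a mutable set keyed by an integer id, which keeps the merge step easy to follow; a real planner would also need to track which compilation step each member belongs to.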

Flow

@okumin
okumin / main.md
Last active October 19, 2023 15:11

Overview

In HIVE-12679, we have been trying to introduce a feature to make IMetaStoreClient pluggable. This document is a summary of the past discussions.

Problem statement

Apache Hive hardcodes the implementation of IMetaStoreClient, assuming it always talks to Hive Metastore (HMS). 99% of Hive users don't have any problems because they use HMS as their data catalog. However, some data platforms and their users use alternative services as data catalogs.

  • Amazon EMR provides an option to use AWS Glue Data Catalog
  • Treasure Data deploys Apache Hive integrated with their own in-house data catalog
@okumin
okumin / keybase.md
Last active November 18, 2022 13:37

Keybase proof

I hereby claim:

  • I am okumin on github.
  • I am okumin (https://keybase.io/okumin) on keybase.
  • I have a public key ASBk4J-iYG52Dmu-OCTE37-7M9-YuYo6hxordcz7zr1QfQo

To claim this, I am signing this object:

Summary

MySQL

/                  persistAsync  persist  recover
akka-2.3           1547          7964     450
akka-2.4-rc1       1504          9051     390
akka-2.4-batched   702           8806     410
akka-2.4-seq       615           11711    402
def findMofu(id: Int): Future[CacheError | IOError | NotFound, Mofu] = ???
def createMofu(mofu: Mofu): Future[CacheError | IOError | DuplicateError, Mofu] = ???

val result = findMofu(5).recoverWith {
  case NotFound =>
    createMofu(Mofu(5)).recoverWith {
      case DuplicateError => UnknownError
    }
}

8 Lazy Rebuilding

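  • Chapter 1: Introduction
  • Chapters 2-3: An introduction to the basics of functional data structures
  • Chapters 4-7: The relationship between lazy evaluation and amortization ← covered so far
  • Chapters 8-11: General-purpose techniques for designing functional data structures ← this is the remaining part

What this chapter covers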

  • batch rebuilding
class MofuSpec extends WordSpec with GeneratorDrivenPropertyChecks {
  def measure(seq: Seq[Int], heap: Heap[Int]): Unit = {
    val h = seq.foldLeft(heap) { (h, x) => h.insert(x) }
    val h2 = seq.foldLeft(h) { (h, x) => h.deleteMin() }
    assert(h2.isEmpty)
  }
  "mofu" should {
    "fumo" in {
      val initial = 300000
@okumin
okumin / Exercises.md
Last active August 29, 2015 14:14
Purely Functional Data Structures: Chapter 2

Exercises

Sorry that this is in Scala (◞‸◟) There may be mistakes.

Exercise 2.1

def suffixes[A](xs: List[A]): List[List[A]] = xs match {
  case Nil => List(Nil)
@okumin
okumin / akka-persistence.md
Created September 28, 2014 08:55
Let's build an akka-persistence plugin