okumin okumin

How PlanMapper works

Purpose

PlanMapper helps Hive regenerate better query plans using runtime stats. It groups entities that are semantically the same. For example, a Calcite RelNode expressing WHERE id = 1 can be equivalent to a Hive FilterOperator. A CommonMergeJoinOperator can be linked to the MapJoinOperator converted from it.

The groups generated by PlanMapper express such relationships, so the final runtime stats can be propagated to the RelNodes or Operators produced at each step. See https://cwiki.apache.org/confluence/display/Hive/Query+ReExecution
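The grouping idea can be illustrated with a small Python sketch. This is not Hive's implementation (PlanMapper is a Java class); the class `PlanGroups`, its methods, and the string labels standing in for RelNodes and Operators are all hypothetical names chosen for illustration. It only shows how linking equivalent entities into one group lets stats recorded on one member be read from any other member.

```python
from collections import defaultdict


class PlanGroups:
    """Toy stand-in for PlanMapper-style grouping (hypothetical, not Hive's API)."""

    def __init__(self):
        self._group_of = {}                # entity -> group id
        self._members = defaultdict(set)   # group id -> entities in that group
        self._stats = {}                   # group id -> runtime stats
        self._next_id = 0

    def _ensure(self, entity):
        # Put an unseen entity into a fresh singleton group.
        if entity not in self._group_of:
            gid = self._next_id
            self._next_id += 1
            self._group_of[entity] = gid
            self._members[gid].add(entity)
        return self._group_of[entity]

    def link(self, a, b):
        """Declare a and b semantically the same and merge their groups."""
        ga, gb = self._ensure(a), self._ensure(b)
        if ga == gb:
            return
        # Move every member of b's group into a's group.
        for entity in self._members.pop(gb):
            self._group_of[entity] = ga
            self._members[ga].add(entity)
        # Keep stats that were recorded only on b's group.
        if gb in self._stats:
            self._stats.setdefault(ga, self._stats.pop(gb))

    def record_stats(self, entity, stats):
        self._stats[self._ensure(entity)] = stats

    def stats_for(self, entity):
        # Returns None for unknown entities or groups without stats.
        return self._stats.get(self._group_of.get(entity))


groups = PlanGroups()
groups.link("RelNode: Filter(id = 1)", "FilterOperator[FIL_1]")
groups.link("CommonMergeJoinOperator[MERGEJOIN_2]", "MapJoinOperator[MAPJOIN_3]")
# Runtime stats observed on the executed operator...
groups.record_stats("FilterOperator[FIL_1]", {"rows": 42})
# ...are visible from the equivalent planning-time node.
filter_stats = groups.stats_for("RelNode: Filter(id = 1)")
```

Here the stats recorded on the executed filter operator become available when the planner looks up the equivalent RelNode, which is the propagation described above.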

Flow

okumin / main.md
Last active October 19, 2023 15:11

Overview

In HIVE-12679, we have been trying to introduce a feature to make IMetaStoreClient pluggable. This document is a summary of the past discussions.

Problem statement

Apache Hive hardcodes the implementation of IMetaStoreClient, assuming it always talks to Hive Metastore (HMS). 99% of Hive users don't have any problems because they use HMS as their data catalog. However, some data platforms and their users use alternative services as data catalogs.

  • Amazon EMR provides an option to use AWS Glue Data Catalog
  • Treasure Data deploys Apache Hive integrated with their own in-house data catalog
okumin / keybase.md
Last active November 18, 2022 13:37

Keybase proof

I hereby claim:

  • I am okumin on github.
  • I am okumin (https://keybase.io/okumin) on keybase.
  • I have a public key ASBk4J-iYG52Dmu-OCTE37-7M9-YuYo6hxordcz7zr1QfQo

To claim this, I am signing this object:

okumin / akka-persistence.md
Created September 28, 2014 08:55
Let's build an akka-persistence plugin
def findMofu(id: Int): Future[CacheError | IOError | NotFound, Mofu] = ???
def createMofu(mofu: Mofu): Future[CacheError | IOError | DuplicateError, Mofu] = ???

val result = findMofu(5).recoverWith {
  case NotFound =>
    createMofu(Mofu(5)).recoverWith {
      case DuplicateError => UnknownError
    }
}
<?php
require_once "twitteroauth/twitteroauth/twitteroauth.php";
$client = new TwitterOauth(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET);
$next_cursor = -1;
$target = "";
$f = fopen("./hogehoge.txt", "w");
while($next_cursor != 0) {
for($i = 0 ; $i < 5; $i++) {
okumin / unionfind.py
Created September 15, 2012 13:30
Union find
# -*- coding: utf-8 -*-
class UnionFind:
    def __init__(self):
        self.refs = {}
    def find(self, x):
        if x in self.refs:
            ref = self.find(self.refs[x])
            self.refs[x] = ref
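The preview above is cut off partway through `find`. As a rough guide to where such a class usually goes, here is a self-contained sketch of a dictionary-based union-find with path compression. It keeps the `refs` name from the preview, but the rest of `find`, plus the `union` and `same` methods, are assumptions rather than the gist's actual code.

```python
class UnionFind:
    def __init__(self):
        # Maps a non-root element to its parent; roots have no entry.
        self.refs = {}

    def find(self, x):
        """Return the representative (root) of the set containing x."""
        if x in self.refs:
            root = self.find(self.refs[x])
            self.refs[x] = root  # path compression: point x straight at the root
            return root
        return x  # x has no parent, so it is its own root

    def union(self, x, y):
        """Merge the sets containing x and y (assumed method, not in the preview)."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.refs[root_x] = root_y

    def same(self, x, y):
        """True if x and y are in the same set (assumed method, not in the preview)."""
        return self.find(x) == self.find(y)


uf = UnionFind()
uf.union(1, 2)
uf.union(2, 3)
connected = uf.same(1, 3)      # 1 and 3 were joined through 2
separate = uf.same(1, 4)       # 4 was never joined to anything
```

Because roots are simply the elements missing from `refs`, elements never need to be registered in advance. The recursive `find` is fine for small inputs; very long chains would call for an iterative version.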
okumin / FeedParser
Created August 12, 2012 19:17
A class for parsing feeds
<?php
/**
* Copyright (c) 2012 okumin, http://okumin.com/
*
* Permission is hereby granted, free of charge, to any person obtaining
* a copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to

Summary

MySQL

/                  persistAsync   persist   recover
akka-2.3                   1547      7964       450
akka-2.4-rc1               1504      9051       390
akka-2.4-batched            702      8806       410
akka-2.4-seq                615     11711       402