d / uber_headers.txt
Last active August 20, 2020 04:25
include what you use
Jesses-MacBook-Air ~/s/p/gporca (master=)> git describe --tags
7.0.0-alpha.0-7231-g32446a321d49
Jesses-MacBook-Air ~/s/p/gporca (master=)> ninja -C cmake-build-debug-clang-12 -t deps | grep -F ../ | grep -v -F libgpos | sort | uniq -c | sort -nr
944 ../libnaucrates/include/naucrates/dxl/gpdb_types.h
942 ../libnaucrates/include/naucrates/md/IMDId.h
928 ../libnaucrates/include/naucrates/dxl/operators/CDXLDatum.h
923 ../libnaucrates/include/naucrates/md/CDXLStatsDerivedRelation.h
923 ../libnaucrates/include/naucrates/md/CDXLStatsDerivedColumn.h
923 ../libnaucrates/include/naucrates/md/CDXLBucket.h
918 ../libnaucrates/include/naucrates/dxl/operators/CDXLOperator.h
d / dtrace.sh
Last active March 7, 2018 02:30
instrumenting
# wait for gporca_test to launch
dtrace -xmangled -n 'pid$target::__ZN4gpos9CRefCount6AddRefEv:entry { @["AddRef"]=count(); }' -W gporca_test
# list all probes
dtrace -xmangled -l -n 'pid$target:gporca_test::' -n 'pid$target:libgpopt.3.dylib::' -n 'pid$target:libnaucrates.3.dylib::' -n 'pid$target:libgpdbcost.3.dylib::' -n 'pid$target:libgpos.3.dylib::' -p (pgrep gporca_test) > /tmp/gporca_test_probes.txt
# dynamically trace an already running orca process
dtrace -xmangled -n 'pid$target::__ZN4gpos9CRefCount6AddRefEv:return { @["AddRef"]=count(); }' -p (pgrep gporca_test)
d / DQA.md
Last active January 11, 2018 22:57
DQA

Query

DDL:

CREATE TABLE foo (a int, b int, c int) DISTRIBUTED BY (a);

Query:

SELECT aggfn(DISTINCT b) FROM foo;
d / split_row.sql
Created December 7, 2017 19:03
split rows bug
CREATE TABLE rank_exc(
id int,
year int,
gender char(1)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
SUBPARTITION BY RANGE (year)
SUBPARTITION TEMPLATE (
SUBPARTITION year1 START (2001),
d / CXX11_and_ORCA.md
Last active February 3, 2018 01:21
C++11/14 and ORCA

Modern C++

New language features that may boost productivity

  1. rvalue references, move assignment operator, and move constructors
  2. [using type aliases](http://en.cppreference.com/w/cpp/language/type_alias)
  3. range-based for-loops
  4. auto
  5. Variadic templates

Standardized features that were GCC extensions

d / CMakeLists.txt
Last active October 26, 2017 22:48
Postgres/Greenplum CMakeLists.txt
cmake_minimum_required(VERSION 3.0)
project(gpdb5 LANGUAGES C CXX)
include_directories(src/include)
include_directories(/usr/local/opt/openssl/include)
file(GLOB_RECURSE HEADERS
src/include/*.h
)
file(GLOB_RECURSE SRC_FILES
d / YOLO.md
Last active June 21, 2017 21:34
Problem Statement

What challenge are we facing?

  1. As an ORCA pair adding a backwards-compatible change, we'd like to rebuild Greenplum without any code changes and see our change take effect in the database.
  2. As a non-ORCA developer working on Greenplum Database, I'd like ./configure to fail if my local ORCA version will cause a compilation failure.

A modest proposal

Use semver to convey breaking changes:

  1. Let's start with Greenplum stating a "required version" of 2.9.1
  2. If Venky improves some implementation of stats (interface preserving), he bumps the ORCA version from 2.9.1 to 2.10.0. Venky should be able to rebuild GPDB without changing it to effect his optimizer change. (Implied: 2.10.0 satisfies a "required version" of 2.9.1.)
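
The compatibility rule implied by the two steps above can be sketched in a few lines of Python. This is only an illustration, not the check that `./configure` performs: the function names `parse_version` and `satisfies` are hypothetical, and the rule that a different major version never satisfies the requirement is an assumption taken from the usual semver convention that a major bump signals a breaking change.

```python
def parse_version(text):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of ints (hypothetical helper)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def satisfies(installed, required):
    """Return True if the installed version can stand in for the required one.

    Assumed rule: the major versions must match (a major bump is breaking),
    and the installed version must not be older than the required one.
    """
    inst = parse_version(installed)
    req = parse_version(required)
    return inst[0] == req[0] and inst >= req


if __name__ == "__main__":
    print(satisfies("2.10.0", "2.9.1"))  # the case from the proposal
    print(satisfies("3.0.0", "2.9.1"))   # a breaking (major) bump
```

The versions are compared as integer tuples rather than strings because a plain string comparison would order "2.10.0" before "2.9.1".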
d / create_unique_path.md
Created April 19, 2017 00:52
A reading of GPDB before a big bang merge of SEMI JOIN from 84 (e006a24ad152b3faec748afe8c1ff0829699b2e6)

We need to keep our diff here, maybe?

  1. costsize.c
  • join_in_selectivity
  • set_joinrel_size_estimates

Things changed here: joinpath.c

  • hash_inner_and_outer is split into two:
  1. hashclauses_for_join which takes the bulk of the upstream code; AND
d / make_cluster.bash
Last active December 10, 2016 05:50
Nikos, this is what we did
env BLDWRAP_POSTGRES_CONF_ADDONS='fsync=off optimizer_disable_missing_stats_collection=on' make -C /build/gpdb4/gpAux/gpdemo
d / inline.py
Last active July 11, 2016 18:14
Eliminate `.inl` files
#!/usr/bin/env python3
# Usage:
# git grep -l '#include ".*\.inl"' | pypy3 inline.py -I libgpos/libgpos/include
# OR
# git grep -l '#include ".*\.inl"' | pypy3 inline.py -I libgpopt/include -I libnaucrates/include -I libgpdbcost/include -I server/include
import sys
import re
import os.path