View iterate.fish
xargs --verbose --no-run-if-empty -n1 --arg-file ~/src/d/naughty-booleans-sources.txt clang-query-6.0 -f (egrep -v '^//' ~/src/d/naughty-booleans-clang-query.txt | psub)
parallel --no-run-if-empty --keep-order -n1 --arg-file ~/src/d/naughty-booleans-sources.txt clang-query-6.0 --extra-arg="-fcolor-diagnostics" -f (egrep -v '^//' ~/src/d/naughty-booleans-clang-query.txt | psub)
View generate_reflow_diff.bash
#!/bin/bash
set -e -u
_main() {
cat > .clang-format <<YAML
---
Language: Cpp
BasedOnStyle: Google
AllowShortFunctionsOnASingleLine: None
View uber_headers.txt
Jesses-MacBook-Air ~/s/p/gporca (master=)> git describe --tags
v2.55.8
Jesses-MacBook-Air ~/s/p/gporca (master=)> ninja -C build -t deps | grep -F ../ | grep -v -F libgpos | sort | uniq -c | sort -nr
897 ../libnaucrates/include/naucrates/dxl/gpdb_types.h
895 ../libnaucrates/include/naucrates/md/IMDId.h
881 ../libnaucrates/include/naucrates/dxl/operators/CDXLDatum.h
876 ../libnaucrates/include/naucrates/md/CDXLStatsDerivedRelation.h
876 ../libnaucrates/include/naucrates/md/CDXLStatsDerivedColumn.h
876 ../libnaucrates/include/naucrates/md/CDXLBucket.h
871 ../libnaucrates/include/naucrates/dxl/operators/CDXLOperator.h
View dtrace.sh
# wait for gporca_test to launch
dtrace -xmangled -n 'pid$target::__ZN4gpos9CRefCount6AddRefEv:entry { @["AddRef"]=count(); }' -W gporca_test
# list all probes
dtrace -xmangled -l -n 'pid$target:gporca_test::' -n 'pid$target:libgpopt.3.dylib::' -n 'pid$target:libnaucrates.3.dylib::' -n 'pid$target:libgpdbcost.3.dylib::' -n 'pid$target:libgpos.3.dylib::' -p (pgrep gporca_test) > /tmp/gporca_test_probes.txt
# dynamically trace an already running orca process
dtrace -xmangled -n 'pid$target::__ZN4gpos9CRefCount6AddRefEv:return { @["AddRef"]=count(); }' -p (pgrep gporca_test)
View DQA.md

Query

DDL:

CREATE TABLE foo (a int, b int, c int) DISTRIBUTED BY (a);

Query:

SELECT aggfn(DISTINCT b) FROM foo;
View split_row.sql
CREATE TABLE rank_exc(
id int,
year int,
gender char(1)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
SUBPARTITION BY RANGE (year)
SUBPARTITION TEMPLATE (
SUBPARTITION year1 START (2001),
View CXX11_and_ORCA.md

Modern C++

New language features that may boost productivity

  1. rvalue references, move assignment operator, and move constructors
  2. [using type aliases](http://en.cppreference.com/w/cpp/language/type_alias)
  3. range-based for-loops
  4. auto
  5. Variadic templates
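
A minimal sketch of the five features above in one place. The names `Buffer`, `StringList`, `CountArgs`, and `TotalLength` are made up for illustration and do not come from ORCA; the code only assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// 2. Type alias declared with `using` instead of `typedef`.
using StringList = std::vector<std::string>;

// 1. Move constructor and move assignment operator: both take an rvalue
//    reference and take over the other object's storage instead of copying it.
class Buffer {
 public:
  explicit Buffer(std::size_t size) : data_(size) {}

  Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) {}

  Buffer& operator=(Buffer&& other) noexcept {
    data_ = std::move(other.data_);
    return *this;
  }

  std::size_t size() const { return data_.size(); }

 private:
  std::vector<int> data_;
};

// 5. Variadic template: accepts any number of arguments of any type and
//    reports how many were passed.
template <typename... Args>
constexpr std::size_t CountArgs(const Args&...) {
  return sizeof...(Args);
}

// 3 and 4. Range-based for-loop, with `auto` deducing the element type.
std::size_t TotalLength(const StringList& words) {
  std::size_t total = 0;
  for (const auto& word : words) {
    total += word.size();
  }
  return total;
}
```

Because `Buffer` declares a move constructor, its copy constructor is implicitly deleted, so `Buffer b(std::move(a));` compiles while `Buffer b(a);` does not.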

Standardized features that were GCC extensions

View CMakeLists.txt
cmake_minimum_required(VERSION 3.0)
project(gpdb5 LANGUAGES C CXX)
include_directories(src/include)
include_directories(/usr/local/opt/openssl/include)
file(GLOB_RECURSE HEADERS
src/include/*.h
)
file(GLOB_RECURSE SRC_FILES
View YOLO.md

What challenge are we facing?

  1. As an ORCA pair adding a backwards-compatible change, we'd like to be able to rebuild Greenplum without code changes so as to see our change in effect in the database.
  2. As a non-ORCA developer working on Greenplum Database, I'd like ./configure to fail if my local ORCA version will cause a compilation failure.

A modest proposal

Use semver to convey breaking changes:

  1. Let's start with Greenplum stating a "required version" of 2.9.1
  2. If Venky improves some implementation of stats (interface preserving), he bumps the ORCA version from 2.9.1 to 2.10.0. Venky should be able to rebuild GPDB without changing it to effect his optimizer change. (Implied: 2.10.0 satisfies a "required version" of 2.9.1.)
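
The "satisfies" relation in the steps above can be sketched as a small check. This is only an illustration under one assumption: that an installed version satisfies a required version when the major numbers match and the installed (minor, patch) pair is not older than the required one. `Version`, `ParseVersion`, and `Satisfies` are hypothetical names, not part of ORCA or of Greenplum's `./configure`.

```cpp
#include <cstdio>
#include <tuple>

// Hypothetical holder for a MAJOR.MINOR.PATCH version number.
struct Version {
  int major_version;
  int minor_version;
  int patch_version;
};

// Parses exactly "MAJOR.MINOR.PATCH". Returns false when a component is
// missing, negative, or followed by trailing characters.
bool ParseVersion(const char* text, Version* out) {
  char trailing;
  const int matched =
      std::sscanf(text, "%d.%d.%d%c", &out->major_version,
                  &out->minor_version, &out->patch_version, &trailing);
  return matched == 3 && out->major_version >= 0 &&
         out->minor_version >= 0 && out->patch_version >= 0;
}

// Assumed rule: a different major version signals a breaking change, so it
// never satisfies the requirement. Within the same major version, any equal
// or newer (minor, patch) pair does.
bool Satisfies(const Version& installed, const Version& required) {
  if (installed.major_version != required.major_version) {
    return false;
  }
  return std::tie(installed.minor_version, installed.patch_version) >=
         std::tie(required.minor_version, required.patch_version);
}
```

Under this rule, 2.10.0 satisfies a required version of 2.9.1, while 2.9.0 and 3.0.0 do not.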
View create_unique_path.md

We need to keep our diff here, maybe?

  1. costsize.c
  • join_in_selectivity
  • set_joinrel_size_estimates

Things changed here: joinpath.c

  • hash_inner_and_outer is split into two: 0. hashclauses_for_join which takes the bulk of the upstream code; AND