View split_row.sql
CREATE TABLE rank_exc(
id int,
year int,
gender char(1)
)
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
SUBPARTITION BY RANGE (year)
SUBPARTITION TEMPLATE (
SUBPARTITION year1 START (2001),
View CXX14_and_ORCA.md

New language features that may boost productivity

  1. rvalue references, move assignment operator, and move constructors
  2. range-based for-loops
  3. auto
  4. variadic templates
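
As a rough illustration of the four features listed above, here is a minimal sketch. The names `Buffer`, `Sum`, `TotalLength`, and `RunLanguageFeaturesExample` are hypothetical and chosen only for this example; none of them comes from ORCA.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical move-only type, used only to make moves observable.
class Buffer {
 public:
  explicit Buffer(std::string data) : data_(std::move(data)) {}

  // Move constructor: takes over the other object's storage.
  Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) {}

  // Move assignment operator.
  Buffer& operator=(Buffer&& other) noexcept {
    if (this != &other) {
      data_ = std::move(other.data_);
    }
    return *this;
  }

  // Copying is disabled, so every transfer below must be a move.
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;

  const std::string& data() const { return data_; }

 private:
  std::string data_;
};

// Variadic template: adds up any number of arguments.
template <typename T>
T Sum(T value) {
  return value;
}

template <typename T, typename... Rest>
T Sum(T first, Rest... rest) {
  return first + Sum(rest...);
}

// Range-based for-loop with auto: total length of all buffers.
inline std::size_t TotalLength(const std::vector<Buffer>& buffers) {
  std::size_t total = 0;
  for (const auto& buffer : buffers) {
    total += buffer.data().size();
  }
  return total;
}

inline void RunLanguageFeaturesExample() {
  std::vector<Buffer> buffers;
  buffers.emplace_back("abc");
  Buffer scratch("hello");
  buffers.push_back(std::move(scratch));  // move constructor, no copy
  assert(TotalLength(buffers) == 8);

  Buffer target("x");
  target = std::move(buffers[0]);  // move assignment
  assert(target.data() == "abc");

  auto total = Sum(1, 2, 3);  // auto deduces int
  assert(total == 6);
}
```

Because `Buffer` deletes its copy operations, the compiler rejects any accidental copy, which makes it easy to confirm that the vector insertions really are moves.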

Standardized features that were GCC extensions

  1. std::atomic instead of the legacy __sync GCC built-ins
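
To make the comparison concrete, the sketch below puts two legacy `__sync` built-ins next to their `std::atomic` counterparts. The wrapper names (`LegacyFetchAdd`, `StandardFetchAdd`, and so on) are hypothetical and exist only for this illustration. The legacy half is guarded by `__GNUC__` because those built-ins are compiler extensions.

```cpp
#include <atomic>
#include <cassert>

#if defined(__GNUC__)
// Legacy GCC built-ins (also accepted by Clang); not part of the standard.
inline long LegacyFetchAdd(long* counter, long delta) {
  return __sync_fetch_and_add(counter, delta);  // returns the old value
}

inline bool LegacyCompareAndSwap(long* value, long expected, long desired) {
  return __sync_bool_compare_and_swap(value, expected, desired);
}
#endif

// Standard equivalents, available on any conforming compiler.
inline long StandardFetchAdd(std::atomic<long>& counter, long delta) {
  return counter.fetch_add(delta);  // returns the old value
}

inline bool StandardCompareAndSwap(std::atomic<long>& value, long expected,
                                   long desired) {
  // `expected` is a local copy; on failure it is overwritten with the
  // current value, which this wrapper simply discards.
  return value.compare_exchange_strong(expected, desired);
}

inline void RunAtomicExample() {
  std::atomic<long> counter{5};
  assert(StandardFetchAdd(counter, 3) == 5);
  assert(counter.load() == 8);
  assert(StandardCompareAndSwap(counter, 8, 42));   // 8 -> 42 succeeds
  assert(!StandardCompareAndSwap(counter, 8, 7));   // value is no longer 8
  assert(counter.load() == 42);
}
```

The standard versions default to sequentially consistent ordering. `fetch_add` and `compare_exchange_strong` also accept an explicit `std::memory_order` argument when a weaker ordering is enough.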
View CMakeLists.txt
cmake_minimum_required(VERSION 3.0)
project(gpdb5 LANGUAGES C CXX)
include_directories(src/include)
include_directories(/usr/local/opt/openssl/include)
file(GLOB_RECURSE HEADERS
src/include/*.h
)
file(GLOB_RECURSE SRC_FILES
View YOLO.md

What challenge are we facing?

  1. As an ORCA pair adding a backwards-compatible change, we'd like to be able to rebuild Greenplum without changing Greenplum's code, so that we can see our change take effect in the database.
  2. As a non-ORCA developer working on Greenplum Database, I'd like ./configure to fail if my local ORCA version will cause a compilation failure.

A modest proposal

Use semver to convey breaking changes:

  1. Let's start with Greenplum stating a "required version" of 2.9.1
  2. If Venky improves some implementation of stats (interface-preserving), he bumps the ORCA version from 2.9.1 to 2.10.0. Venky should be able to rebuild GPDB without changing it to effect his optimizer change. (Implied: 2.10.0 satisfies a "required version" of 2.9.1.)
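
One way to read "satisfies" in the steps above is sketched below. It assumes that a different major number signals a breaking change and that, within the same major number, any equal or newer minor/patch pair is acceptable. That rule is an assumption based on common semver practice, not something the proposal spells out. `Version`, `ParseVersion`, `Satisfies`, and `RunSemverExample` are hypothetical names that do not come from Greenplum's configure scripts.

```cpp
#include <cassert>
#include <cstdio>
#include <tuple>

// Hypothetical value type for a MAJOR.MINOR.PATCH version.
struct Version {
  int major_number;
  int minor_number;
  int patch_number;
};

// Parses "MAJOR.MINOR.PATCH"; returns false on malformed input.
inline bool ParseVersion(const char* text, Version* out) {
  char trailing = '\0';
  const int matched =
      std::sscanf(text, "%d.%d.%d%c", &out->major_number, &out->minor_number,
                  &out->patch_number, &trailing);
  // Exactly three fields, no trailing characters, no negative numbers.
  return matched == 3 && out->major_number >= 0 && out->minor_number >= 0 &&
         out->patch_number >= 0;
}

// Assumed rule: same major number, and (minor, patch) at least the required
// pair. A different major number is treated as a breaking change.
inline bool Satisfies(const Version& actual, const Version& required) {
  if (actual.major_number != required.major_number) {
    return false;
  }
  return std::tie(actual.minor_number, actual.patch_number) >=
         std::tie(required.minor_number, required.patch_number);
}

inline void RunSemverExample() {
  const Version required{2, 9, 1};
  const Version improved{2, 10, 0};  // interface-preserving bump
  const Version older{2, 9, 0};
  const Version breaking{3, 0, 0};

  assert(Satisfies(required, required));
  assert(Satisfies(improved, required));   // 2.10.0 satisfies 2.9.1
  assert(!Satisfies(older, required));     // too old
  assert(!Satisfies(breaking, required));  // major bump: assumed breaking
}
```

Under this rule, an interface-preserving ORCA bump needs no matching Greenplum commit, while a check at `./configure` time could reject a local ORCA that is too old or has a different major number.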
View create_unique_path.md

We need to keep our diff here, maybe?

  1. costsize.c
  • join_in_selectivity
  • set_joinrel_size_estimates

Things changed here: joinpath.c

  • hash_inner_and_outer is split into two: 0. hashclauses_for_join which takes the bulk of the upstream code; AND
View make_cluster.bash
env BLDWRAP_POSTGRES_CONF_ADDONS='fsync=off optimizer_disable_missing_stats_collection=on' make -C /build/gpdb4/gpAux/gpdemo
View inline.py
#!/usr/bin/env python3
# Usage:
# git grep -l '#include ".*\.inl"' | pypy3 inline.py -I libgpos/libgpos/include
# OR
# git grep -l '#include ".*\.inl"' | pypy3 inline.py -I libgpopt/include -I libnaucrates/include -I libgpdbcost/include -I server/include
import sys
import re
import os.path
View jesse_parallel.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
CREATE TEMP TABLE foo (i int);
CREATE TEMP TABLE bar (j int);
SELECT * FROM foo UNION ALL SELECT * FROM bar;
-->
<dxl:DXLMessage xmlns:dxl="http://greenplum.com/dxl/2010/12/">
<dxl:Thread Id="0">
<dxl:OptimizerConfig>
<dxl:EnumeratorConfig Id="0" PlanSamples="0" CostThreshold="0"/>
View regression.diffs
*** ./expected/external_table.out Sat Jun 4 01:57:38 2016
--- ./results/external_table.out Sat Jun 4 01:57:39 2016
***************
*** 1452,1465 ****
(SELECT i, j FROM exttab_subtxs_1 WHERE i < 5 ) e1,
(SELECT i, j FROM exttab_subtxs_1 WHERE i < 10) e2
WHERE e1.i = e2.i;
! NOTICE: Found 4 data formatting errors (4 or more input rows). Rejected related input data.
! i | j
! ---+----------
View gist:d05a2e246a3eec10d046
https://dtb5pzswcit1e.cloudfront.net/product_files/Pivotal-CF/pcf-vsphere-1.4.2.0.ova?Expires=1432243477&Signature=QEAUQrad9KWq-~eNVCDhhVW7bIHzv48zdPNPg55oTda7~Qv82mFGCpUutRBo3Wdj3kBfFcAUhXJ7ffGTUwJHSQj92O3PdOYpD98ZKeFJCrpYYclNo8UOqFaN-8OOdlk~Sh2s1Ouaujddc8CBcoU2cQMI04BFaSEYS2-wxFU3Hps_&Key-Pair-Id=APKAJLAM6FL65BYZP7UQ