Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save OlegGorj/96e093e4f1276eb69e0b3af8f56e530a to your computer and use it in GitHub Desktop.
Save OlegGorj/96e093e4f1276eb69e0b3af8f56e530a to your computer and use it in GitHub Desktop.
PostgreSQL has Foreign Data Wrappers. A very powerful feature. Greenplum
has External Tables. Also a very powerful feature. In fact, it's pretty
much the same feature. As we catch up with PostgreSQL, we need to figure
out what we're going to do about that. In the long run, we clearly don't
want to maintain two features that do the same thing, but with different
syntax and different set of wrappers/plugins.
Goal
----
* Fully support all community FDWs in Greenplum
* Refactor existing external protocols to be FDWs, to reduce GPDB code
footprint and to support future FDW features.
* Keep external table syntax, for backwards-compatibility
History of PostgreSQL Foreign Data Wrappers
-------------------------------------------
The FDW feature has been improved in PostgreSQL over several releases:
8.4: DDL Syntax, pg_dump/restore
9.0: -
9.1: Basic read support
9.2: Statistics collection (ANALYZE)
9.3: Updatable foreign tables
9.4: -
9.5: Join pushdown. IMPORT FOREIGN SCHEMA command.
9.6: Advanced pushdown support (sorts, aggregates, updates, etc.)
For the same functionality we have with readable external tables, we
need to catch up with 9.1. For writeable external tables, we need 9.3.
Handler API differences
-----------------------
The PostgreSQL API consists of a number of callback functions. There are
callbacks that get called at plan time, at ANALYZE, at EXPLAIN, at
execution, and so forth.
The main callbacks, for scanning a foreign table, are:
BeginForeignScan
IterateForeignScan
EndForeignScan
These are called at execution. IterateForeignScan returns one tuple at a
time, and the executor calls it repeatedly, until there are no more
tuples left. This differs somewhat from the external protocol API. The
corresponding external protocol API consists of a callback function,
"readfunc". The executor calls it repeatedly, until there is no more
data left. The difference is that while the FDW IterateForeignScan
callback returns one tuple at a time, in a TupleTableSlot, "readfunc"
returns a chunk of raw data, which is expected to be in the same format
as used by COPY. The executor feeds the raw data to the COPY parsing
code, to get the tuples.
The updateable FDW and Writeable External Table APIs differ in a similar
way. The FDW API is tuple-oriented, while the External Table API writes
chunks of COPY-formatted raw data.
See https://www.postgresql.org/docs/current/static/fdw-callbacks.html
for details of the FDW API. (I'm not aware of a similar document for the
external protocol API. Anyone?)
The Master Plan
---------------
0. Catch-up/cherry-pick FDW syntax from PostgreSQL 8.4.
1. Catch-up/cherry-pick FDW support from PostgreSQL 9.3
2. Add glue code to make it possible to use the FDW API, with the
External Table syntax.
3. Refactor existing external protocols to use the FDW API.
4. Remove external scan and update code.
Alternatively, instead of refitting all existing protocols to the FDW
API, we could write a special adapter FDW that converts a stream of
COPY-formatted data into tuples, and vice versa. The adapter FDW
wouldn't connect to any particular data source, but it would call yet
another plugin to get the data. The API for those plugins would be the
same as the current external protocol API. That's appealing, but I think
it's easier to just refactor all the existing external protocols. There
aren't that many of them, and dealing with yet another adapter layer
seems more complicated in the long run. There will be some duplicated
code, to parse all the COPY options, like whether it's in CSV format,
delimiters, etc. But it's not too bad. The file_fdw in PostgreSQL does
the same, and file_fdw.c is about 1000 lines of C code. The set of COPY
options has been unchanged for a very long time, so I'm not too worried
about new options being added, that would then need to be copied to all
the protocols.
MPP
---
There is one crucial difference between PostgreSQL and Greenplum:
Greenplum is an MPP system. That's reflected in the external protocols,
but community FDWs know nothing about that. What exactly should happen,
when you do "SELECT * FROM foreign_table"? Is the foreign scan started
on every segment? Or is it performed fully in the master? Or both? If
the foreign scan is started on all segments, how does the FDW on each
segment know which slice of data to fetch?
That's handled slightly differently in different external protocols,
currently:
* With gpfdist, the segments pass along the segment-ID in the HTTP
requests they make to the
gpfdist server. The gpfdist server slices the data, based on the
segment-ID.
* With S3, each segment fetches only a subset of the files in the
bucket, based on its
segment-ID.
* gphdfs works similarly: each segment fetches a subset of the of the
data, based on its
segment-ID.
* With EXECUTE-type external tables, the command is executed on every
segment. A bunch of
environment variables are set, including GP_SEGMENT_ID and
GP_SEGMENT_COUNT, which you
can use in your script to split the data, if needed.
So, how will we handle this with Foreign Data Wrappers? There's probably
no one-size-fits-all solution. For existing external prtocols, converted
to FDWs, we should keep the behaviour the same. But for FDWs written for
PostgreSQL, that know nothing about MPP, we shouldn't just blindly run
the foreign scan on every segment. For example, with a Twitter FDW,
written for PostgreSQL, running the foreign scan on every segment would
fetch the same tweets on every segment, and you'd get duplicated rows.
I think the only safe option here is to insist that the FDW is executed
only on the master, unless the FDW somehow indicates that it knows about
MPP. Let's extend the FDW API with a new MPP-aware flag or callback. If
the FDW set the flag, then the foreign scan is executed on all segments.
Otherwise it's executed only on master.
We might also want an override option for that, so that the user can
force an FDW to be executed on every segment anyway. That might be
useful, e.g. with the file_fdw, if you have different files spread out
to the segments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment