Skip to content

Instantly share code, notes, and snippets.

@beeender
Last active September 18, 2019 08:01
Show Gist options
  • Save beeender/f062e3fa2d9166778c946c6bab85a5fb to your computer and use it in GitHub Desktop.
Save beeender/f062e3fa2d9166778c946c6bab85a5fb to your computer and use it in GitHub Desktop.

Endpoint Design Document v2.0.3

1. Interface Spec

Create Parallel Retrieve Cursor (On QD)

Syntax:

DECLARE cursor_name PARALLEL RETRIEVE CURSOR FOR query;


The parallel cursor cursor_name needs to be unique in current transaction.


Examples:

postgres=# BEGIN;
BEGIN
postgres=# DECLARE c1 PARALLEL RETRIEVE CURSOR FOR SELECT * FROM t1;
DECLARE

List Endpoints For Parallel Retrieve Cursor (On QD)

Syntax:

SELECT * FROM GP_ENDPOINTS;


The returned tuples include cursor_name, endpoint_name, auth_token, hostname, port and status. The auth_token will be used as password for the QE retrieve session authentication. hostname and port indicate the location where should the results be retrieved from. auth_token is session-unique. The same token will be returned when creating the second parallel cursor in the same session.

The parallel cursors with different cursor_name will use the same auth_token if they are created in the same session.

The endpoint_name is generated by QD to be used as the identification for retrieving on QE.


Examples:

postgres=# SELECT * FROM gp_endpoints_info();
          auth_token                | cursor_name | endpoint_name |  hostname   | port  | status
------------------------------------+-------------+---------------+-------------+-------+--------
 tk6dadb3614921e777b3d4db380d28a0b7 | c1          | c1_1          | Mrmbp.local | 25432 | READY
 tk6dadb3614921e777b3d4db380d28a0b7 | c1          | c1_2          | Mrmbp.local | 25433 | READY
 tk6dadb3614921e777b3d4db380d28a0b7 | c1          | c1_3          | Mrmbp.local | 25434 | READY
 tk6dadb3614921e777b3d4db380d28a0b7 | c2          | c2_1          | Mrmbp.local | 25432 | FINISHED
 tk6dadb3614921e777b3d4db380d28a0b7 | c2          | c2_2          | Mrmbp.local | 25433 | FINISHED
 tk6dadb3614921e777b3d4db380d28a0b7 | c2          | c2_3          | Mrmbp.local | 25434 | FINISHED
(6 rows)

Retrieve Mode Authentication (To Start A QE Session)

Syntax:

PGPASSWORD=auth_token postgres -c gp_session_role=retrieve -h hostname -p port


Multi parallel cursors can be retrieved from the same retrieve session as long as their auth_token is the same.


Examples:

PGPASSWORD=tk6dadb3614921e777b3d4db380d28a0b7 postgres -c gp_session_rol=retrieve -h Mrmbp.local -p 25432

Retrieve From Parallel Cursor (On QE in an authenticated retrieve session)

Syntax:

RETRIEVE { ALL | count } FROM ENDPOINT endpoint_name;


To retrieve the corresponding query results for the given parallel cursor endpoint_name.


Examples:

postgres=# RETRIEVE ALL FROM ENDPOINT c1_1;
 col1 | col2
------+------
 0    | 0
 1    | 1
(2 rows)

Check Parallel Retrieve Cursor Status (On QD)

Syntax:

CHECK PARALLEL RETRIEVE CURSOR cursor_name [NOWAIT];


When NOWAIT is omitted, the query hangs in current session until all of the results related to the parallel retrieve cursor have been retrieved on all endpoints or any error happens. Error message will be returned if there is any, otherwise a FINISHED status text will be returned.

When NOWAIT is supplied, the query returns immediately with the current parallel retrieve cursor status text (see the below table for status information). Or returns the error message if there is any.

Status Description
RUNNING At least one of the endpoints has not been fully retrieved and there is no error happened so far.
FINISHED All endpoints have been retrieved successfully.

If there is any error happened in the parallel retrieve cursor execution, the query will return with error messages.

NOTE: Errors in the parallel retrieve cursor execution won't be reported back to QD until the CHECK statement is called.


Examples:

postgres=# CHECK PARALLEL RETRIEVE CURSOR c1; <waiting...>
FINISHED

postgres=# CHECK PARALLEL RETRIEVE CURSOR c1 NOWAIT;
RUNNING

postgres=# CHECK PARALLEL RETRIEVE CURSOR c1 NOWAIT;
FINISHED

Close Parallel Cursor (On QD)

Syntax:

CLOSE cursor_name;


To close the parallel cursor with the given cursor_name and release corresponding resources on all the related QDs.


Examples:

postgre=# CLOSE c1;
CLOSE

Life Cycles

Create Destroy Remarks
Parallel Cursor When DECLARE When CLOSE or
transaction quits
Name is unique in transaction
Auth Token When the FIRST parallel cursor is declared in the session When session quits Unique in session. Will be the same for all the parallel cursors created in the same session.

Endpoints Status

SQL ENDPOINT STATUS REMARKS
DECLARE (master) READY The endpoint is ready to be retrieved.
RETRIEVE (master/segment) RETRIEVING The endpoint is being retrieved from.
(RETRIEVE DONE) FINISHED The retrieving from this endpoint is finished.

2. Implementation

SQL Process flow
DECLARE (master) Basically using the current normal cursor logic with a few differences since we have endpoints involved.:
a1. On master: CdbDispatchPlan with an alloced token.
a2. On master: Since CdbDispatchPlan returns as soon as the plan delivered, we need to make a busy wait on master to wait for the segments' endpoint status become READY.
a3. On segments: Start executing plan
a4. On segments: Replace DestReceiver with TupleQueueDestReceiver (Message queue will be created when it starts up.)
a5. On segments: Put token into share-mem and mark the status as READY
a6. On segments: Starts the sending process (hangs if retriever is not connected or the queue is full)
a7. On master: Sees all the endpoint status changes, return statement finished.
--------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
RETRIEVE (master/segment) b1. On segment(retrieve role): start a retrieve session.
b2. On segment(retrieve role): Fetch tuples from the message queue.
b3. On segment(QE role): Change the endpoint status to FINISHED
b4. On segment(QE role): Send a status tuple to QD's upper motion node
--------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
WAIT (master) c. If the endpoint doesn't exist, returns error
c. If the endpoint status is READY, wait until motion node status tuple received (by step b4)
c. If the endpoint status is FINISHED, just return
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment