kmikmy/checkpoint.md

## checkpoint.md

      
    void CreateCheckPoint(int flags)

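A checkpoint's work is spread over a period of time, yet logically it takes effect at a single LSN, which makes it a very special kind of operation and WAL record.
The logical position of the checkpoint (its REDO pointer) is the same as, or earlier than, the physical position of the checkpoint record.
When replaying WAL, we locate the checkpoint record by its physical position, read its REDO pointer, and actually start replay from that earlier logical position.
Note that nothing is written to WAL at the logical position itself, so the record found at that location can be any other kind of WAL record.
(Annotation: that is, the checkpoint's WAL record is always written at the physical position.)
This mechanism lets the system keep working while a checkpoint is in progress.
As a result, the timing of the actions here is critical; note also that this function will likely take minutes to run on a busy system.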
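
The relationship between the two positions can be sketched with a toy log. This is only an illustration under simplifying assumptions: `Lsn`, `WalRecord`, and `replay_start_index` are hypothetical names invented here, records are assumed to be sorted by position, and none of this mirrors PostgreSQL's real data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified WAL model; not PostgreSQL's real structures. */
typedef uint64_t Lsn;

typedef enum { REC_DATA, REC_CHECKPOINT } RecKind;

typedef struct
{
    Lsn     lsn;    /* physical position of this record */
    RecKind kind;
    Lsn     redo;   /* logical position; only used for REC_CHECKPOINT */
} WalRecord;

/*
 * Locate the last checkpoint record by its physical position, read its
 * redo pointer, and return the index of the first record at or after that
 * pointer, which is where replay would begin.
 * Returns -1 if the log contains no checkpoint record.
 */
int
replay_start_index(const WalRecord *log, size_t n)
{
    size_t  i;
    int     ckpt = -1;

    for (i = 0; i < n; i++)
        if (log[i].kind == REC_CHECKPOINT)
            ckpt = (int) i;
    if (ckpt < 0)
        return -1;

    /* The redo pointer is never later than the checkpoint record itself. */
    assert(log[ckpt].redo <= log[ckpt].lsn);

    for (i = 0; i < n; i++)
        if (log[i].lsn >= log[ckpt].redo)
            return (int) i;
    return -1;
}
```

With records at positions 100, 200, 300, a checkpoint record at 400 whose redo pointer is 200, and a further record at 500, the function returns index 1: replay starts at an ordinary data record, two records before the checkpoint record itself.
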
/*
 * Perform a checkpoint --- either during shutdown, or on-the-fly
 *
 * flags is a bitwise OR of the following:
 *  CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.
 *  CHECKPOINT_END_OF_RECOVERY: checkpoint is for end of WAL recovery.
 *  CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,
 *      ignoring checkpoint_completion_target parameter.
 *  CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred
 *      since the last one (implied by CHECKPOINT_IS_SHUTDOWN or
 *      CHECKPOINT_END_OF_RECOVERY).
 *
 * Note: flags contains other bits, of interest here only for logging purposes.
 * In particular note that this routine is synchronous and does not pay
 * attention to CHECKPOINT_WAIT.
 *
 * If !shutdown then we are writing an online checkpoint. This is a very special
 * kind of operation and WAL record because the checkpoint action occurs over
 * a period of time yet logically occurs at just a single LSN. The logical
 * position of the WAL record (redo ptr) is the same or earlier than the
 * physical position. When we replay WAL we locate the checkpoint via its
 * physical position then read the redo ptr and actually start replay at the
 * earlier logical position. Note that we don't write *anything* to WAL at
 * the logical position, so that location could be any other kind of WAL record.
 * All of this mechanism allows us to continue working while we checkpoint.
 * As a result, timing of actions is critical here and be careful to note that
 * this function will likely take minutes to execute on a busy system.
 */
void
CreateCheckPoint(int flags)
{
    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    bool        shutdown;
    CheckPoint  checkPoint;
    XLogRecPtr  recptr;
    XLogCtlInsert *Insert = &XLogCtl->Insert;
    XLogRecData rdata;
    uint32      freespace;
    XLogSegNo   _logSegNo;
    XLogRecPtr  curInsert;
    VirtualTransactionId *vxids;
    int         nvxids;

    /*
     * An end-of-recovery checkpoint is really a shutdown checkpoint, just
     * issued at a different time.
     */
    if (flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY))
        shutdown = true;
    else
        shutdown = false;

A checkpoint cannot be created while recovery is in progress, unless it is the end-of-recovery checkpoint.
    /* sanity check */
    if (RecoveryInProgress() && (flags & CHECKPOINT_END_OF_RECOVERY) == 0)
        elog(ERROR, "can't create a checkpoint during recovery");
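
The two flag tests so far (deriving `shutdown` and the recovery sanity check) can be tried out in isolation. The sketch below is a stand-alone approximation: the `SKETCH_` constants and both function names are made up for this note, and the bit values are illustrative rather than copied from PostgreSQL's headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real definitions live in PostgreSQL. */
#define SKETCH_CKPT_IS_SHUTDOWN      0x0001
#define SKETCH_CKPT_END_OF_RECOVERY  0x0002
#define SKETCH_CKPT_IMMEDIATE        0x0004
#define SKETCH_CKPT_FORCE            0x0008

/* An end-of-recovery checkpoint is treated like a shutdown checkpoint. */
bool
sketch_is_shutdown(int flags)
{
    return (flags & (SKETCH_CKPT_IS_SHUTDOWN |
                     SKETCH_CKPT_END_OF_RECOVERY)) != 0;
}

/*
 * Mirror of the sanity check: during recovery only an end-of-recovery
 * checkpoint may proceed. Returns true if the checkpoint is allowed.
 */
bool
sketch_checkpoint_allowed(bool recovery_in_progress, int flags)
{
    if (recovery_in_progress && (flags & SKETCH_CKPT_END_OF_RECOVERY) == 0)
        return false;
    return true;
}
```

For example, `sketch_checkpoint_allowed(true, SKETCH_CKPT_FORCE)` is false, while adding `SKETCH_CKPT_END_OF_RECOVERY` to the flags makes it true.
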

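A lock is taken to guarantee that only one checkpoint happens at a time.
In the current system structure only one process is allowed to issue checkpoints at any given time, so this is a formality.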
    /*
     * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
     * (This is just pro forma, since in the present system structure there is
     * only one process that is allowed to issue checkpoints at any given
     * time.)
     */
    LWLockAcquire(CheckpointLock, LW_EXCLUSIVE);

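log_checkpoints (boolean) is a user-settable configuration parameter.
When on, checkpoints and restartpoints are recorded in the server log. The log messages include some statistics, such as the number of buffers written and the time spent writing them. This parameter can only be set in the postgresql.conf file or on the server command line. The default is off.
https://www.postgresql.jp/document/9.1/html/runtime-config-logging.html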
    /*
     * Prepare to accumulate statistics.
     *
     * Note: because it is possible for log_checkpoints to change while a
     * checkpoint proceeds, we always accumulate stats, even if
     * log_checkpoints is currently off.
     */
    MemSet(&CheckpointStats, 0, sizeof(CheckpointStats));
    CheckpointStats.ckpt_start_t = GetCurrentTimestamp();
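
The "always accumulate, decide later whether to log" pattern can be shown in a few lines. This is a minimal sketch, not the real `CheckpointStats` handling: the struct, its fields, and the `sketch_stats_*` functions are hypothetical, and the message text is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the checkpoint statistics structure. */
typedef struct
{
    time_t  start_t;
    time_t  end_t;
    int     buffers_written;
} SketchCkptStats;

/* Reset the counters and stamp the start time, whatever the log setting. */
void
sketch_stats_begin(SketchCkptStats *s)
{
    memset(s, 0, sizeof(*s));
    s->start_t = time(NULL);
}

void
sketch_stats_count_buffer(SketchCkptStats *s)
{
    s->buffers_written++;
}

/*
 * The logging setting is consulted only here, at the end, so a setting
 * switched on halfway through still yields complete numbers.
 * Returns the number of characters formatted, or 0 if logging is off.
 */
int
sketch_stats_end(SketchCkptStats *s, bool log_enabled,
                 char *buf, size_t buflen)
{
    s->end_t = time(NULL);
    if (!log_enabled)
        return 0;
    return snprintf(buf, buflen,
                    "checkpoint complete: wrote %d buffers; %.0f s",
                    s->buffers_written,
                    difftime(s->end_t, s->start_t));
}
```

Keeping the counters unconditional costs almost nothing, and it avoids a report with missing data when the setting changes while the checkpoint is running.
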

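Enter a critical section, so that any failure from here on forces a system panic.
If this is a shutdown checkpoint, record in pg_control that the database is shutting down.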
    /*
     * Use a critical section to force system panic if we have trouble.
     */
    START_CRIT_SECTION();

    if (shutdown)
    {
        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
        ControlFile->state = DB_SHUTDOWNING;
        ControlFile->time = (pg_time_t) time(NULL);
        UpdateControlFile();
        LWLockRelease(ControlFileLock);
    }
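
One way to picture "force system panic if we have trouble" is a nesting counter that the error-reporting path consults. The sketch below assumes that model; every name in it (`sketch_crit_count`, `sketch_start_crit`, `SK_ERROR`, and so on) is hypothetical and it does not reproduce PostgreSQL's actual macros or error machinery.

```c
#include <assert.h>

/* Hypothetical severity levels for this sketch. */
typedef enum { SK_ERROR, SK_PANIC } SketchLevel;

/* Nesting depth of critical sections; 0 means we are outside any. */
static int sketch_crit_count = 0;

void
sketch_start_crit(void)
{
    sketch_crit_count++;
}

void
sketch_end_crit(void)
{
    assert(sketch_crit_count > 0);
    sketch_crit_count--;
}

/*
 * Inside a critical section an ordinary error cannot be recovered from
 * safely, so it is escalated to a panic.
 */
SketchLevel
sketch_effective_level(SketchLevel requested)
{
    if (requested == SK_ERROR && sketch_crit_count > 0)
        return SK_PANIC;
    return requested;
}
```

Under this model, a failure while updating the control file in the shutdown branch would be reported at panic level, because it happens between the start and end of the critical section.
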

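Let the storage manager (smgr) prepare for the checkpoint; this must happen before the REDO pointer is determined.
smgr must not do anything here that would have to be undone if it is later decided that no checkpoint is needed.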
    /*
     * Let smgr prepare for checkpoint; this has to happen before we determine
     * the REDO pointer.  Note that smgr must not do anything that'd have to
     * be undone if we decide no checkpoint is needed.
     */
    smgrpreckpt();

    /* Begin filling in the checkpoint WAL record */
    MemSet(&checkPoint, 0, sizeof(checkPoint));
    checkPoint.time = (pg_time_t) time(NULL);