simonw/litestream.md Secret

## litestream.md

      

Litestream

Litestream is an open‑source continuous replication tool for SQLite databases. It continuously streams the WAL (Write‑Ahead Log) changes and automatically creates snapshot backups for your SQLite database. The replicated data can be stored locally (file system) or remotely on object storage services (such as Amazon S3, Google Cloud Storage, Azure Blob Storage, or even via SFTP). Litestream allows for point‑in‑time recovery, off‑site backups, and simplified high‑availability for SQLite databases.

Note: Litestream is written in Go. It includes both a Go library (for embedding replication into your applications) and a command‑line tool that reads a YAML configuration file and supports several sub‑commands to manage your backups.


Table of Contents

Features
Architecture Overview
Installation
Configuration
  Global Settings
  Database and Replica Configuration
  Sample YAML Configuration
Command‑Line Interface (CLI)
  databases
  generations
  replicate
  restore
  snapshots
  wal
  version
Advanced Topics
  Replication Clients
  Retention, Snapshotting and WAL Files
  Encryption with Age
  Metrics and Instrumentation
Embedding Litestream in your Go Application
Troubleshooting and FAQ
License


Features


Continuous replication: Litestream monitors a SQLite database and continuously replicates new WAL frames to one or more replica destinations, local or remote.
Snapshot backups: In addition to live WAL-streaming, the tool automatically takes snapshots of the database at configurable intervals.
Multiple replica targets: Use file system targets or cloud storage providers such as S3, GCS, ABS, and SFTP.
Point‑in‑time recovery: Restore the database from a specific generation, WAL index, or timestamp.
Configurable retention: Automatically delete obsolete snapshots and WAL segments according to a retention policy.
Optional encryption: Use Age for client‑side encryption on backup snapshots and WAL segments.
Built‑in metrics: Litestream exports Prometheus metrics so you can monitor WAL sizes, sync durations, and checkpoint counts.
CLI and library: Run Litestream as a service or embed its functionality into your software.


Architecture Overview

Litestream consists of several core pieces:


Database Manager (DB):

This component opens and manages the target SQLite database. It interacts with the WAL file generated by SQLite. On startup the DB verifies the WAL header and then continuously copies new frames from the WAL into “shadow WAL” files stored in a metadata directory (by default, a hidden directory with the suffix -litestream is created next to the main DB file).


Replication Engine:

The DB object can be configured with one or more Replicas. A Replica is responsible for synchronizing the “shadow WAL” frames (and snapshots) to a remote destination. There are built‑in replicas for:

Local disk (file system)
AWS S3
Google Cloud Storage (GCS)
Azure Blob Storage (ABS)
SFTP servers

Each replica is implemented with a corresponding client that reads the shadow WAL or snapshot files, applies LZ4 compression (and optionally encryption via Age), and then writes the file to the remote destination.


Command‑Line Interface (CLI):

Litestream provides a set of sub‑commands to manage databases. For example:

databases: Lists all managed databases from your configuration.
generations: Lists available generations along with lag and time range statistics.
replicate: Starts the live replication process.
restore: Recovers (restores) a database from a replica.
snapshots: Lists snapshot backups available for a given database.
wal: Lists available WAL segment files.
version: Prints the current build version.


Configuration:

Litestream is configured via a YAML file where you declare one or more databases and list the connected replica destinations. Global defaults can be set in the configuration and environment variables can be used to override values.


Metrics:

Prometheus metrics are exposed (via the HTTP server on a configurable bind address) so that operational data such as the number of sync operations and checkpoint durations can be observed.


Installation

From Source

To build Litestream from source you need Go installed (version 1.18+ is recommended):


Clone the repository:
git clone https://github.com/benbjohnson/litestream.git
cd litestream


Build the binary:
go build ./cmd/litestream


(Optional) Install the binary:
go install ./cmd/litestream

The binary will be installed into your $GOPATH/bin (or $HOME/go/bin if no GOPATH is set).


Prebuilt Binaries

Prebuilt binaries for tagged releases are published on the project's GitHub releases page.

Configuration

Litestream uses a YAML configuration file to determine which databases to manage and which replica targets to use. You can also pass a replica URL on the command line for one‑off operations (e.g. for restore).

Global Settings

The top‑level configuration file supports:


addr:

The bind address for serving HTTP metrics and, optionally, pprof debugging information.


exec:

A command to run as a child process. Litestream replicates while the command runs and shuts down when it exits, which is useful for process supervision.


access-key-id / secret-access-key:

Global credentials that are propagated to each replica configuration unless the replica overrides them.


logging:

Configures the log level (DEBUG, INFO, WARN, ERROR), the output format (text or JSON), and whether logs go to stderr or stdout.


Database and Replica Configuration

Each database is defined with a configuration block. Key properties include:


path:

The path to the SQLite database file.


meta-path:

(Optional) A custom path for storing Litestream metadata such as shadow WAL and snapshots. By default this directory is created next to the database file and is named after the database file with a leading dot and a -litestream suffix (for example, .mydb.sqlite-litestream).


monitor-interval:

How often the database is checked for new changes.


checkpoint-interval:

Maximum amount of time to allow before triggering a checkpoint on the SQLite WAL file.


busy-timeout:

The time Litestream waits for SQLite to release locks.


min-checkpoint-page-count / max-checkpoint-page-count:

WAL size thresholds, measured in pages, that determine when a checkpoint is triggered automatically.
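
The default metadata directory naming described for meta-path can be sketched in Go. This is a minimal illustration, assuming the default is the database file name with a leading dot and a -litestream suffix; defaultMetaPath is a hypothetical helper written for this guide, not a function exported by Litestream.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultMetaPath is a hypothetical helper (not part of Litestream's API).
// It derives the assumed default metadata directory for a database file:
// same directory, file name prefixed with "." and suffixed with "-litestream".
func defaultMetaPath(dbPath string) string {
	dir, file := filepath.Split(dbPath)
	return filepath.Join(dir, "."+file+"-litestream")
}

func main() {
	fmt.Println(defaultMetaPath("/var/lib/sqlite/mydb.sqlite"))
	// On Unix-like systems this prints: /var/lib/sqlite/.mydb.sqlite-litestream
}
```

Setting meta-path explicitly, as the sample configuration does, avoids depending on this default.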


Under each database there is a list of replica targets. Each replica block includes:


type:

The type of replica. This may be one of: file, s3, gcs, abs, or sftp. (If a URL is provided in the url field, that value is used to automatically derive the type.)


name:

An optional name by which to refer to the replica.


path / url:

For local replicas use the path setting (a local directory where snapshots and WAL segments will be stored). For remote replicas, set the url field (for example, s3://mybucket/backups).


Retention Settings:

Optional parameters such as retention (duration to keep snapshots and WAL files) and retention-check-interval.


Sync and Snapshot Intervals:

Settings for how frequently a replica syncs new WAL segments and creates new snapshots, respectively.


Provider‑specific settings:

For S3 you might specify your AWS access-key-id, secret-access-key, region, bucket, and optional endpoint details. For GCS and ABS, set the respective bucket names and paths. For SFTP, set the host, user, password (or key-path) accordingly.


Age (Encryption) Settings:

Under an age sub‑block you can list one or more identities (private keys) and recipients (public keys) for encrypting replicated data.


Sample YAML Configuration

Below is an example YAML configuration file:
# litestream.yml

# Bind address to serve metrics (Prometheus metrics endpoint)
addr: "localhost:2020"

# Global AWS credentials that will be propagated if not overridden.
access-key-id: "YOUR_GLOBAL_ACCESS_KEY"
secret-access-key: "YOUR_GLOBAL_SECRET_KEY"

# Logging configuration
logging:
  level: "INFO"
  type: "text"         # "json" is also supported
  stderr: false

dbs:
  - path: "/var/lib/sqlite/mydb.sqlite"
    # Optionally override the metadata directory (default: a dot-prefixed -litestream directory next to the database file)
    meta-path: "/var/lib/sqlite/.mydb-litestream"
    monitor-interval: 1s
    checkpoint-interval: 1m
    busy-timeout: 1s
    min-checkpoint-page-count: 1000
    max-checkpoint-page-count: 10000
    replicas:
      - type: "s3"
        name: "s3-backup"
        url: "s3://mybucket/sqlite-backups"
        retention: 24h
        retention-check-interval: 1h
        sync-interval: 1s
        snapshot-interval: 1h
        # Optionally override AWS credentials per replica
        access-key-id: "YOUR_S3_ACCESS_KEY"
        secret-access-key: "YOUR_S3_SECRET_KEY"
        region: "us-east-1"
      - type: "file"
        name: "local-backup"
        path: "/var/backups/mydb"
        retention: 72h
        sync-interval: 1s

You can use environment variable expansion in the configuration file if needed. For example:
dbs:
  - path: "$HOME/data/mydb.sqlite"
    replicas:
      - url: "${LITESTREAM_REPLICA_URL}"

If you do not want Litestream to expand environment variables, pass the -no-expand-env flag when running a command.
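
To see what expansion does to a configuration file, the following Go sketch applies shell-style $VAR and ${VAR} substitution with the standard library's os.Expand. It assumes Litestream's expansion behaves like Go's os.ExpandEnv, where references to unset variables become empty strings. The expandWith helper and the values in vars are illustrative only.

```go
package main

import (
	"fmt"
	"os"
)

// expandWith is a hypothetical helper that replaces $VAR and ${VAR}
// references in text with values from vars. Names missing from vars
// expand to the empty string, as os.ExpandEnv does for unset variables.
func expandWith(text string, vars map[string]string) string {
	return os.Expand(text, func(name string) string { return vars[name] })
}

func main() {
	config := `dbs:
  - path: "$HOME/data/mydb.sqlite"
    replicas:
      - url: "${LITESTREAM_REPLICA_URL}"
`
	// Illustrative values; in practice these come from the process environment.
	vars := map[string]string{
		"HOME":                   "/home/app",
		"LITESTREAM_REPLICA_URL": "s3://mybucket/sqlite-backups",
	}
	fmt.Print(expandWith(config, vars))
}
```

Because an unset variable silently becomes an empty string under this assumption, it is worth checking that every variable referenced in the file is defined before starting replication.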

Command‑Line Interface (CLI)

Litestream comes with several sub‑commands. They can be invoked from the command line as follows:
litestream <command> [arguments]

Below is a summary of each command and its available options.

databases

Lists all managed databases from your configuration along with the replica names.
Usage example:
litestream databases -config /path/to/litestream.yml
Output:
A table with columns such as "path" and "replicas" is printed.

generations

Lists all available generations for a database (or replica URL). For every generation it displays the generation name (a unique identifier), the replication lag, and the start/end times of the backups.
Usage example:
litestream generations -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
Or pass a replica URL directly instead of a database path:
litestream generations s3://mybucket/sqlite-backups
Parameters:

-config PATH: Path to configuration file (default: /etc/litestream.yml on Unix and platform‑specific on Windows).
-replica NAME: Filter to display generations only for a specific replica.


replicate

Starts the replication server. This command continuously monitors the SQLite database, syncs changes (shadow WAL frames) to the backup replicas, takes snapshots, enforces retention policies, and (optionally) supervises a child process.
Usage example:
litestream replicate -config /path/to/litestream.yml
Additional options include:

-exec CMD: If specified, Litestream will launch the given command as a subprocess. Litestream will shut down when that process exits.
-no-expand-env: Prevents expansion of environment variables in the configuration file.

During replication, Litestream will log information about connected replicas and metrics (if you have configured an HTTP metrics server via the addr setting).

restore

Restores a database backup from a replica. This command downloads the relevant snapshot and WAL segments (up to a specific index or timestamp) and applies them to recover the SQLite database.
Usage examples:


Restore the latest replica backup back into the original database location:
litestream restore -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite


Restore a database to a given point‑in‑time:
litestream restore -timestamp 2020-01-01T00:00:00Z -o /tmp/restore.sqlite /var/lib/sqlite/mydb.sqlite


Restore from a specific replica (for example, an S3 target) and specific generation:
litestream restore -replica s3 -generation 0123456789abcdef /var/lib/sqlite/mydb.sqlite


Additional restore parameters:

-o PATH: The output location for the restored database.
-replica NAME: (Optional) Restore from the given replica.
-generation NAME: (Optional) Restore from a specific generation backup.
-index INDEX: (Hex‑encoded, optional) Restore up to that WAL index.
-timestamp TIMESTAMP: Restore the database to the point in time specified.
-if-db-not-exists: Skip the restore and exit with code 0 if the target database file already exists.
-parallelism NUM: Number of WAL files to download in parallel (default is 8).
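
When a restore is scripted, it helps to validate the timestamp before invoking the CLI. The Go sketch below builds the argument list for a point-in-time restore using only the flags listed above. It assumes -timestamp expects an RFC 3339 value, as in the usage example; restoreArgs is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// restoreArgs is a hypothetical helper that validates an RFC 3339 timestamp,
// normalizes it to UTC, and returns the arguments for "litestream restore".
func restoreArgs(dbPath, outPath, timestamp string) ([]string, error) {
	t, err := time.Parse(time.RFC3339, timestamp)
	if err != nil {
		return nil, fmt.Errorf("timestamp must be RFC 3339: %w", err)
	}
	return []string{
		"restore",
		"-timestamp", t.UTC().Format(time.RFC3339),
		"-o", outPath,
		dbPath,
	}, nil
}

func main() {
	args, err := restoreArgs("/var/lib/sqlite/mydb.sqlite", "/tmp/restore.sqlite", "2020-01-01T01:00:00+01:00")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("litestream " + strings.Join(args, " "))
	// Prints: litestream restore -timestamp 2020-01-01T00:00:00Z -o /tmp/restore.sqlite /var/lib/sqlite/mydb.sqlite
}
```

The resulting slice can be handed to exec.Command("litestream", args...) once the printed command looks right.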


snapshots

Lists all snapshot backups available for a given database or replica. The output shows for every snapshot the replica it came from, the generation, index, file size, and creation timestamp.
Usage example:
litestream snapshots -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
Or for a specific replica:
litestream snapshots -replica s3 /var/lib/sqlite/mydb.sqlite

wal

Lists all WAL segment files available (along with index, offset, size, and creation time) for a given database or replica.
Usage example:
litestream wal -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
You can filter by replica (using -replica NAME) or specific generation (using the -generation flag).

version

Prints the current Litestream version.
Usage example:
litestream version

Advanced Topics

Replication Clients

Litestream supports multiple replica back‑ends. Here are a few examples:


Local File (file):

Replicates backup files to a local or network file system. Simply specify the file system path.


Amazon S3 (s3):

Works with S3‑compatible object storage. Pass your bucket name and optional endpoint (for non‑AWS S3, such as MinIO or Backblaze).


Google Cloud Storage (gcs):

Uses GCS. Specify the bucket name and object path.


Azure Blob Storage (abs):

Works with Microsoft Azure’s Blob Storage. Provide your account name, key, and container (bucket) name.


SFTP:

Upload backups through an SSH/SFTP connection.


Some clients (like S3, GCS, ABS, SFTP) can be configured either via an explicit URL (for one‑off operations) or in the YAML configuration. They each implement the same ReplicaClient interface which provides functions to list generations, write/read snapshots, and delete obsolete objects.
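
As an illustration of how a replica type can be derived from a URL, the Go sketch below reads the URL scheme with the standard net/url package and checks it against the types listed in this guide. It is a simplified assumption about the mapping, not Litestream's actual parser, which may accept additional URL forms; replicaTypeFromURL is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// knownTypes lists the replica types named in this guide.
var knownTypes = map[string]bool{
	"file": true, "s3": true, "gcs": true, "abs": true, "sftp": true,
}

// replicaTypeFromURL is a hypothetical helper that returns the replica type
// implied by the scheme of a replica URL, or an error if it is not known.
func replicaTypeFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse replica URL: %w", err)
	}
	if !knownTypes[u.Scheme] {
		return "", fmt.Errorf("unsupported replica scheme %q", u.Scheme)
	}
	return u.Scheme, nil
}

func main() {
	for _, raw := range []string{"s3://mybucket/sqlite-backups", "ftp://example.com/db"} {
		typ, err := replicaTypeFromURL(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", raw, typ)
	}
}
```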

Retention, Snapshotting and WAL Files

Litestream maintains a “generation” for each backup cycle. When the database’s WAL cannot be verified against the last “shadow WAL” file (for example, when the WAL is truncated or overwritten) a new generation is started. A generation is typically associated with a snapshot (a full copy of the database) and a subsequent series of WAL segments. You can configure retention on a per‑replica basis so that older generations and their corresponding snapshots and WAL segments are automatically deleted after a set period.
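
The retention window can be reasoned about with a small Go sketch. It assumes retention values such as 24h or 72h parse as Go durations and treats a backup file as eligible for deletion once it is older than the window. This is a deliberate simplification rather than Litestream's actual enforcement logic, which also has to keep whatever is needed to restore. isExpired is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// isExpired is a hypothetical helper. It reports whether a file created at
// createdAt (RFC 3339) is older than the retention window (a Go duration
// string such as "24h") when evaluated at time now (RFC 3339).
func isExpired(createdAt, now, retention string) (bool, error) {
	created, err := time.Parse(time.RFC3339, createdAt)
	if err != nil {
		return false, fmt.Errorf("parse createdAt: %w", err)
	}
	current, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return false, fmt.Errorf("parse now: %w", err)
	}
	keep, err := time.ParseDuration(retention)
	if err != nil {
		return false, fmt.Errorf("parse retention: %w", err)
	}
	// Anything created before (now - retention) falls outside the window.
	return created.Before(current.Add(-keep)), nil
}

func main() {
	for _, retention := range []string{"24h", "72h"} {
		expired, err := isExpired("2020-01-01T00:00:00Z", "2020-01-02T12:00:00Z", retention)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		fmt.Printf("retention=%s expired=%t\n", retention, expired)
	}
}
```

With the sample values, a snapshot that is 36 hours old falls outside a 24h window but inside a 72h one.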

Encryption with Age

For secure backups, Litestream supports client‑side encryption using the Age encryption tool. In your replica configuration you can supply an age sub‑block with a list of identities (private keys) for decryption and recipients (public keys) for encryption. When enabled, snapshots and WAL segments are encrypted before being uploaded to the remote replica.

Metrics and Instrumentation

Litestream is instrumented with Prometheus metrics that track:

Number of database and replica syncs
Total bytes written to snapshots and WAL files
Checkpoint operations and errors
Replication operations such as GET/PUT/DELETE counts and bytes

If you supply an addr in the configuration file, Litestream will start an HTTP server (for example on “localhost:2020”) that serves metrics at /metrics.
Internally Litestream uses Prometheus client libraries and exposes metrics such as:

litestream_db_size
litestream_wal_size
litestream_sync_count
litestream_checkpoint_count
Replica-specific counters (e.g. litestream_replica_wal_bytes)
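
A quick way to check that the metrics endpoint is up is to fetch it and keep only the Litestream series. The Go sketch below assumes addr is set to localhost:2020, as in the sample configuration, and that the endpoint serves the Prometheus text format; litestreamMetricLines is a hypothetical helper.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// litestreamMetricLines is a hypothetical helper that returns the sample
// lines of a Prometheus text-format body whose metric name starts with
// "litestream_". Comment lines (# HELP, # TYPE) are skipped.
func litestreamMetricLines(body string) []string {
	var out []string
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "litestream_") {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// Assumes addr: "localhost:2020" from the sample configuration.
	resp, err := client.Get("http://localhost:2020/metrics")
	if err != nil {
		fmt.Fprintln(os.Stderr, "metrics endpoint not reachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read metrics:", err)
		return
	}
	for _, line := range litestreamMetricLines(string(body)) {
		fmt.Println(line)
	}
}
```

In production a Prometheus server would scrape the same endpoint; this sketch is only meant for a manual check.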


Embedding Litestream in your Go Application

If you would like to embed Litestream’s replication functionality in your own Go applications, you can import the package:
import "github.com/benbjohnson/litestream"
and then use functions such as:

litestream.NewDB(path) to create a new DB instance.
db.Open(), db.Sync(ctx), and finally db.Close(ctx) to actively manage the database.
Configure and add replica targets by instantiating litestream.NewReplica(db, "replica-name")
Use Replica.Sync(ctx), Replica.Snapshot(ctx) and other methods to manage replication and snapshotting.

For more details, please review the inline documentation in the source files.

Troubleshooting and FAQ

Q: My backup is not being created even though I see WAL updates.

A: Check that your database is using WAL mode (Litestream sets it by running PRAGMA journal_mode = wal; when it opens the database). Also verify that the monitor interval and checkpoint interval are properly configured, and check the logs for errors related to acquiring a read lock or copying WAL frames.

Q: I get errors that the generation has “changed” during replication.

A: This typically happens when the WAL file has been truncated by another SQLite process. In this case, Litestream starts a new generation, and the new snapshot will have a different generation name. You might need to update your restore commands to point to the new generation.

Q: My environment variables are not being expanded.

A: Environment variable expansion is enabled by default. If you run Litestream with the -no-expand-env flag, variables remain literal, so check that the flag is not set.

Q: My replication client (S3/GCS/ABS) is returning errors.

A: Ensure that your provider‑specific credentials and bucket names are set correctly. Use the replica client’s URL format if you want a one‑off override rather than the full YAML configuration.

For additional troubleshooting tips, consult the repository’s issue tracker or documentation.

License

Litestream is licensed under the Apache License 2.0. See LICENSE for details.

This guide provides a comprehensive overview of how to install, configure, and use Litestream. For more detailed information on each package, refer to the source code documentation and comments in the GitHub repository.
Happy replicating!