Litestream is an open‑source continuous replication tool for SQLite databases. It continuously streams the WAL (Write‑Ahead Log) changes and automatically creates snapshot backups for your SQLite database. The replicated data can be stored locally (file system) or remotely on object storage services (such as Amazon S3, Google Cloud Storage, Azure Blob Storage, or even via SFTP). Litestream allows for point‑in‑time recovery, off‑site backups, and simplified high‑availability for SQLite databases.
Note: Litestream is written in Go. It includes both a Go library (for embedding replication into your applications) and a command‑line tool that reads a YAML configuration file and supports several sub‑commands to manage your backups.
- Features
- Architecture Overview
- Installation
- Configuration
- Command‑Line Interface (CLI)
- Advanced Topics
- Embedding Litestream in your Go Application
- Troubleshooting and FAQ
- License
- Continuous replication: Litestream monitors a SQLite database and continuously replicates new WAL frames to one or more replica destinations, which may be local or remote.
- Snapshot backups: In addition to live WAL-streaming, the tool automatically takes snapshots of the database at configurable intervals.
- Multiple replica targets: Use file system targets or cloud storage providers such as S3, GCS, ABS, and SFTP.
- Point‑in‑time recovery: Restore the database from a specific generation, WAL index, or timestamp.
- Configurable retention: Automatically delete obsolete snapshots and WAL segments according to a retention policy.
- Optional encryption: Use Age for client‑side encryption on backup snapshots and WAL segments.
- Built‑in metrics: Litestream exports Prometheus metrics so you can monitor WAL sizes, sync durations, and checkpoint counts.
- CLI and library: Run Litestream as a service or embed its functionality into your software.
Litestream consists of several core pieces:
- Database Manager (DB): This component opens and manages the target SQLite database. It interacts with the WAL file generated by SQLite. On startup the DB verifies the WAL header and then continuously copies new frames from the WAL into "shadow WAL" files stored in a metadata directory (by default, a hidden directory with the suffix `-litestream` is created next to the main DB file).
- Replication Engine: The DB object can be configured with one or more Replicas. A Replica is responsible for synchronizing the "shadow WAL" frames (and snapshots) to a remote destination. There are built-in replicas for:
  - Local disk (file system)
  - AWS S3
  - Google Cloud Storage (GCS)
  - Azure Blob Storage (ABS)
  - SFTP servers

  Each replica is implemented with a corresponding client that reads the shadow WAL or snapshot files, applies LZ4 compression (and optionally encryption via Age), and then writes the file to the remote destination.
- Command-Line Interface (CLI): Litestream provides a set of sub-commands to manage databases. For example:
  - `databases`: Lists all managed databases from your configuration.
  - `generations`: Lists available generations along with lag and time range statistics.
  - `replicate`: Starts the live replication process.
  - `restore`: Recovers (restores) a database from a replica.
  - `snapshots`: Lists snapshot backups available for a given database.
  - `wal`: Lists available WAL segment files.
  - `version`: Prints the current build version.
- Configuration: Litestream is configured via a YAML file where you declare one or more databases and list the connected replica destinations. Global defaults can be set in the configuration and environment variables can be used to override values.
- Metrics: Prometheus metrics are exposed (via the HTTP server on a configurable bind address) so that operational data such as the number of sync operations and checkpoint durations can be observed.
To build Litestream from source you need Go installed (version 1.18+ is recommended):
- Clone the repository:
  git clone https://github.com/benbjohnson/litestream.git
  cd litestream
- Build the binary:
  go build ./cmd/litestream
- (Optional) Install the binary:
  go install ./cmd/litestream
  The binary will be installed into your `$GOPATH/bin` (or `$HOME/go/bin` if no GOPATH is set).
Some releases (or community builds) may also be available as prebuilt binaries on the GitHub releases page.
Litestream uses a YAML configuration file to determine which databases to manage and which replica targets to use. You can also pass a replica URL on the command line for one‑off operations (e.g. for restore).
The top‑level configuration file supports:
- Addr: The bind address for serving HTTP metrics and, optionally, pprof debugging information.
- Exec: A sub-command to execute as a child process. Litestream will wait for the sub-command to exit; this is useful for process supervision.
- Access Key Defaults: Global settings which may be automatically propagated to each replica configuration if not overridden.
- Logging: Configure logging level (DEBUG, INFO, WARN, ERROR) and output (text or JSON, stderr versus stdout).
Each database is defined with a configuration block. Key properties include:
- path: The path to the SQLite database file.
- meta-path: (Optional) A custom path for storing Litestream metadata such as shadow WAL and snapshots. By default this directory is created next to the database file with a name like `.db-litestream`.
- monitor-interval: How often the database is checked for new changes.
- checkpoint-interval: Maximum amount of time to allow before triggering a checkpoint on the SQLite WAL file.
- busy-timeout: The time Litestream waits for SQLite to release locks.
- min-checkpoint-page-count / max-checkpoint-page-count: Thresholds to determine when a checkpoint should be automatically triggered.
Under each database there is a list of replica targets. Each replica block includes:
- type: The type of replica. This may be one of: `file`, `s3`, `gcs`, `abs`, or `sftp`. (If a URL is provided in the `url` field, that value is used to automatically derive the type.)
- name: An optional name by which to refer to the replica.
- path / url: For local replicas use the `path` setting (a local directory where shadow WAL and snapshots will be stored). For remote replicas, set the `url` field (for example, `s3://mybucket/backups`).
- Retention Settings: Optional parameters such as `retention` (duration to keep snapshots and WAL files) and `retention-check-interval`.
- Sync and Snapshot Intervals: Settings for how frequently a replica syncs new WAL segments and creates new snapshots, respectively.
- Provider-specific settings: For S3 you might specify your AWS `access-key-id`, `secret-access-key`, `region`, `bucket`, and optional endpoint details. For GCS and ABS, set the respective bucket names and paths. For SFTP, set the host, user, password (or key-path) accordingly.
- Age (Encryption) Settings: Under an `age` sub-block you can list one or more identities (private keys) and recipients (public keys) for encrypting replicated data.
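The note about deriving the replica type from a URL can be illustrated with a short sketch. It assumes the type is simply the URL scheme, restricted to the five types listed above; `replicaTypeFromURL` is a hypothetical helper, not a function from Litestream.

```go
package main

import (
	"fmt"
	"net/url"
)

// replicaTypeFromURL returns the replica type implied by the URL scheme.
// Assumption: the scheme maps one-to-one onto the documented replica types.
func replicaTypeFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	switch u.Scheme {
	case "file", "s3", "gcs", "abs", "sftp":
		return u.Scheme, nil
	default:
		return "", fmt.Errorf("unsupported replica scheme %q", u.Scheme)
	}
}

func main() {
	for _, s := range []string{
		"s3://mybucket/backups",
		"sftp://backup.example.com/srv/backups",
		"ftp://example.com/backups",
	} {
		typ, err := replicaTypeFromURL(s)
		if err != nil {
			fmt.Println(s, "->", err)
			continue
		}
		fmt.Println(s, "->", typ)
	}
}
```

For a URL such as `s3://mybucket/backups`, Go's parser places `mybucket` in the host component and `/backups` in the path, which is a convenient split for a bucket plus key prefix.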
Below is an example YAML configuration file:
# litestream.yml
# Bind address to serve metrics (Prometheus metrics endpoint)
addr: "localhost:2020"
# Global AWS credentials that will be propagated if not overridden.
access-key-id: "YOUR_GLOBAL_ACCESS_KEY"
secret-access-key: "YOUR_GLOBAL_SECRET_KEY"
# Logging configuration
logging:
  level: "INFO"
  type: "text" # "json" is also supported
  stderr: false
dbs:
  - path: "/var/lib/sqlite/mydb.sqlite"
    # Optionally override meta directory (default: same directory prefixed with a dot)
    meta-path: "/var/lib/sqlite/.mydb-litestream"
    monitor-interval: 1s
    checkpoint-interval: 1m
    busy-timeout: 1s
    min-checkpoint-page-count: 1000
    max-checkpoint-page-count: 10000
    replicas:
      - type: "s3"
        name: "s3-backup"
        url: "s3://mybucket/sqlite-backups"
        retention: 24h
        retention-check-interval: 1h
        sync-interval: 1s
        snapshot-interval: 1h
        # Optionally override AWS credentials per replica
        access-key-id: "YOUR_S3_ACCESS_KEY"
        secret-access-key: "YOUR_S3_SECRET_KEY"
        region: "us-east-1"
      - type: "file"
        name: "local-backup"
        path: "/var/backups/mydb"
        retention: 72h
        sync-interval: 1s
You can use environment variable expansion in the configuration file if needed. For example:
dbs:
  - path: "$HOME/data/mydb.sqlite"
    replicas:
      - url: "${LITESTREAM_REPLICA_URL}"
If you do not want Litestream to expand environment variables, use the `-no-expand-env` flag when running a command.
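The following sketch shows what shell-style expansion of `$VAR` and `${VAR}` does to a configuration string, using Go's standard `os.Expand`. It assumes Litestream's expansion behaves like the standard library's; `expandConfig` is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces $VAR and ${VAR} references in raw using lookup.
// Passing os.Getenv as lookup reads values from the process environment.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	raw := "dbs:\n" +
		"  - path: \"$HOME/data/mydb.sqlite\"\n" +
		"    replicas:\n" +
		"      - url: \"${LITESTREAM_REPLICA_URL}\"\n"

	// Variables that are not set expand to the empty string.
	fmt.Print(expandConfig(raw, os.Getenv))
}
```

Because every `$` is treated as the start of a variable reference, values that legitimately contain a dollar sign (some passwords, for instance) can be altered by expansion, which is one reason a flag to turn it off is useful.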
Litestream comes with several sub‑commands. They can be invoked from the command line as follows:
litestream <command> [arguments]
Below is a summary of each command and its available options.
The `databases` command lists all managed databases from your configuration along with the replica names.
Usage example:
litestream databases -config /path/to/litestream.yml
Output:
A table with columns such as "path" and "replicas" is printed.
The `generations` command lists all available generations for a database (or replica URL). For every generation it displays the generation name (a unique identifier), the replication lag, and the start/end times of the backups.
Usage example:
litestream generations -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
Or to filter by replica URL:
litestream generations s3://mybucket/sqlite-backups
Parameters:
- `-config PATH`: Path to configuration file (default: `/etc/litestream.yml` on Unix and platform-specific on Windows).
- `-replica NAME`: Filter to display generations only for a specific replica.
The `replicate` command starts the replication server. It continuously monitors the SQLite database, syncs changes (shadow WAL frames) to the backup replicas, takes snapshots, enforces retention policies, and (optionally) supervises a child process.
Usage example:
litestream replicate -config /path/to/litestream.yml
Additional options include:
- `-exec CMD`: If specified, Litestream will launch the given command as a subprocess. Litestream will shut down when that process exits.
- `-no-expand-env`: Prevents expansion of environment variables in the configuration file.

During replication, Litestream will log information about connected replicas and metrics (if you have configured an HTTP metrics server via the `addr` setting).
The `restore` command restores a database backup from a replica. It downloads the relevant snapshot and WAL segments (up to a specific index or timestamp) and applies them to recover the SQLite database.
Usage examples:
- Restore the latest replica backup back into the original database location:
  litestream restore -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
- Restore a database to a given point-in-time:
  litestream restore -timestamp 2020-01-01T00:00:00Z -o /tmp/restore.sqlite /var/lib/sqlite/mydb.sqlite
- Restore from a specific replica (for example, an S3 target) and specific generation:
  litestream restore -replica s3 -generation 0123456789abcdef /var/lib/sqlite/mydb.sqlite
Additional restore parameters:
- `-o PATH`: The output location for the restored database.
- `-replica NAME`: (Optional) Restore from the given replica.
- `-generation NAME`: (Optional) Restore from a specific generation backup.
- `-index INDEX`: (Hex-encoded, optional) Restore up to that WAL index.
- `-timestamp TIMESTAMP`: Restore the database to the point in time specified.
- `-if-db-not-exists`: If the target file already exists, exit with code 0.
- `-parallelism NUM`: Number of WAL files to download in parallel (default is 8).
The `snapshots` command lists all snapshot backups available for a given database or replica. For every snapshot, the output shows the replica it came from, the generation, index, file size, and creation timestamp.
Usage example:
litestream snapshots -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
Or for a specific replica:
litestream snapshots -replica s3 /var/lib/sqlite/mydb.sqlite
The `wal` command lists all available WAL segment files (along with index, offset, size, and creation time) for a given database or replica.
Usage example:
litestream wal -config /path/to/litestream.yml /var/lib/sqlite/mydb.sqlite
You can filter by replica (using `-replica NAME`) or specific generation (using the `-generation` flag).
The `version` command prints the current Litestream version.
Usage example:
litestream version
Litestream supports multiple replica back‑ends. Here are a few examples:
- Local File (file): Replicates backup files to a local or network file system. Simply specify the file system path.
- Amazon S3 (s3): Works with S3-compatible object storage. Pass your bucket name and optional endpoint (for non-AWS S3, such as MinIO or Backblaze).
- Google Cloud Storage (gcs): Uses GCS. Specify the bucket name and object path.
- Azure Blob Storage (abs): Works with Microsoft Azure's Blob Storage. Provide your account name, key, and container (bucket) name.
- SFTP: Upload backups through an SSH/SFTP connection.
Some clients (like S3, GCS, ABS, SFTP) can be configured either via an explicit URL (for one‑off operations) or in the YAML configuration. They each implement the same ReplicaClient interface which provides functions to list generations, write/read snapshots, and delete obsolete objects.
Litestream maintains a “generation” for each backup cycle. When the database’s WAL cannot be verified against the last “shadow WAL” file (for example, when the WAL is truncated or overwritten) a new generation is started. A generation is typically associated with a snapshot (a full copy of the database) and a subsequent series of WAL segments. You can configure retention on a per‑replica basis so that older generations and their corresponding snapshots and WAL segments are automatically deleted after a set period.
For secure backups, Litestream supports client-side encryption using the Age encryption tool. In your replica configuration you can supply an `age` sub-block with a list of identities (private keys) for decryption and recipients (public keys) for encryption. When enabled, snapshots and WAL segments are encrypted before being uploaded to the remote replica.
Litestream is instrumented with Prometheus metrics that track:
- Number of database and replica syncs
- Total bytes written to snapshots and WAL files
- Checkpoint operations and errors
- Replication operations such as GET/PUT/DELETE counts and bytes
If you supply an `addr` in the configuration file, Litestream will start an HTTP server (for example on `localhost:2020`) that serves metrics at `/metrics`.
Internally Litestream uses Prometheus client libraries and exposes metrics such as:
- `litestream_db_size`
- `litestream_wal_size`
- `litestream_sync_count`
- `litestream_checkpoint_count`
- Replica-specific counters (e.g. `litestream_replica_wal_bytes`)
If you would like to embed Litestream’s replication functionality in your own Go applications, you can import the package:
import "github.com/benbjohnson/litestream"
and then use functions such as:
- `litestream.NewDB(path)` to create a new DB instance.
- `db.Open()`, `db.Sync(ctx)`, and finally `db.Close(ctx)` to actively manage the database.
- Configure and add replica targets by instantiating `litestream.NewReplica(db, "replica-name")`.
- Use `Replica.Sync(ctx)`, `Replica.Snapshot(ctx)` and other methods to manage replication and snapshotting.
For more details, please review the inline documentation in the source files.
A: If changes are not being replicated, check that your database is using WAL mode (Litestream runs the SQL command `PRAGMA journal_mode = wal;`). Also verify that the Litestream monitor interval and checkpoint intervals are properly configured. Check the logs for any errors related to acquiring a read lock or copying WAL frames.
A: If the generation name changes unexpectedly, the WAL file has typically been truncated by another SQLite process. In this case, Litestream starts a new generation and the new snapshot will have a different generation name. You might need to update your restore commands to point to the new generation.
A: Environment variable expansion is enabled by default. If you run Litestream with the `-no-expand-env` flag, variables will remain literal.
A: If a remote replica cannot be reached or rejects requests, ensure that your provider-specific credentials and bucket names are correctly set. Use the replica client's URL format if you want a one-off override rather than using the full YAML configuration.
For additional troubleshooting tips consult the repository’s issue tracker or documentation.
Litestream is licensed under the MIT License. See LICENSE for details.
This guide provides a comprehensive overview of how to install, configure, and use Litestream. For more detailed information on each package, refer to the source code documentation and comments in the GitHub repository.
Happy replicating!