@josephbolus
Last active April 22, 2024 23:04
SeaweedFS Cluster Setup Guide

Configuring Replication at Startup

When you start the master server, you can set the default replication strategy with the -defaultReplication flag. The value is a three-digit code that defines how many extra copies of each file are kept and where they are placed. Common values:

  • 000: no replication
  • 001: replicate once on the same rack
  • 010: replicate once on a different rack, but same data center
  • 100: replicate once on a different data center
  • 200: replicate twice on two different data centers
  • 110: replicate once on a different rack, and once on a different data center
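
The codes above follow a three-digit pattern. As a minimal sketch, assuming the usual SeaweedFS reading of the digits (extra copies in other data centers, then on other racks in the same data center, then on other servers in the same rack), the shell function below computes how many copies of a file a given code produces. The function name total_copies is hypothetical and is not part of SeaweedFS.

```shell
#!/bin/sh
# Decode a three-digit replication code "xyz" (assumed meaning):
#   x = extra copies in other data centers
#   y = extra copies on other racks in the same data center
#   z = extra copies on other servers in the same rack
# Prints the total number of copies: the original plus all extras.
total_copies() {
  code="$1"
  case "$code" in
    [0-9][0-9][0-9]) ;;  # exactly three digits
    *) echo "invalid replication code: $code" >&2; return 1 ;;
  esac
  x=${code%??}            # first digit
  y=${code#?}; y=${y%?}   # middle digit
  z=${code#??}            # last digit
  echo $((1 + x + y + z))
}

# Print the copy count for each code listed above, plus 002.
for code in 000 001 010 100 200 110 002; do
  echo "$code -> $(total_copies "$code") copies"
done
```

The total is also a lower bound on cluster size: a code that yields three copies needs at least three volume servers placed to match its digits.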

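For example, to keep two additional copies of each file on other servers in the same rack (three copies in total, one per node in a three-node cluster), set the replication strategy to 002. Adjust the master command in the Docker Compose file on each node like this, replacing node01 in -ip with that node's own hostname: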

command: "master -ip=node01 -peers=node01:9333,node02:9333,node03:9333 -defaultReplication=002"

Changing Replication after Setup

If you need to adjust replication after the cluster is up and running, you can either restart the masters with a new -defaultReplication value or set replication per path or per request through the filer. A new default applies only to volumes created afterwards; existing volumes keep the replication they were created with. The filer offers two options:

  • You can define replication rules for specific directories with the filer's path-specific configuration, for example by running the fs.configure command in weed shell with a path prefix and a replication code. This allows different directories to have different replication settings.
  • The filer API also accepts a replication query parameter on write requests, which applies to the data written by that request.

Example

To apply a custom replication rule for a specific directory after the cluster has started, you might do something like this:

Creating a Bucket with Custom Replication:

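Use the filer to create a directory and pass a replication code with the request. The trailing slash tells the filer that the path is a directory:

curl -X POST "http://node01:8888/path/to/directory/?replication=002"

As far as the filer API documents it, the replication parameter applies to the data written by that request rather than being stored as a rule on the directory, so later uploads should pass their own replication parameter or rely on a path-specific rule.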
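
For uploads, the same query parameter can be attached to each request. The following is a sketch only, assuming a filer reachable at node01:8888 as in the example above; the function names filer_url and upload_with_replication and the file report.txt are hypothetical placeholders.

```shell
#!/bin/sh
# Build a filer URL that carries a per-request replication code.
# Arguments: host, port, path (with leading slash), replication code.
filer_url() {
  host="$1"; port="$2"; path="$3"; replication="$4"
  echo "http://${host}:${port}${path}?replication=${replication}"
}

# Upload one local file into a filer directory with the given replication.
# A trailing slash on the directory path makes the filer keep the
# uploaded file's own name. node01:8888 is assumed from the example above.
upload_with_replication() {
  file="$1"; dir="$2"; replication="$3"
  curl -sS -F "file=@${file}" "$(filer_url node01 8888 "$dir" "$replication")"
}

# Example (requires a running filer, so it is left commented out):
#   upload_with_replication ./report.txt /path/to/directory/ 002
```

Keeping the URL construction in its own function makes it easy to check the request target before sending any data.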

Docker Swarm Notes:

https://github.com/seaweedfs/seaweedfs/wiki/SeaweedFS-in-Docker-Swarm

version: '3.7'
services:
  master:
    image: chrislusf/seaweedfs:latest # Use the latest SeaweedFS image
    network_mode: host # Host networking exposes port 9333 directly, so no ports mapping is needed
    # HOST must be set to this node's hostname or IP.
    # Append -defaultReplication=001 (or another code) to set the default replication.
    command: "master -ip=${HOST} -port=9333 -mdir=/data -volumeSizeLimitMB=1024 -garbageThreshold=0.01 -peers=node01:9333,node02:9333,node03:9333"
    volumes:
      - "./data/master:/data:rw" # Persist data
  volume:
    image: chrislusf/seaweedfs:latest
    network_mode: host # Volume server listens on host port 8080
    command: "volume -ip=${HOST} -port=8080 -dir=/data -mserver=node01:9333,node02:9333,node03:9333 -dataCenter=dc1"
    depends_on:
      - master
    volumes:
      - "./data/volume:/data:rw"
  filer:
    image: chrislusf/seaweedfs:latest
    network_mode: host # Filer listens on host port 8888
    command: "filer -ip=${HOST} -port=8888 -ui.deleteDir=false -maxMB=4 -downloadMaxMBps=100 -master=node01:9333,node02:9333,node03:9333"
    depends_on:
      - master
    volumes:
      - "./data/filer:/data:rw"