Deploy a high availability Ceph cluster using Microceph

Minimal High Availability Ceph cluster using Microceph in less than 10 min

This gist provides you with the steps to deploy a minimally viable High Availability (HA) Ceph Cluster. It follows the Microceph Multi-node install guide but adds a little more detail to make the deployment simpler.

Architecture

*Minimal HA Ceph cluster*

Prerequisites

3 nodes (bare metal or virtual). The steps described in this gist should work on any Linux distribution that supports the snap daemon (a quick snapd check is sketched after this list). The nodes can be:

  1. In a private subnet within any cloud provider, even with mixed-architecture VMs. MicroCeph is opinionated and installed using snaps, so there is no need to worry about the underlying hardware.
  2. At your home, on inexpensive commodity hardware like Raspberry Pi/Orange Pi boards or NUCs.
  3. Virtual machines created using Proxmox or VMware ESXi.
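
If snapd is not already installed on your nodes, set it up first. A minimal sketch, assuming Debian/Ubuntu-based nodes (use your distribution's package manager otherwise):

# Install and verify the snap daemon (Debian/Ubuntu example; adjust for your distribution)
sudo apt update && sudo apt install -y snapd
snap version # Confirms snapd is available before installing MicroCeph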

Node-1: Prepare Ceph Cluster

We are going to bootstrap our Ceph cluster on a machine that we will call Node-1.

# Name your cluster node first. This will come in handy later if you use an FQDN for the object gateway from Kubernetes clusters.
sudo hostnamectl set-hostname <desired-hostname> 
# Turn off swap on the machine. Machines running Ceph shouldn't have swap enabled.
sudo swapoff -a 
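# (Optional) Keep swap disabled across reboots by commenting out swap entries in /etc/fstab.
# The sed one-liner below is one common approach (it saves a backup at /etc/fstab.bak);
# review the file afterwards to confirm only swap lines were changed.
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab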

# Install MicroCeph from the edge channel, which carries the latest fixes.
sudo snap install microceph --edge 
# Hold refreshes: without an upgrade strategy, Ceph shouldn't be updated as part of OS or snap daemon upgrades.
sudo snap refresh --hold microceph 

# Bootstraps the Ceph cluster
sudo microceph cluster bootstrap 

# Generate tokens for other nodes to join the cluster
sudo microceph cluster add node-2 
sudo microceph cluster add node-3 
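
Before adding the other members, it's worth confirming that the bootstrap succeeded. At this point the cluster should report a single node:

# Optional sanity check: status should list only Node-1 at this stage
sudo microceph status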

Join other nodes for High Availability

We are going to run the following commands on the rest of the nodes to join them to the cluster.

# Name your cluster node first
sudo hostnamectl set-hostname <desired-hostname> 
# Turn off swap
sudo swapoff -a

# Install MicroCeph from the edge channel, which carries the latest fixes.
sudo snap install microceph --edge 
# Without an upgrade strategy for Ceph you shouldn't update it on any node.
sudo snap refresh --hold microceph 

# Join the cluster using the unique token generated on Node-1 for this node
sudo microceph cluster join <token> # Use the token from above that matches this node
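
For a quick check from the node you just joined, the member list should now include it:

# Optional: confirm this node now shows up as a cluster member
sudo microceph cluster list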

Validate Ceph Cluster status

Let's validate the state of our cluster deployment. You can do this from any of the nodes after the joins have completed.

alias ceph="sudo microceph.ceph" # Alias includes sudo so that `ceph` works as-is in this shell
sudo microceph status # Provides a summary of the Ceph cluster nodes
ceph status # Provides a detailed summary of the Ceph cluster


Add storage

Time to add storage to our Ceph cluster. Repeat these steps on each node in the cluster. For this to work you will need a raw, unpartitioned device. You can also use loop devices (not recommended for production scenarios).

Since my cluster runs on SBCs with Debian Bookworm server, the OS footprint is tiny: the primary drive is 256 GB, but the OS only uses around 2-4 GB. So I am going to use loop devices to add more storage to the Ceph cluster.
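
Before adding disks, it helps to confirm which block devices are present and unpartitioned; a quick look with lsblk is usually enough:

# List block devices; raw devices with no partitions or mountpoints are candidates for OSDs
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT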

sudo microceph disk add loop,75G,2 # Add loop devices: creates two 75 GB OSDs backed by your primary drive
sudo microceph disk add /dev/<drive-name> --wipe # Add a dedicated raw device. In my case it's nvme0n1.
sudo microceph disk add --all-available --wipe # Or add all available raw devices on the system

Validate Storage on Ceph Cluster

Time to validate that everything is set up correctly for our storage layer.

# The following command should show different results from before we added storage. You will see the combined size of all your added OSDs.
sudo microceph.ceph status
# List each disk and the node it lives on
sudo microceph disk list
*HA Ceph cluster with multiple OSDs spread across all nodes*
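
For a per-node view of how the OSDs are distributed, the standard Ceph OSD tree is also available through the snap-namespaced client:

# Shows each host, the OSDs placed on it, and their weight and up/down status
sudo microceph.ceph osd tree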

With these few commands, you now have an operational HA Ceph cluster in under 10 minutes.
