@portante
Last active March 18, 2016 14:47
Comment on Patch Set #4 posted March 10th for http://review.gluster.org/#/c/12250/
> I like where this is going. Perhaps we could consider the steps an
> admin has to take and orient the commands to address what they need
> to do.
>
> First, having to "kill -9" a process as an interface to stopping a
> brick seems dicey.
>
> So to that end, perhaps we could have two sets of commands: the
> first to tell gluster to take the brick out of service, allowing
> the admin to then replace it, and a second to bring the brick back
> into service.
>
> For example:
>
> gluster volume name: example-vol
> 6 node cluster, each member providing 12 bricks, three way replicated, distributed
> members are named node-0 ... node-5
> disks are named /srv/brick-00/brick.0 .. /srv/brick-11/brick.0
>
> Let's say the disk for /srv/brick-02/brick.0 on node-1 goes bad.
>
> Today I believe I have to:
>
> 1. ssh root@node-1
> 2. kill -9 $(gluster volume status | grep /srv/brick-02/brick | awk '{print $3}')
> 3. # replace physical disk
> 4. remount /srv/brick-02
> 5. mkdir /srv/brick-02/brick.1
> 6. # on a FUSE mnt, do the directory dance and the xattr dance
> 7. # ensure other node brick trusted IDs for healing are correct
> 8. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 node-1:/srv/brick-02/brick.1 commit force
>
> Perhaps we could do the following instead:
>
> 1. ssh root@node-1
> 2. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 offline
> 3. # replace physical disk
> 4. remount /srv/brick-02
> 5. mkdir /srv/brick-02/brick.0
> 6. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 online
>
> Then the new step #6 would take care of the old steps #6 & #7. We
> would lose the "kill -9" and replace it with a command that
> declares intent, telling gluster to "take this brick offline,
> stopping the brick process, and removing references to the mount
> point".
>
> What do you all think?
>
> Please excuse my ignorance of the ins-and-outs of gluster commands
> if I have the above steps off a bit.
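
The proposed flow above can be sketched as a small dry-run script. It only prints the commands an admin would run and executes nothing. The `offline` and `online` keywords are the proposal made in this comment, not options assumed to exist in the gluster CLI today, and the function name `proposed_replace_steps` is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of the proposed brick replacement flow.
# Assumption: "offline" and "online" are the keywords proposed in the
# comment above; they are NOT assumed to be existing gluster options.
# Nothing is executed here; the steps are only printed.
proposed_replace_steps() {
    vol=$1      # volume name, e.g. example-vol
    node=$2     # cluster member, e.g. node-1
    mnt=$3      # brick mount point, e.g. /srv/brick-02
    brick=$4    # brick directory under the mount, e.g. brick.0
    target="$node:$mnt/$brick"

    echo "gluster volume replace-brick $vol $target offline"
    echo "# replace physical disk, then remount $mnt"
    echo "mkdir $mnt/$brick"
    echo "gluster volume replace-brick $vol $target online"
}

proposed_replace_steps example-vol node-1 /srv/brick-02 brick.0
```

Printing rather than running keeps the sketch safe to try on any machine, and it makes the symmetry of the proposal visible: the same brick path appears in both the take-out-of-service and the back-in-service step.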