Last active
March 18, 2016 14:47
Comment on Patch Set #4 posted March 10th for http://review.gluster.org/#/c/12250/
> I like where this is going. Perhaps we could consider the steps an
> admin has to take and orient the commands to address what they need
> to do.
>
> First, having to "kill -9" a process as an interface to stopping a
> brick seems dicey.
>
> So to that end, perhaps we could have two commands: the first to
> tell gluster to take the brick out of service, allowing the admin
> to then replace it, and the second to bring the brick back
> into service.
>
> For example:
>
> gluster volume name: example-vol
> 6 node cluster, each member providing 12 bricks, three way replicated, distributed
> members are named node-0 ... node-5
> disks are named /srv/brick-00/brick.0 ... /srv/brick-11/brick.0
>
> Let's say the disk for /srv/brick-02/brick.0 on node-1 goes bad.
>
> Today I believe I have to:
>
> 1. ssh root@node-1
> 2. kill -9 $(gluster volume status | grep /srv/brick-02/brick | awk '{print $3}')
> 3. # replace physical disk
> 4. remount /srv/brick-02
> 5. mkdir /srv/brick-02/brick.1
> 6. # on a FUSE mnt, do the directory dance and the xattr dance
> 7. # ensure other node brick trusted IDs for healing are correct
> 8. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 node-1:/srv/brick-02/brick.1 commit force
>
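Step 2 of the current procedure depends on which column of `gluster volume status` holds the PID, and that layout may not be the same in every release. A minimal sketch, assuming the PID is the last whitespace-separated field on the brick's line and that the line is not wrapped; `brick_pid` is a hypothetical helper name, not part of the gluster CLI:

```shell
#!/bin/sh
# brick_pid: hypothetical helper, not part of the gluster CLI.
# Reads `gluster volume status` style text on stdin and prints the last
# field of the line whose second field equals the given HOST:PATH.
# Assumption: the PID is the final column and the brick line is not wrapped.
brick_pid() {
    awk -v brick="$1" '$1 == "Brick" && $2 == brick { print $NF }'
}

# Intended use on the affected node (not run here):
#   gluster volume status example-vol | brick_pid node-1:/srv/brick-02/brick.0
```

Matching on the full HOST:PATH rather than grepping for the path alone also avoids picking up the same brick path on the other members of the cluster.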
> Perhaps we could do the following instead:
>
> 1. ssh root@node-1
> 2. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 offline
> 3. # replace physical disk
> 4. remount /srv/brick-02
> 5. mkdir /srv/brick-02/brick.0
> 6. gluster volume replace-brick example-vol node-1:/srv/brick-02/brick.0 online
>
> Then the new step #6 would take care of the old steps #6 & #7. We
> would lose the "kill -9" and replace it with a declaration-of-intention
> command that tells gluster to "take this brick offline,
> stopping the brick process, and removing references to the mount
> point".
>
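The proposed sequence could be sketched as a script that only prints the commands for one brick. The `offline` and `online` keywords are the proposal made in this comment, not options of the existing `replace-brick` command, and `proposed_steps` is a hypothetical name; nothing here is run against a cluster:

```shell
#!/bin/sh
# proposed_steps: hypothetical dry-run helper that prints the proposed
# workflow for one brick. It never invokes gluster itself.
# NOTE: "offline" and "online" are keywords proposed in this comment,
# not options of the existing replace-brick command.
proposed_steps() {
    vol=$1            # volume name, e.g. example-vol
    brick=$2          # HOST:PATH, e.g. node-1:/srv/brick-02/brick.0
    path=${brick#*:}  # strip "HOST:" to get the brick directory
    mnt=${path%/*}    # parent directory is assumed to be the mount point

    echo "gluster volume replace-brick $vol $brick offline"
    echo "# replace physical disk"
    echo "# remount $mnt"
    echo "mkdir $path"
    echo "gluster volume replace-brick $vol $brick online"
}

proposed_steps example-vol node-1:/srv/brick-02/brick.0
```

Printing rather than executing keeps the sketch safe to run anywhere, and shows that the admin names the brick once and reuses the same path when bringing it back.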
> What do you all think?
>
> Please excuse my ignorance of the ins-and-outs of gluster commands
> if I have the above steps off a bit.