From c9eff98ed354d87eb4bf0b39b568efdb24c413b2 Mon Sep 17 00:00:00 2001
From: Alan Orth
Date: Thu, 20 Jul 2023 09:05:41 +0300
Subject: [PATCH] Update replace-brick documentation (#782)

* docs: update replace-brick notes

We no longer need to do manual steps before replacing a brick. It is
enough to use `replace-brick` without killing brick processes or
setting xattrs manually (this has been true for *years*).

* docs: minor improvements to replace-brick text

Use code blocks for brick name and simplify formatting in places to be
consistent with other examples. Also, it seems that MkDocs renders this
Markdown slightly differently than GitHub.

* docs: reverse order of pure vs distributed replicate

The "pure" replicate example comes first in the docs, so it seems
natural to list it first here.

* Update docs/Administrator-Guide/Managing-Volumes.md

Co-authored-by: Karthik Subrahmanya

---------

Co-authored-by: Karthik Subrahmanya
---
 docs/Administrator-Guide/Managing-Volumes.md | 87 ++------------------
 1 file changed, 6 insertions(+), 81 deletions(-)

diff --git a/docs/Administrator-Guide/Managing-Volumes.md b/docs/Administrator-Guide/Managing-Volumes.md
index 191790b7..6b280f68 100644
--- a/docs/Administrator-Guide/Managing-Volumes.md
+++ b/docs/Administrator-Guide/Managing-Volumes.md
@@ -185,7 +185,7 @@ operation to migrate data from the removed-bricks to the rest of the volume.
 
 To replace a brick on a distribute only volume, add the new brick and then remove the brick you want to replace. This will trigger a rebalance operation which will move data from the removed brick.
 
-> NOTE: Replacing a brick using the 'replace-brick' command in gluster is supported only for distributed-replicate or _pure_ replicate volumes.
+> NOTE: Replacing a brick using the 'replace-brick' command in gluster is supported only for _pure_ replicate or distributed-replicate volumes.
 
 Steps to remove brick Server1:/home/gfs/r2_1 and add Server1:/home/gfs/r2_2:
 
@@ -265,73 +265,9 @@ This section of the document describes how brick: `Server1:/home/gfs/r2_0` is re
 
 Steps:
 
-1. Make sure there is no data in the new brick Server1:/home/gfs/r2_5
+1. Make sure there is no data in the new brick `Server1:/home/gfs/r2_5`
 2. Check that all the bricks are running. It is okay if the brick that is going to be replaced is down.
-3. Bring the brick that is going to be replaced down if not already.
-
-    - Get the pid of the brick by executing 'gluster volume status'
-
-        # gluster volume status
-        Status of volume: r2
-        Gluster process                                 Port    Online  Pid
-        ------------------------------------------------------------------------------
-        Brick Server1:/home/gfs/r2_0                    49152   Y       5342
-        Brick Server2:/home/gfs/r2_1                    49153   Y       5354
-        Brick Server1:/home/gfs/r2_2                    49154   Y       5365
-        Brick Server2:/home/gfs/r2_3                    49155   Y       5376
-
-    - Login to the machine where the brick is running and kill the brick.
-
-        # kill -15 5342
-
-    - Confirm that the brick is not running anymore and the other bricks are running fine.
-
-        # gluster volume status
-        Status of volume: r2
-        Gluster process                                 Port    Online  Pid
-        ------------------------------------------------------------------------------
-        Brick Server1:/home/gfs/r2_0                    N/A     N       5342   <<---- brick is not running, others are running fine.
-        Brick Server2:/home/gfs/r2_1                    49153   Y       5354
-        Brick Server1:/home/gfs/r2_2                    49154   Y       5365
-        Brick Server2:/home/gfs/r2_3                    49155   Y       5376
-
-4. Using the gluster volume fuse mount (In this example: `/mnt/r2`) set up metadata so that data will be synced to new brick (In this case it is from `Server1:/home/gfs/r2_1` to `Server1:/home/gfs/r2_5`)
-
-    - Create a directory on the mount point that doesn't already exist. Then delete that directory, and do the same for the metadata changelog by doing setfattr. This operation marks the pending changelog which will tell self-heal daemon/mounts to perform self-heal from `/home/gfs/r2_1` to `/home/gfs/r2_5`.
-
-        mkdir /mnt/r2/<name-of-nonexistent-dir>
-        rmdir /mnt/r2/<name-of-nonexistent-dir>
-        setfattr -n trusted.non-existent-key -v abc /mnt/r2
-        setfattr -x trusted.non-existent-key /mnt/r2
-
-    - Check that there are pending xattrs on the replica of the brick that is being replaced:
-
-        getfattr -d -m. -e hex /home/gfs/r2_1
-        # file: home/gfs/r2_1
-        security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
-        trusted.afr.r2-client-0=0x000000000000000300000002   <<---- xattrs are marked from source brick Server2:/home/gfs/r2_1
-        trusted.afr.r2-client-1=0x000000000000000000000000
-        trusted.gfid=0x00000000000000000000000000000001
-        trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
-        trusted.glusterfs.volume-id=0xde822e25ebd049ea83bfaa3c4be2b440
-
-5. Volume heal info will show that '/' needs healing. (There could be more entries based on the workload, but '/' must exist.)
-
-        # gluster volume heal r2 info
-        Brick Server1:/home/gfs/r2_0
-        Status: Transport endpoint is not connected
-
-        Brick Server2:/home/gfs/r2_1
-        /
-        Number of entries: 1
-
-        Brick Server1:/home/gfs/r2_2
-        Number of entries: 0
-
-        Brick Server2:/home/gfs/r2_3
-        Number of entries: 0
-
-6. Replace the brick with 'commit force' option. Please note that other variants of replace-brick command are not supported.
+3. Replace the brick with 'commit force' option. Please note that other variants of replace-brick command are not supported.
 
     - Execute replace-brick command
 
@@ -344,25 +280,14 @@ Steps:
 
         Status of volume: r2
         Gluster process                                 Port    Online  Pid
         ------------------------------------------------------------------------------
-        Brick Server1:/home/gfs/r2_5                    49156   Y       5731   <<<---- new brick is online
+        Brick Server1:/home/gfs/r2_5                    49156   Y       5731   <---- new brick is online
         Brick Server2:/home/gfs/r2_1                    49153   Y       5354
         Brick Server1:/home/gfs/r2_2                    49154   Y       5365
         Brick Server2:/home/gfs/r2_3                    49155   Y       5376
 
-    - Users can track the progress of self-heal using: `gluster volume heal [volname] info`.
-      Once self-heal completes the changelogs will be removed.
-
-        # getfattr -d -m. -e hex /home/gfs/r2_1
-        getfattr: Removing leading '/' from absolute path names
-        # file: home/gfs/r2_1
-        security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
-        trusted.afr.r2-client-0=0x000000000000000000000000   <<---- Pending changelogs are cleared.
-        trusted.afr.r2-client-1=0x000000000000000000000000
-        trusted.gfid=0x00000000000000000000000000000001
-        trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
-        trusted.glusterfs.volume-id=0xde822e25ebd049ea83bfaa3c4be2b440
+    - Users can track the progress of self-heal using: `gluster volume heal [volname] info`, or by checking the size of the new brick.
 
-    - `# gluster volume heal info` will show that no heal is required.
+    - `# gluster volume heal info` will show that no heal is required when the data is fully synced to the replaced brick.
 
         # gluster volume heal r2 info
         Brick Server1:/home/gfs/r2_5
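
The simplified procedure this patch documents boils down to one command plus monitoring. A minimal sketch, assuming the same volume `r2` and brick names used in the examples above (the volume name and paths are illustrative):

    # Replace the old brick with the new one in a single step;
    # 'commit force' is the only supported variant of replace-brick.
    gluster volume replace-brick r2 Server1:/home/gfs/r2_0 Server1:/home/gfs/r2_5 commit force

    # Confirm the new brick shows as online.
    gluster volume status r2

    # Track self-heal progress; the replacement is complete once every
    # brick reports 'Number of entries: 0'.
    gluster volume heal r2 info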