Docu review done: Mon 03 Jul 2023 17:13:04 CEST
Getting state
Command | Description |
---|---|
watch -n1 -d 'cat /proc/drbd' | shows you the actual state and connection |
drbd-overview | shows you the state and connection with a bit less detail |
Create drbd on lvm
To create the drbd, you first need to set up the disk/partition/lv; a short summary with lv below:
$ pvcreate /dev/sdx
$ vgcreate drbdvg /dev/sdx
$ lvcreate --name r0lv --size 10G drbdvg
Of course you also need to have the package installed ;)
$ apt install drbd-utils
Next is to create the drbd configuration. In our sample we use r0 as the resource name. In here you specify the hosts which are part of the drbd cluster and where the drbd gets stored. This config needs to be present on all drbd cluster members; the same goes of course for the package drbd-utils and the needed space to store the drbd.
$ cat << EOF > /etc/drbd.d/r0.res
resource r0 {
    device /dev/drbd0;
    disk /dev/drbdvg/r0lv;
    meta-disk internal;
    on server01 {
        address 10.0.0.1:7789;
    }
    on server02 {
        address 10.0.0.2:7789;
    }
}
EOF
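Since exactly the same file has to be present on every cluster member, it can help to generate it from a small helper instead of editing it by hand twice. A minimal sketch, assuming a POSIX shell and the sample names from above (write_res_config is a made-up helper, not part of drbd-utils):

```shell
# write_res_config OUTFILE -- write the sample r0 resource file from above.
# Hostnames, IPs and the LV path are the sample values from this page.
write_res_config() {
    out="$1"
    cat > "$out" << 'EOF'
resource r0 {
    device /dev/drbd0;
    disk /dev/drbdvg/r0lv;
    meta-disk internal;
    on server01 {
        address 10.0.0.1:7789;
    }
    on server02 {
        address 10.0.0.2:7789;
    }
}
EOF
}

# run it on (or copy the result to) every cluster member:
# write_res_config /etc/drbd.d/r0.res
```

drbdadm dump r0 is a quick way to check afterwards that the file parses.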
Now we are ready to create the resource r0 in drbd and start up the service:
$ drbdadm create-md r0
$ systemctl start drbd.service
You can also start up the drbd manually by running the following:
$ drbdadm up r0
Make sure that the members are now connected to each other by checking drbd-overview or cat /proc/drbd:
$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 12341234123412341234123
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:100 dw:100 dr:0 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
If it looks like the above, you are good to go. If not, you need to figure out why the connection is not getting established; check tcpdump and so on.
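The interesting fields in such a device line are cs: (connection state), ro: (roles) and ds: (disk states). A small awk helper can pull just those out of /proc/drbd; drbd_fields is a made-up name, the field layout is the one shown above:

```shell
# drbd_fields -- read /proc/drbd on stdin and print "cs ro ds" for every
# device line (the lines starting with "N:").
drbd_fields() {
    awk '/^ *[0-9]+:/ {
        cs = ro = ds = "?"
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)
            if ($i ~ /^ro:/) ro = substr($i, 4)
            if ($i ~ /^ds:/) ds = substr($i, 4)
        }
        print cs, ro, ds
    }'
}

# e.g.: drbd_fields < /proc/drbd
```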
Now we set one of the members to primary:
$ drbdadm primary --force r0
If you are facing issues with the command above, use this one:
$ drbdadm -- --overwrite-data-of-peer primary r0
Extend drbd live
To extend a drbd, you first need to extend the underlying lv/pv/partition/md or whatever you use, on all drbd cluster members. In our sample we go with lv:
# connect to the master and extend the lv
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv> # e.g. lvextend -L +24G /dev/drbdvg/r0lv
# connect to the slave and do the same (be careful, it must have the !!! SAME SIZE !!!)
$ lvextend -L +[0-9]*G /dev/<drbdvg>/<drbdlv> # e.g. lvextend -L +24G /dev/drbdvg/r0lv
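Because the backing devices must end up with exactly the same size on all members, it is worth comparing the byte-exact LV size on both nodes before running the resize. A sketch using lvs (lv_size_bytes is a made-up helper name):

```shell
# lv_size_bytes LVPATH -- print the exact size of an LV in bytes, so the
# numbers can be compared between the cluster members.
lv_size_bytes() {
    lvs --noheadings --nosuffix --units b -o lv_size "$1" | tr -d ' '
}

# run on every member and compare:
# lv_size_bytes /dev/drbdvg/r0lv
```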
Now you should start to monitor the drbd state with one of the commands from Getting state. On the primary server, we perform the resize command. Right after you have executed it, you will see that drbd starts to sync the "data" from scratch to the other cluster members.
$ drbdadm resize r0
This resync can take a while, depending on your drbd size, network, hardware,…
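Instead of watching the progress bar by hand, you can poll /proc/drbd until the resync line is gone. A minimal sketch; drbd_synced is a made-up helper that only greps for the sync'ed progress line /proc/drbd shows during a resync:

```shell
# drbd_synced FILE -- return 0 when FILE (normally /proc/drbd) no longer
# shows a resync in progress, non-zero while a "sync'ed: ..%" line is present.
drbd_synced() {
    ! grep -q "sync'ed" "$1"
}

# e.g.: until drbd_synced /proc/drbd; do sleep 10; done
```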
If you have more than one drbd resource, you could use the keyword all instead of the resource name, but make sure that you have prepared everything:
$ drbdadm resize all
Let's assume the resync finished; now you are ready to extend the filesystem inside the drbd itself. Again, run this on the primary server:
$ xfs_growfs /mnt/drbd_r0_data
Remove DRBD resource/device
Let's assume we want to remove the resource r1. First you need to see which resources you have:
$ drbd-overview
NOTE: drbd-overview will be deprecated soon.
Please consider using drbdtop.
0:r0/0 Connected Secondary/Primary UpToDate/UpToDate
1:r1/0 Connected Secondary/Primary UpToDate/UpToDate
If the system where you are currently connected is set to Secondary, you are good already; otherwise you need to change it first to have that state.
Now you can disconnect it by running drbdadm disconnect r1; drbd-overview or a cat /proc/drbd will then show you the state StandAlone.
Next step is to detach it like this: drbdadm detach r1. If you check again, drbd-overview will look different to cat /proc/drbd:
$ drbd-overview | grep r1
1:r1/0 . . .
$ cat /proc/drbd | grep "1:"
1: cs:Unconfigured
Good so far. As you don't want to keep data on there, you should wipe it:
$ drbdadm wipe-md r1
Do you really want to wipe out the DRBD meta data?
[need to type 'yes' to confirm] yes
Wiping meta data...
DRBD meta data block successfully wiped out.
echo "yes" | drbdadm wipe-md r1 also works, if you need it in a script.
Now we are nearly done; next is to remove the minor. del-minor wants the resource number, which you can see in drbd-overview 2>&1, e.g. by piping it through grep -E '^ *[0-9]:' | grep -E "[0-9]+"
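The grep dance can be wrapped so the minor number for a given resource drops out directly. minor_of is a made-up helper that parses the drbd-overview line format shown above (N:name/volume ...):

```shell
# minor_of RESOURCE -- read drbd-overview output on stdin and print the
# minor number in front of RESOURCE.
minor_of() {
    awk -F: -v r="$1" '$2 ~ "^" r "/" { gsub(/ /, "", $1); print $1 }'
}

# e.g.: drbdsetup del-minor "$(drbd-overview 2>&1 | minor_of r1)"
```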
$ drbdsetup del-minor 1
Now we are good to go and remove the resource fully
$ drbdsetup del-resource r1
The last step is to remove the resource file beneath /etc/drbd.d/r1.res, if you don't have it automated ;)
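The whole removal sequence from this section can be collected into one small function. remove_drbd_resource is a made-up helper; with DRY_RUN=1 it only prints the commands, so the order can be checked before touching a real cluster:

```shell
# remove_drbd_resource RESOURCE MINOR -- run the removal steps from this
# section in order. Set DRY_RUN=1 to only print the commands.
remove_drbd_resource() {
    res="$1"; minor="$2"
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
    }
    run drbdadm disconnect "$res"
    run drbdadm detach "$res"
    run sh -c "echo yes | drbdadm wipe-md $res"
    run drbdsetup del-minor "$minor"
    run drbdsetup del-resource "$res"
    run rm "/etc/drbd.d/$res.res"
}

# e.g.: DRY_RUN=1 remove_drbd_resource r1 1
```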
Solving issues
one part of drbd is corrupt
Assuming r0 is your resource name.
First we want to disconnect the cluster; run the commands on one of the servers, mostly on the corrupted one:
$ drbdadm disconnect r0
$ drbdadm detach r0
If they are not disconnected, restart the drbd service.
Now remove the messed-up device and start to recreate it:
$ drbdadm wipe-md r0
$ drbdadm create-md r0
If you had to stop the drbd service, make sure that it is started again.
Next step is to go to the server which holds the working data and run:
$ drbdadm connect r0
If it's not working or they are in the Secondary/Secondary state, run (only after they are in sync):
$ drbdadm -- --overwrite-data-of-peer primary r0
Situation Primary/Unknown - Secondary/Unknown
Connect to the slave and run
$ drbdadm -- --discard-my-data connect all
Secondary returns:
r0: Failure: (102) Local address(port) already in use. Command 'drbdsetup-84 connect r0 ipv4:10.42.13.37:7789 ipv4:10.13.37.42:7789 --max-buffers=40k --discard-my-data' terminated with exit code 10
Then just perform a drbdadm disconnect r0 and run the command from above again.
Connect to the master
$ drbdadm connect all
Situation primary/primary
Option 1
Connect to the server which should be secondary.
Just make sure that this one really has no needed data on it:
$ drbdadm secondary r0
Option 2
Connect to the real master and run this to make it the only primary:
$ drbdadm -- --overwrite-data-of-peer primary r0
Now you have the state Primary/Unknown and Secondary/Unknown.
Connect to the slave and remove the data
$ drbdadm -- --discard-my-data connect all
Situation r0 Unconfigured
drbd shows status on the slave:
$ drbd-overview
Please consider using drbdtop.
0:r0/0 Unconfigured . .
Run drbdadm up to bring the device up again:
$ drbdadm up r0
and check out the status
$ drbd-overview
Please consider using drbdtop.
0:r0/0 SyncTarget Secondary/Primary Inconsistent/UpToDate
[=================>..] sync'ed: 94.3% (9084/140536)K
situation Connected Secondary/Primary Diskless/UpToDate
$ cat /proc/drbd
version: 8.4.10 (api:1/proto:86-101)
srcversion: 473968AD625BA317874A57E
0: cs:Connected ro:Secondary/Primary ds:Diskless/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Recreate the resource, as it seems it was not fully created, and bring the resource up:
$ drbdadm create-md r0
$ drbdadm up r0