Dual-primary DRBD on a Cluster File System using OCFS2 with Encrypted Disk using LUKS

When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload.

In this guide, I’ll be using two Ubuntu 14.04 instances on Amazon Web Services (free tier) with separate public and private networks, each with two disks: one for the system and one for our storage. The instances are called node01 and node02 to make things easier to follow.

For the purposes of this guide, we assume the following environment:

  • Our two DRBD nodes each have a bridge interface br0 on eth1, with IP addresses 10.0.0.1/24 and 10.0.0.2/24 assigned to it, respectively.
  • Our disks are /dev/sda1 (system) and /dev/sdb1 (storage) on both nodes.
  • No other services are using TCP ports 7788 through 7799 on either host.
  • The local firewall configuration allows both inbound and outbound TCP connections between the hosts over these ports.
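
A quick way to sanity-check these assumptions before going further (a rough sketch; the interface name, addresses and ports are the ones assumed above, adjust if yours differ):

# ip addr show br0            # expect 10.0.0.1/24 on node01 and 10.0.0.2/24 on node02
# ping -c 3 10.0.0.2          # from node01; use 10.0.0.1 when checking from node02
# ss -tln | grep ':77'        # nothing should already be listening on ports 7788-7799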

First, we need to install and set up DRBD with two primary nodes and OCFS2 as our cluster file system.

# apt-get install drbd8-utils ocfs2-tools

Next, we create a DRBD resource suitable for OCFS2. Since OCFS2 is a shared cluster file system expecting concurrent read/write access from all cluster nodes, any DRBD resource used to store an OCFS2 filesystem must be configured in dual-primary mode.

We will name our DRBD resource “datastore”, defined in the file “datastore.res” under /etc/drbd.d/. You may choose any name you want.

# touch /etc/drbd.d/datastore.res
# vi /etc/drbd.d/datastore.res
resource datastore {
        # logical shared/replicated device for storage
        device  /dev/drbd0;
        # meta information
        meta-disk internal;
        # disk address
        disk /dev/sdb1;

        startup { become-primary-on both; }
        
        net {
                # allow-two-primaries yes;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
        }
        # define node 1 information
        on node01 { address 10.0.0.1:7789; }
        # define node 2 information
        on node02 { address 10.0.0.2:7789; }
}
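
You can ask drbdadm to parse the resource back out; if the file has a syntax error (a missing brace, for example), this will complain instead of printing the configuration:

# drbdadm dump datastore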

It is not recommended to enable the allow-two-primaries option during the initial configuration, which is why it is commented out above. Uncomment it only after the initial resource synchronization has completed.

The net section is telling DRBD to do the following:

  • allow-two-primaries – Generally, DRBD has a primary and a secondary node. In this case, we will allow both nodes to have the filesystem mounted at the same time. Do this only with a clustered filesystem. If you do this with a non-clustered filesystem like ext2/ext3/ext4 or reiserfs, you will have data corruption. Seriously!
  • after-sb-0pri discard-zero-changes – DRBD detected a split-brain scenario, but none of the nodes think they’re a primary. DRBD will take the newest modifications and apply them to the node that didn’t have any changes.
  • after-sb-1pri discard-secondary – DRBD detected a split-brain scenario, but one node is the primary and the other is the secondary. In this case, DRBD will decide that the secondary node is the victim and it will sync data from the primary to the secondary automatically.
  • after-sb-2pri disconnect – DRBD detected a split-brain scenario, but it can’t figure out which node has the right data. It tries to protect the consistency of both nodes by disconnecting the DRBD volume entirely. You’ll have to tell DRBD which node has the valid data in order to reconnect the volume (see the recovery sketch after this list). Use extreme caution if you find yourself in this scenario.
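
If you do end up in the disconnected split-brain state, recovery is manual. A minimal sketch, assuming node02 holds the changes you are willing to discard (make very sure which copy of the data you want to keep, and unmount any OCFS2 filesystem on the victim first):

On node02 (the split-brain victim):

# drbdadm disconnect datastore
# drbdadm secondary datastore
# drbdadm connect --discard-my-data datastore

On node01 (the survivor), reconnect it if it shows StandAlone:

# drbdadm connect datastore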

Next, modify /etc/drbd.d/global_common.conf. This file is configured so that replication is as safe as possible by using protocol C, which is slower but safer: a write is only confirmed once it has reached both the local and the remote disk.

global {
        usage-count yes;
        minor-count 16;
}
common {
        protocol C;
}

Prepare the partition used by DRBD for replication on both nodes. This step is optional and only needed if the disk or volume was already in use before this setup (the existing filesystem is checked and resized so that DRBD’s internal metadata can be placed at the end of the device).

# umount /dev/sdb1
# e2fsck -f /dev/sdb1
# resize2fs /dev/sdb1 <size-of-disk>

Now we can create the volume and start DRBD:

# drbdadm create-md datastore
# /etc/init.d/drbd start
# cat /proc/drbd

Start the initial resource synchronization by promoting one DRBD node to primary. The synchronization may take hours, depending on the size of the storage.

# drbdsetup /dev/drbd0 primary -o
# watch -n1 cat /proc/drbd
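
If the initial sync is crawling, DRBD 8.4 lets you raise the resync rate while it runs; this is only a suggestion, and the value below is an example to adapt to what your disks and network can sustain:

# drbdadm disk-options --c-plan-ahead=0 --resync-rate=100M datastore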

After synchronization is done, edit /etc/drbd.d/datastore.res on both nodes and uncomment the following parameter to enable primary/primary storage.

allow-two-primaries yes;

Next, restart the DRBD service on both nodes and check /proc/drbd to verify that the status is now Primary/Primary.
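
A minimal way to apply the change, assuming the init script we used to start DRBD earlier; if one node still shows Secondary afterwards, promote it by hand:

# /etc/init.d/drbd restart
# drbdadm primary datastore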

# cat /proc/drbd

version: 8.4.3 (api:1/proto:86-101)

built-in
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:1093794 nr:829116 dw:2248869 dr:2211744 al:276 bm:414 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Set up disk encryption using cryptsetup. Install the package on both nodes, run luksFormat on one node only (the LUKS header is written to /dev/drbd0 and replicated to the peer), then open the encrypted device on both nodes with the same passphrase.

# apt-get install cryptsetup
# cryptsetup luksFormat /dev/drbd0
# cryptsetup luksOpen /dev/drbd0 drbd0
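
If you would rather not type the passphrase interactively on every node, a key file can be added as an additional LUKS key. This is only a sketch, and the path /root/drbd0.key is an arbitrary example; the file must be copied to (and protected on) both nodes, since each node opens the encrypted device itself:

# dd if=/dev/urandom of=/root/drbd0.key bs=64 count=1
# chmod 600 /root/drbd0.key
# cryptsetup luksAddKey /dev/drbd0 /root/drbd0.key
# cryptsetup luksOpen --key-file /root/drbd0.key /dev/drbd0 drbd0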

Create the OCFS2 filesystem, on one node only (the device is replicated, so it will be visible on both).

# mkfs.ocfs2 -N 2 -L ocfs2drbd /dev/mapper/drbd0
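
Here, -N 2 allocates two node slots (one per node that may mount the filesystem concurrently) and -L sets the volume label. If the cluster ever grows, the slot count can be raised later with tunefs.ocfs2, for example:

# tunefs.ocfs2 -N 4 /dev/mapper/drbd0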

The Pacemaker packages on Ubuntu 14.04 and 16.04 do not support a Pacemaker-managed Distributed Lock Manager (DLM) for this setup, so we will use OCFS2’s own O2CB stack and DLM instead. OCFS2 uses a central configuration file, /etc/ocfs2/cluster.conf, which must be identical on both nodes.

node:
    ip_port = 7777
    ip_address = 10.0.0.1
    number = 0
    name = <resolvable hostname of node 1>
    cluster = ocfs2
node:
    ip_port = 7777
    ip_address = 10.0.0.2
    number = 1
    name = <resolvable hostname of node 2>
    cluster = ocfs2
cluster:
    node_count = 2
    name = ocfs2

Configure the O2CB driver, which is responsible for the DLM. Afterwards, we will mount the file system on /mnt/storage to verify that it is working; create the folder if it does not exist.

# dpkg-reconfigure ocfs2-tools
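
The reconfigure dialog writes its answers to /etc/default/o2cb; on both nodes it should end up looking roughly like this (the threshold and timeout values shown are the usual defaults, tune them for your environment):

O2CB_ENABLED=true
O2CB_BOOTCLUSTER=ocfs2
O2CB_HEARTBEAT_THRESHOLD=31
O2CB_IDLE_TIMEOUT_MS=30000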

On both nodes, create a mount point, add an /etc/fstab entry, and mount the volume. Because the LUKS container has to be opened with cryptsetup luksOpen before /dev/mapper/drbd0 exists, the entry uses noauto and the volume is mounted once the encrypted device is available.

# mkdir -p /mnt/storage
# echo "/dev/mapper/drbd0  /mnt/storage  ocfs2  noauto,noatime  0 0" >> /etc/fstab
# mount.ocfs2 /dev/mapper/drbd0  /mnt/storage

On both nodes, start OCFS2 and O2CB and enable them at boot:

# systemctl enable ocfs2 && systemctl enable o2cb
# systemctl start ocfs2 && systemctl start o2cb
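
On Ubuntu 14.04, which still uses upstart/sysvinit rather than systemd, the equivalent would be roughly:

# update-rc.d o2cb defaults
# update-rc.d ocfs2 defaults
# service o2cb start && service ocfs2 start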

At this point, it should be all done. If you want to test OCFS2, copy a file into your /mnt/storage mount on one node and check that it appears on the other node. If you remove it, it should be gone instantly on both nodes. This is a great opportunity to test reboots of both machines to ensure that everything comes up properly at boot time.
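
A quick test along those lines (the file name is arbitrary):

node01# echo "replication test" > /mnt/storage/testfile
node02# cat /mnt/storage/testfile
node02# rm /mnt/storage/testfile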
