Integrating LVM with Hadoop and providing Elasticity to DataNode Storage

Monil Goyal
4 min read · Mar 14, 2021


This article walks through integrating LVM with Hadoop to provide elasticity to DataNode storage.

I am using RHEL 8 on an Oracle VirtualBox virtual machine, and I have already attached two secondary hard disks to it.

See the section "Attaching Virtual hard disk on Virtual OS" to learn how to create and attach a virtual hard disk.

Use the following command to confirm that your hard disks have been connected successfully.

fdisk -l 

Here, none of them is mounted anywhere.
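For the rest of this article I will assume the two attached disks show up as /dev/sdb and /dev/sdc; the device names on your VM may differ, so substitute your own. A quick check:

lsblk /dev/sdb /dev/sdc     →  To confirm both disks are visible and not mounted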

First, we have to create a physical volume.

Steps to create logical volumes and share them with the Hadoop cluster:

  1. Create physical volumes
  2. Create a volume group from the physical volumes
  3. Create a logical volume in the above-created volume group
  4. Format the logical volume so it can be mounted
  5. Mount the logical volume on a folder to share the storage with the Hadoop cluster
  6. Update the folder name in the hdfs-site.xml file on the DataNode
  7. Take storage from the volume group and increase the size of the logical volume on the fly
  8. Extend the volume group for more storage

Step 1: Create Physical volumes

pvcreate Disk_Name

Physical volumes for both disks have been created successfully.

To see the details of a physical volume, use the following command:

pvdisplay Disk_name
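As a concrete sketch, assuming the two disks are /dev/sdb and /dev/sdc as above:

pvcreate /dev/sdb /dev/sdc
pvdisplay /dev/sdb
pvdisplay /dev/sdc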

Step 2: Create a volume group from one or more physical volumes

vgcreate vg_name pv_name    →  To create a volume group
vgdisplay vg_name           →  To display details of the volume group
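For example, assuming a volume group named myvg (the name here is just my choice) built on the first disk:

vgcreate myvg /dev/sdb
vgdisplay myvg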

Step 3: Create a logical volume in the above-created volume group

lvcreate  --size size_valueG  --name  lv_name vg_name
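For example, to create a 5 GB logical volume named mylv1 (the size and name are assumptions that match the storage used later in this article) inside the volume group assumed above:

lvcreate --size 5G --name mylv1 myvg
lvdisplay /dev/myvg/mylv1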

To list all LVM resources, we can use the following commands:

pvs → To list all physical volumes
vgs → To list all volume groups
lvs → To list all logical volumes

Step 4: Format the logical volume so it can be mounted

mkfs.ext4 /dev/vg_name/lv_name

Step 5: Mount the logical volume on a folder to share the storage with the Hadoop cluster

  1. Create a folder on which to mount the logical volume
  2. Mount the logical volume on the folder (a command sketch follows below)
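Putting steps 4 and 5 together, a minimal sketch with the names assumed above (/dn1 is a hypothetical folder name; any path works):

mkfs.ext4 /dev/myvg/mylv1            # format the logical volume with ext4
mkdir /dn1                           # folder the DataNode will contribute to the cluster
mount /dev/myvg/mylv1 /dn1           # mount the logical volume on the folder
df -h /dn1                           # confirm the mount and its size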

Step 6: Update the folder name in the hdfs-site.xml file on the DataNode and start the DataNode

The hdfs-site.xml file:
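A minimal sketch of what it could contain, assuming the mount point /dn1 from the previous step (on Hadoop 1.x the property is dfs.data.dir; Hadoop 2.x and later use dfs.datanode.data.dir):

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>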

The core-site.xml file:
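A minimal sketch for the DataNode side, assuming the NameNode runs at 192.168.1.100 on port 9001 (replace these with your NameNode's IP and port); on Hadoop 2.x and later the preferred property name is fs.defaultFS:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.100:9001</value>
  </property>
</configuration>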

Starting the DataNode:

hadoop-daemon.sh start datanode
jps                       →  To check that the DataNode process is running
hadoop dfsadmin -report   →  To get cluster information

Our cluster now has almost 5 GB of shared storage.

Step 7: Take the storage from the volume group and increase the size of the logical volume on the fly.

lvextend --size +size_noG /dev/vg_name/mylv1

Now, we have to resize the filesystem so that it can use the extended space (this does not erase the existing data):

resize2fs /dev/vg_name/mylv1

Now, check the result.
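A concrete sketch with the names assumed earlier, growing mylv1 by 2 GB (any size that still fits in the volume group works):

lvextend --size +2G /dev/myvg/mylv1    # extend the logical volume online
resize2fs /dev/myvg/mylv1              # grow the ext4 filesystem into the new space
df -h /dn1                             # the DataNode folder now shows the larger size
hadoop dfsadmin -report                # HDFS reports the increased DataNode capacity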

Each hard disk is 8 GB in size, and if we require more storage we can extend the volume group.

Step 8: Extending Volume group for more Storage

vgextend  vg_name  another_pv_name

Now, our volume group has increased to 16 GB

Now, we can extend our logical volume further, resize its filesystem again, and then use it, as sketched below.
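A sketch of that flow with the assumed names, adding the second disk /dev/sdc to the volume group and then growing the logical volume into the new free space:

vgextend myvg /dev/sdc                 # the volume group grows to roughly 16 GB
lvextend -l +100%FREE /dev/myvg/mylv1  # give all remaining free space to the logical volume
resize2fs /dev/myvg/mylv1              # resize the filesystem so HDFS can use the extra space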

Thanks, everyone, for reading…

Keep Learning Keep Sharing !!!
