Configuring Hadoop Cluster Using Ansible PlayBook
Mar 20, 2021
In this article, I demonstrate how to create an Ansible playbook that launches a Hadoop cluster.
I used virtual machines on my local system for this practical; you can also use cloud instances.
Steps to follow during this practical
Step 1: Update Inventory In Controller Node
→ IP of NameNode is 192.168.43.70
→ IP of DataNode is 192.168.43.189
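The inventory on the controller node might look like the fragment below. The file path and group names are my assumption; whatever group names you choose must match the `hosts:` lines in the playbook, and the IPs are the ones listed above.

```ini
# /etc/ansible/hosts (assumed path) — group names must match the playbook's hosts: lines
[Namenode]
192.168.43.70

[datanode]
192.168.43.189
```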
Step 2: Create playBook
→ Ansible code to install the Java and Hadoop software
# installing required software
- hosts: Namenode, datanode
  tasks:
    - name: "copy hadoop software"
      copy:
        src: /root/hadoop-1.2.1-1.x86_64.rpm
        dest: /root/
      notify: "install hadoop software"
    - name: "copy java jdk software"
      copy:
        src: /root/jdk-8u171-linux-x64.rpm
        dest: /root/
      notify: "install jdk software"
  handlers:
    - name: "install hadoop software"
      shell: "rpm -ivh /root/hadoop-1.2.1-1.x86_64.rpm --force"
    - name: "install jdk software"
      shell: "rpm -ivh /root/jdk-8u171-linux-x64.rpm --force"
→ Ansible code to configure the NameNode
# configuring Namenode
- hosts: Namenode
  vars_prompt:
    - name: Namenode_dir
      prompt: Namenode directory?
      private: no
  tasks:
    - name: "creating directory"
      file:
        state: directory
        path: "{{ Namenode_dir }}"
      notify: "format namenode directory"
    - name: "running handlers"
      meta: flush_handlers
    - name: "configure core-site.xml"
      blockinfile:
        path: "/etc/hadoop/core-site.xml"
        insertafter: "<configuration>"
        block: |
          <property>
            <name>fs.default.name</name>
            <value>hdfs://0.0.0.0:9001</value>
          </property>
    - name: "configure hdfs-site.xml"
      blockinfile:
        path: "/etc/hadoop/hdfs-site.xml"
        insertafter: "<configuration>"
        block: |
          <property>
            <name>dfs.name.dir</name>
            <value>{{ Namenode_dir }}</value>
          </property>
    - name: "start namenode"
      shell: "hadoop-daemon.sh start namenode"
  handlers:
    - name: "format namenode directory"
      shell: "echo Y | hadoop namenode -format"
→ Ansible code to configure the DataNode
# Configuring datanode
- hosts: datanode
  vars_prompt:
    - name: Datanode_dir
      prompt: Datanode directory?
      private: no
  tasks:
    - name: "creating directory"
      file:
        state: directory
        path: "{{ Datanode_dir }}"
    - name: "configure core-site.xml"
      blockinfile:
        path: "/etc/hadoop/core-site.xml"
        insertafter: "<configuration>"
        block: |
          <property>
            <name>fs.default.name</name>
            <value>hdfs://{{ groups['Namenode'][0] }}:9001</value>
          </property>
    - name: "configure hdfs-site.xml"
      blockinfile:
        path: "/etc/hadoop/hdfs-site.xml"
        insertafter: "<configuration>"
        block: |
          <property>
            <name>dfs.data.dir</name>
            <value>{{ Datanode_dir }}</value>
          </property>
    - name: "start datanode"
      shell: "hadoop-daemon.sh start datanode"
Step 3: Running PlayBook in Controller Node
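Assuming all three plays above are saved in a single playbook file (the name `hadoop.yml` is my placeholder), it is run from the controller node like this; Ansible will then prompt for the two directory names:

```shell
# runs all three plays in order: install software, configure NameNode, configure DataNode
ansible-playbook hadoop.yml
```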
→ Give a directory name for the NameNode; this directory will hold the metadata for all the storage shared by the DataNode.
→ Give a directory name for the DataNode; this directory's storage is shared with the master (NameNode).
→ Check whether Namenode has started or not
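One way to verify this, assuming SSH access to the NameNode VM, is the `jps` tool that ships with the JDK:

```shell
# on the NameNode (192.168.43.70): the JVM list should include a NameNode process
jps
```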
→ Check whether datanode has started or not
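The same check works on the DataNode VM:

```shell
# on the DataNode (192.168.43.189): the JVM list should include a DataNode process
jps
```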
→ Check information about the cluster.
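The cluster report can be pulled from either node with Hadoop's admin tool (this is the Hadoop 1.x command):

```shell
# shows configured capacity, DFS used, and the number of live datanodes
hadoop dfsadmin -report
```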