How to install a Hadoop cluster (3-node cluster) on VMware Player.

By Tzu-Cheng Chuang 4-25-2014

Purpose: Easily set up a Hadoop environment for testing and evaluation

This demonstration has been tested with the following software versions: CentOS 6.5 x86_64, Ambari 1.5.1, VMware Player 6.0.2 (the Ambari Web UI will be used to install HDFS, YARN+MapReduce2, Pig, Hive, HBase, Oozie, ZooKeeper, etc.)


1. Download CentOS-6.5-x86_64-bin-DVD1.iso Image from CentOS website.

2. Install CentOS 6.5 in a virtual machine named hadoop-1, with user name user1, password password, disk size 20GB, memory 1048MB, and 1 processor.

3. Repeat the same installation to create the hadoop-2 and hadoop-3 guest OSes.

4. (Optional) Install the "Development Tools" group (GNU C/C++ compilers and related tools) on all 3 hosts
    [root@localhost jasontgi]# yum groupinstall "Development Tools"

5. Change the hostname and domain name on all 3 hosts. On each host, run this command with the corresponding hostname (hadoop-1, hadoop-2, or hadoop-3) to set the hostname immediately, without rebooting the box:
    [root@localhost jasontgi]# hostname hadoop-1
Then, on each host, make the change permanent by editing the /etc/sysconfig/network file:
    [root@localhost jasontgi]# vim /etc/sysconfig/network
Set the HOSTNAME value to the corresponding hostname.
Save and close the file.

Check the IP address
    [root@localhost jasontgi]# ifconfig -a
Edit the hosts file by modifying /etc/hosts
On every host, add a line for each node mapping its IP address to its hostname, so that hadoop-1, hadoop-2, and hadoop-3 can all resolve one another by name.
Restart the CentOS networking and other services (if any)
    [root@localhost jasontgi]# service network restart
Log out and log back in, then verify the hostname
    [root@hadoop-1 jasontgi]# hostname -f
    [root@hadoop-1 jasontgi]# dnsdomainname
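The /etc/hosts entries described above can be generated as a small fragment first and reviewed before applying. The 192.168.1.x addresses below are placeholders, not values from this setup; substitute the addresses reported by ifconfig -a on each guest:

```shell
#!/bin/sh
# Sketch: build the /etc/hosts fragment for the three nodes.
# The 192.168.1.x addresses are placeholders; substitute the
# addresses reported by `ifconfig -a` on each guest OS.
cat <<'EOF' > hosts.fragment
192.168.1.11   hadoop-1
192.168.1.12   hadoop-2
192.168.1.13   hadoop-3
EOF
# As root on each host, append it to /etc/hosts:
#   cat hosts.fragment >> /etc/hosts
cat hosts.fragment
```

Keeping the fragment identical on all three hosts avoids the name-resolution mismatches that commonly break Hadoop daemon registration later.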
6. Install the Java 7 JDK
(1) Install java-1.7.0-openjdk-devel on each host
    [root@hadoop-1 jasontgi]# yum install java-1.7.0-openjdk-devel.x86_64
(2) Set environment variables by modifying the ~/.bashrc file; add the following two lines at the end of the file
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
    export PATH=$PATH:$JAVA_HOME/bin
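Appending the exports blindly will duplicate them if the step is re-run on a host. A minimal idempotent sketch (it writes to a local bashrc.demo file here for safety; point PROFILE at ~/.bashrc on the real hosts):

```shell
#!/bin/sh
# Sketch: add each export line to a profile file only if it is
# not already there. PROFILE is a local demo file for this sketch;
# set it to ~/.bashrc on the actual hosts.
PROFILE="${PROFILE:-bashrc.demo}"
add_line() {
  # -x: match the whole line, -F: treat the pattern literally
  grep -qxF "$1" "$PROFILE" 2>/dev/null || echo "$1" >> "$PROFILE"
}
add_line 'export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64'
add_line 'export PATH=$PATH:$JAVA_HOME/bin'
add_line 'export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64'  # repeat is a no-op
```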

7. Configure the SSH server so that ssh from hadoop-1 to hadoop-2 and hadoop-3 does not need a passphrase
(1) Generate RSA pair key on each host
    [root@hadoop-1 ~]# ssh-keygen -t rsa
(2) Enable SSH access to local machine on each host
    [root@hadoop-1 ~]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(3) Enable SSH access from hadoop-1 to hadoop-2, hadoop-3 machine on each host
    [root@hadoop-1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-2
    [root@hadoop-1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-3
    [root@hadoop-2 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-1
    [root@hadoop-3 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-1
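The four ssh-copy-id invocations above can be generated in one loop. This sketch only prints the commands into a file so the plan can be reviewed first; run each printed line on the host named at the start of it:

```shell
#!/bin/sh
# Sketch: list the ssh-copy-id commands for the hadoop-1 <-> worker
# key exchange. The commands are printed, not executed, so this is
# safe to run anywhere.
{
  for peer in hadoop-2 hadoop-3; do
    echo "on hadoop-1: ssh-copy-id -i ~/.ssh/id_rsa.pub $peer"
    echo "on $peer: ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-1"
  done
} > keyplan.txt
cat keyplan.txt
```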

8. Disable IPv6 by adding the following two lines at the end of the /etc/sysctl.conf file
    #disable ipv6
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
To disable in the running system:
    [root@hadoop-1 ~]# sysctl -w net.ipv6.conf.all.disable_ipv6=1
    [root@hadoop-1 ~]# sysctl -w net.ipv6.conf.default.disable_ipv6=1

9. Disable the firewall (this makes the Hadoop installation easier). On each host, run the following commands
    [root@hadoop-1 ~]# service iptables save
    [root@hadoop-1 ~]# service iptables stop
    [root@hadoop-1 ~]# chkconfig iptables off
10. Disable SELinux
    [root@hadoop-1 ~]# setenforce 0
    [root@hadoop-1 ~]# ssh hadoop-2 "setenforce 0"
    [root@hadoop-1 ~]# ssh hadoop-3 "setenforce 0"
Note that setenforce 0 only lasts until the next reboot; to keep SELinux disabled permanently, also set SELINUX=disabled in /etc/selinux/config on each host.
11. Update OpenSSL (the default OpenSSL in CentOS 6.5 has issues). On each host, run the following to update OpenSSL
    [root@hadoop-1 ~]# yum update openssl
12. Enable ntpd on each host
    [root@hadoop-1 ~]# chkconfig ntpd on
    [root@hadoop-1 ~]# ssh hadoop-2 "chkconfig ntpd on"
    [root@hadoop-1 ~]# ssh hadoop-3 "chkconfig ntpd on"
    [root@hadoop-1 ~]# service ntpd restart
    [root@hadoop-1 ~]# ssh hadoop-2 "service ntpd restart"
    [root@hadoop-1 ~]# ssh hadoop-3 "service ntpd restart"
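The ntpd commands above, like several of the preceding steps, repeat one command across the three nodes. A hypothetical helper (not part of the original setup) can factor that out, relying on the passphrase-less ssh configured earlier; this sketch is a dry run that only prints the fan-out:

```shell
#!/bin/sh
# Sketch: show how one command would be fanned out to all three
# nodes over ssh. Dry run only: the ssh commands are printed,
# not executed, so this is safe to try anywhere.
run_everywhere() {
  for host in hadoop-1 hadoop-2 hadoop-3; do
    echo "ssh $host \"$*\""
  done
}
{
  run_everywhere chkconfig ntpd on
  run_everywhere service ntpd restart
} > ntp_plan.txt
cat ntp_plan.txt
```

Dropping the `echo` (and running the loop from hadoop-1) would execute the commands for real.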
13. Install Ambari
(1) Download the Ambari repository file to hadoop-1 (refer to the Ambari installation documentation for the repository URL for your platform)
    [root@hadoop-1 ~]# cd /etc/yum.repos.d/
    [root@hadoop-1 yum.repos.d]# wget <Ambari repository URL>

(2) Install, Setup, and Start Ambari server
    [root@hadoop-1 ~]# yum install ambari-server
    [root@hadoop-1 ~]# ambari-server setup
    [root@hadoop-1 ~]# ambari-server start

(3) Deploy the Hadoop cluster using the Ambari Web UI
Open a web browser on any host and go to the Ambari server on hadoop-1 (by default it listens on port 8080, i.e. http://hadoop-1:8080).
Log in with username admin and password admin, and follow the on-screen instructions.

The installation takes around 1 to 1.5 hours.

14. After the installation, make sure PostgreSQL starts on boot
    [root@hadoop-1 ~]# chkconfig postgresql on
    [root@hadoop-1 ~]# service postgresql start
15. Log on to hadoop-1 and list the files in the HDFS /user directory
    [user1@hadoop-1 ~]$ hadoop fs -ls /user
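The listing from hadoop fs -ls follows an ls -l-style format with the path in the last column. A quick sketch for pulling out just the paths, using illustrative sample output since no live cluster is assumed here:

```shell
#!/bin/sh
# Sketch: extract the path column from sample `hadoop fs -ls /user`
# output. The listing below is illustrative, not real cluster output.
cat <<'EOF' > ls.sample
Found 2 items
drwxr-xr-x   - user1     hdfs  0 2014-04-25 10:00 /user/user1
drwxrwx---   - ambari-qa hdfs  0 2014-04-25 09:58 /user/ambari-qa
EOF
# Skip the "Found N items" header, print the last field of each line.
awk 'NR > 1 { print $NF }' ls.sample > paths.txt
cat paths.txt
```

On a real cluster you would pipe the live listing instead: `hadoop fs -ls /user | awk 'NR > 1 { print $NF }'`.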