Running the balancer in Cloudera Hadoop
I recently started experimenting with Cloudera Manager 5.0.1 on a small, freshly set-up cluster. It has six DataNodes with a total capacity of 16.84 TB, one NameNode, and one more node for Cloudera Manager and other services. From the beginning, I wondered how to start the HDFS balancer.
To run the balancer, you need to add the Balancer role to a node in your cluster!
I will show you how to do that in a few simple steps (I assume you have a cluster with CDH >= 4.3.0 up and running with at least the HDFS service).
Log in to Cloudera Manager
Select the "HDFS" service of the cluster on which you want to enable the balancer
Select the "Instances" tab, tick the checkbox for the node that should get the Balancer role (I selected the NameNode host) and click "Add"
Add the "Balancer" role to this node
Click "Continue" to add the role
That's it! No need to restart, no need to change anything else!
Now you can run the balancer from the "Actions" menu of the HDFS service, in the top right corner.
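If you would rather script the rebalance than click through the UI, the stock Hadoop command `hdfs balancer -threshold <percent>` does the same kind of work. Below is a minimal Python sketch that builds that command and, by default, only prints it. The helper names `balancer_command` and `run_balancer` are my own inventions, not part of any Cloudera or Hadoop API. I also assume that the `hdfs` client is on the PATH of a host with the cluster's client configuration, that you run it as a user allowed to move blocks, and that the accepted threshold range is 1 to 100 (this can differ between Hadoop versions).

```python
import shlex
import subprocess


def balancer_command(threshold_percent=10):
    """Build the argument list for the stock `hdfs balancer` CLI.

    threshold_percent is the allowed deviation of each DataNode's
    utilization from the cluster average; 10 is the HDFS default.
    The 1..100 range check is an assumption, see the text above.
    """
    if not 1 <= threshold_percent <= 100:
        raise ValueError("threshold must be a percentage between 1 and 100")
    return ["hdfs", "balancer", "-threshold", str(threshold_percent)]


def run_balancer(threshold_percent=10, dry_run=True):
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = balancer_command(threshold_percent)
    if dry_run:
        # Only show what would be executed; nothing touches the cluster.
        print(shlex.join(cmd))
        return 0
    # Hypothetical real run: requires a configured HDFS client on this host.
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    run_balancer(threshold_percent=5)
```

Call `run_balancer(5, dry_run=False)` only once you are sure the host and user are set up for it; on a cluster managed by Cloudera Manager, the "Actions" menu remains the simpler route.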
For additional information refer to the official Adding Role Instances guide and the guide for Running the Balancer.