Fun with Hadoop-Part 1

I previously deployed an OpenStack private cloud at home.

I have completed the series on my company’s official blog at http://pythian.com/blog/author/bhagat/

I have been playing with big data and Hadoop for some time and decided to make use of my cloud infrastructure. So I deployed a 7-node Hadoop cluster and did some fun stuff with it.

The first problem I faced, however, was computing power.

I had only one compute node, which was not enough to run the cluster. To overcome this, I added two more compute nodes to the cloud infrastructure. But that is for another series.

I want to talk about Hadoop, so let's get started.

I decided to use Hortonworks Hadoop with Ambari Server, as most of the documentation and resources available to me were for Hortonworks Hadoop.

I first played with it on the AWS free tier but quickly realized that it might be a costly option just for playing with and learning Hadoop.

That made me decide to deploy my own Hadoop cluster at home.

I am writing the follow-up post, so stay tuned for it.

At my company's request, I have completed the series on the official Pythian blog.

You can find it here once it is published.