Implementing a Twitter Firehose in CDH5

While trying to implement the tutorial from the series How-to: Analyze Twitter Data with Apache Hadoop, I stumbled upon two issues. CDH was installed using parcels, which was the recommended method, but the tutorial assumes that the installation was performed using packages. As a consequence, most of the libraries and programs are installed differently. Because of CDH5 […]

Cloudera Beta 5 Installation

In a previous article I explained how to create a simple 4-node Hadoop cluster using Cloudera 4. Cloudera has released a beta of version 5, so I decided to give it a try! Installation: the procedure remains unchanged, apart from the installer binary path. The following binary installer should be used: The command […]

How To Fix HBase Browser “Localhost:9090” Error

In Hue 2.5, on Cloudera Manager 4.8, the HBase Browser does not work out of the box. At first you only receive an error message: It took me some time, but I finally found in the documentation how to enable Thrift for the HBase Browser: Extract from Cloudera documentation: A Hue Service Enabling […]

Creating A Simple Hadoop Cluster With VirtualBox

I wanted to get familiar with the big data world, and decided to test Hadoop. Initially I tried the Cloudera QuickStart VM, Cloudera’s pre-built virtual machine with their full Hadoop suite pre-configured. It was a really interesting and informative experience. The QuickStart VM is fully functional and you can test […]