Tuesday, April 8, 2014

Blue Raspberry Pi Cluster - 10 Nodes Running ElasticSearch, Hadoop, and Tomcat!

This is my latest project: a 10-node Raspberry Pi cluster! It was built for the Utah Big Mountain Data Conference, where it will be given away as a competition prize and promotional item from NosqlRevolution LLC, a NoSQL/ElasticSearch/Hadoop/Machine Learning consulting company that I founded.

The idea behind the box is to demonstrate many of the concepts being talked about during the conference. It also shows how an individual can work with and study these concepts in a very small form factor.

From the Application node, the USB and HDMI ports are extended to the outside of the box. A network port from the 16-port switch is also extended. You can plug in a keyboard, mouse, video, and network and then use the box much as you would a Linux PC.

The cluster is currently pulling Big Data related tweets directly into ElasticSearch via the Twitter river plugin. Periodically, some of the tweets are pulled into the Hadoop cluster for some basic processing, then written back to ElasticSearch for display. A Java REST application with an AngularJS front end provides a search interface for the tweets and displays the results and basic trending provided by Hadoop.
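
To give a feel for the "basic trending" step, here is a minimal sketch of a hashtag count written in the map/reduce style. It is plain Python, not the actual Hadoop job running on the cluster, and it assumes that trending means counting hashtags, which may not match what the box really computes. The function names and sample tweets are made up for illustration.

```python
import re
from collections import Counter

# Matches a hashtag and captures the word after the '#'.
HASHTAG = re.compile(r"#(\w+)")


def map_hashtags(tweet_text):
    """Map step: emit a (hashtag, 1) pair for each hashtag in one tweet."""
    for tag in HASHTAG.findall(tweet_text):
        yield tag.lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each hashtag."""
    totals = Counter()
    for tag, count in pairs:
        totals[tag] += count
    return totals


def trending(tweets, top_n=3):
    """Run map then reduce over all tweets and return the top hashtags."""
    pairs = (pair for text in tweets for pair in map_hashtags(text))
    return reduce_counts(pairs).most_common(top_n)


# Hypothetical sample data standing in for tweets read from ElasticSearch.
sample_tweets = [
    "Loving the #Hadoop talks at the conference #BigData",
    "#bigdata on a Raspberry Pi cluster!",
    "Indexing tweets with #ElasticSearch #BigData",
]

if __name__ == "__main__":
    print(trending(sample_tweets, top_n=1))
```

On the real cluster the map and reduce steps would run across the Hadoop data nodes, and the resulting counts would be written back to ElasticSearch for the front end to display.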

There you go. A single box that lets us run all of this software and demonstrate an end-to-end "Big Data" system.

Stats:

  • 10 Raspberry Pi Model B with 512MB RAM and a 32GB SD card each
  • Total cluster memory - 5GB
  • Total cluster storage - 320GB
  • 5 Hadoop nodes - 1 name node and 4 data nodes
  • 4 ElasticSearch nodes
  • 1 Application node running Tomcat
  • 1 16-port network switch
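
The totals above follow directly from the per-node figures. A quick check, with variable names chosen just for this sketch:

```python
# Per-node figures taken from the stats list.
NODE_COUNT = 10
RAM_MB_PER_NODE = 512
STORAGE_GB_PER_NODE = 32

# Node roles from the stats list.
roles = {"hadoop": 5, "elasticsearch": 4, "application": 1}

total_ram_gb = NODE_COUNT * RAM_MB_PER_NODE / 1024   # 5120 MB is 5 GB
total_storage_gb = NODE_COUNT * STORAGE_GB_PER_NODE  # 10 x 32 GB

if __name__ == "__main__":
    print(f"Total memory:  {total_ram_gb:g}GB")
    print(f"Total storage: {total_storage_gb}GB")
    print(f"Nodes by role: {sum(roles.values())}")
```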


Please join us on April 12, 2014 in Salt Lake City, UT. See www.uhug.org and utahbigmountaindata.com and utahcodecamp.com for more information.

  - Craig