Tuesday, March 12, 2013

Acunu - Eric Evans - Castle-enhanced Cassandra

Castle-enhanced Cassandra is a 20-minute talk from Berlin Buzzwords, given by Eric Evans of Acunu.

Castle is a new FLOSS storage backend for Linux, implemented as a loadable kernel module (LKM). It's designed to work with the SSTables that Cassandra uses for storage. It's a write-optimized storage system that's tuned to work with both rotational disks and SSDs.

It looks like a great project, and I'm glad something like this is open source. It would be interesting to see if other projects can take advantage of it. There are other NoSQL projects that, while not using SSTables, use similar append-only write strategies.
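For anyone who hasn't run into the append-only approach before, here is a minimal Python sketch of the general idea: writes go into an in-memory buffer, the buffer is flushed as an immutable sorted segment, and reads check the newest data first. This is not how Castle or Cassandra is implemented; the class name, the memtable_limit parameter, and the in-memory "segments" are all made up for illustration.

```python
import bisect


class AppendOnlyStore:
    """Toy write-optimized store: buffer writes, then flush sorted segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}          # recent writes, held in memory
        self.segments = []          # immutable sorted segments, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write never modifies an existing segment; it only touches the buffer.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Append the buffer as a new sorted, read-only segment.
        if self.memtable:
            self.segments.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the buffer, then segments from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None


store = AppendOnlyStore(memtable_limit=2)
store.put("a", 1)
store.put("b", 2)   # reaches the limit, so the buffer is flushed as a segment
store.put("a", 3)   # the newer value shadows the one in the older segment
```

The old value of "a" is still sitting in the first segment. A real system would clean that up later with some kind of compaction, which this sketch leaves out.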

Thanks Acunu!

  - Craig

Wednesday, March 6, 2013

Wired - Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon

Nice article on Wired.com that revolves around Mesos, an open source Apache Incubator project.

Mesos is cluster management software. It is designed to take your entire data center and virtualize its resources for applications running under Mesos.

You don't have to set up a dedicated cluster of machines for a single purpose like running Hadoop. Mesos allows you to set up multiple Hadoop cluster instances over the same set of hardware to make more efficient use of resources.

Google's system is referred to as Borg, which is fitting. It is being upgraded soon to Omega. The underlying goal is the same: to efficiently use computing resources for all of the required tasks.

Mesos is being supported by a number of engineers and companies including Twitter and some former Google engineers who worked on Borg.

  - Craig

Tuesday, February 5, 2013

Testing MapReduce with MRUnit By Mansoor Ashraf

http://m-mansur-ashraf.blogspot.com/2013/02/testing-mapreduce-with-mrunit.html

Here is the home page for MRUnit. http://mrunit.apache.org/

I ran across this blog today and thought it was really worthwhile. I don't know how many of you are into writing MR jobs, but MRUnit is an invaluable piece of kit to have available.

Many MR jobs are very time consuming, and even small jobs can be a problem given the time it takes to spin up a job and review the results. There may also not be resources available to run your job and test the results. On top of that, there can be additional expenses involved if you're running something like Elastic MapReduce on AWS.

MRUnit is also a good learning tool, giving you a chance to try out MR features or test new algorithms in a quick and safe environment. Unit tests in general let other people come up to speed quickly on how your code functions, provide usage documentation, and offer some protection against your code being broken.
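To give a feel for what this style of testing looks like, here is a small Python sketch of the idea: feed a mapper and a reducer a few fixed records in memory and assert on what comes out, with no cluster involved. This is not MRUnit's API (MRUnit is a Java library with its own driver classes); the function names and the tiny run_map_reduce helper are made up for illustration.

```python
from collections import defaultdict


def word_count_mapper(_key, line):
    """Emit (word, 1) for every whitespace-separated word in the line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(word, counts):
    """Emit a single (word, total) pair for all counts seen for that word."""
    yield word, sum(counts)


def run_map_reduce(mapper, reducer, records):
    """Run the mapper, group its output by key, then run the reducer, all in memory."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    results = []
    for out_key in sorted(grouped):
        results.extend(reducer(out_key, grouped[out_key]))
    return results


def test_word_count():
    # Test the mapper on its own first, then the whole pipeline.
    assert list(word_count_mapper(0, "the cat")) == [("the", 1), ("cat", 1)]
    records = [(0, "the cat"), (1, "The dog")]
    assert run_map_reduce(word_count_mapper, sum_reducer, records) == [
        ("cat", 1),
        ("dog", 1),
        ("the", 2),
    ]


test_word_count()
```

The whole thing runs in well under a second on a laptop, which is the point: you find out that your mapper forgot to lowercase its input before you pay for a cluster to tell you.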

Go ahead and give it a try.

  - Craig