Monday, June 18, 2012

Introducing Distributed Execution and MapReduce Framework

I read through "Introducing Distributed Execution and MapReduce Framework" today. I had read about some of the plans for Infinispan a while ago and was quite interested to see how it was coming along.

If you're not familiar with Infinispan, it's a Java-based, in-memory data grid sponsored by JBoss/Red Hat. It's really a very cool project. What I think is most impressive about Infinispan is the ability to run Map/Reduce jobs across the data grid using the Callable interface. It looks like they have an extended DistributedCallable and a DistributedExecutorService to help with running tasks on the data grid.

They are modeling their Map/Reduce framework after Hadoop while providing what could be a more familiar Callable/ExecutorService model. It may lower the barrier to entry for developers who are more familiar with the standard Java threading model. I guess the only real drawback is that since this is an in-memory data grid, you're limited in how much data you can traverse. However, it should be very fast since you're memory-centric instead of file-centric as with Hadoop.
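
To make the Callable/ExecutorService idea a bit more concrete, here is a rough sketch that uses only the plain JDK, not Infinispan's actual API. Everything in it is an assumption made for illustration: the class name GridWordCountSketch, the mapTask and wordCount helpers, and the idea of treating each in-memory Map as the slice of the grid held by one node. A local thread pool stands in for the cluster. Each Callable does the "map" work against its own slice, and the caller "reduces" the partial results by merging them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: plain JDK only, no Infinispan classes are used.
class GridWordCountSketch {

    // "Map" phase: a Callable that counts words in one partition's values.
    // The partition is a stand-in for the entries stored on a single grid node.
    static Callable<Map<String, Integer>> mapTask(final Map<String, String> partition) {
        return () -> {
            Map<String, Integer> counts = new HashMap<>();
            for (String text : partition.values()) {
                for (String word : text.toLowerCase().split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            }
            return counts;
        };
    }

    // Submits one task per partition, then "reduces" by summing the partial counts.
    static Map<String, Integer> wordCount(List<Map<String, String>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (Map<String, String> partition : partitions) {
                futures.add(pool.submit(mapTask(partition)));
            }
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> future : futures) {
                for (Map.Entry<String, Integer> e : future.get().entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> nodeA = new HashMap<>();
        nodeA.put("doc1", "hello data grid");
        Map<String, String> nodeB = new HashMap<>();
        nodeB.put("doc2", "hello map reduce");

        List<Map<String, String>> partitions = new ArrayList<>();
        partitions.add(nodeA);
        partitions.add(nodeB);

        System.out.println(wordCount(partitions));
    }
}
```

As I understand it, the appeal of the grid version is that the executor ships the task to the nodes that own the data instead of pulling the data to the caller. The sketch above only imitates that shape on a single JVM.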

  - Craig