Friday, June 8, 2012

Big Search with Big Data Principles - Eric Pugh

This is a presentation given by Eric Pugh at Lucene Revolution 2012. There is a lot of good information in Big Search with Big Data Principles, though the audio is a bit hokey at the beginning.

Eric goes through some of his experiences working with a large vendor, including some of the obstacles encountered and the solutions implemented.

Got hundreds of millions of documents to search? DataImportHandler blowing up while indexing? Random thread errors thrown by Solr Cell during document extraction? Query performance collapsing? Then you're searching at Big Data scale. This talk will focus on the underlying principles of Big Data, and how to apply them to Solr. This talk isn't a deep dive into SolrCloud, though we'll talk about it. It also isn't meant to be a talk on traditional scaling of Solr. Instead we'll talk about how to apply principles of big data like "Bring the code to the data, not the data to the code" to Solr. We'll cover how to answer the question "How many servers will I need?" when your volume of data is exploding, and look at some examples of models for predicting server and data growth, and how to look back and see how good your models are! You'll leave this session armed with an understanding of why Big Data is the buzzword of the year, and how you can apply some of the principles to your own search environment.

  - Craig