Archive-It: Scaling Beyond a Billion Archival Webpages - Aaron Binns
- 1,657 views
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011 ...
See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011
Description of Archive-It, the Internet Archive's subscription, self-serve web archiving service, focused on the full-text search system. With nearly 200 partners and over 2000 collections the custom Lucene-based system handles 3+ million index updates per day across an index that totals over 1.3 billion documents. This session will give a detailed description of the architecture and implementation of the Archive-It search system; highlighting many of the challenges due to the scale as well as complex use cases.
- Total Views
- Views on SlideShare
- Embed Views