HBaseCon 2012 | Solbase - Kyungseog Oh, Photobucket


Solbase is an exciting new open-source, real-time search engine being developed at Photobucket to service the over 30 million daily search requests Photobucket handles. Solbase replaces Lucene’s file system-based index with HBase. This allows the system to update in real-time and linearly scale to serve millions of daily search requests on a large dataset. This session will explore the architecture of Solbase as well as some of Lucene/Solr’s inherent issues we overcame. Finally, we’ll go over performance metrics of Solbase against production traffic.

  • Simply put, Solbase is an open-source, real-time search platform based on Lucene, Solr, and HBase.
  • Why are we so interested in search? Search is 40% of total page views at Photobucket, comprising around 30 million requests per day. We are only indexing 500 million images, because we only index the public ones with metadata, and because of the limitations of the previous architecture. That comes out to about 400 requests per second, sometimes more, sometimes less.
  • Why Solbase? Photobucket was using Lucene and Solr prior to Solbase and was handling the volume, so why create a new platform? Four main reasons: memory issues with the existing platform, indexing time, speed and capacity, and flexibility.
  • Lucene’s field cache for sorting and filtering became very problematic for us. Each field is ‘cached’ in an array allocated to the size of the maximum document ID. As an example, our current document space is 500 million documents. Each ‘sortable’ or ‘filterable’ field (e.g., document creation date) is an integer, so an array of 500 million integers, or 2 gigabytes, is created. Since it is held by a weak reference, as other memory usage in the system grows and approaches the VM’s maximum heap size, the array is garbage collected (which is expensive). But the next sorted search (and all of Photobucket’s searches are sorted) immediately reallocates and repopulates the array from the index. This backs up all requests and creates a vicious cycle of large allocations and garbage collections until the VM eventually melts down.
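The arithmetic behind that 2 GB example can be sketched in one line; the `FieldCacheMath` class and its method name are illustrative, not from the Solbase code:

```java
public class FieldCacheMath {
    // Lucene's field cache sizes its array by the maximum document ID,
    // not the live document count. The talk's example: 500 million docs
    // with one int (4 bytes) per doc is roughly 2 GB for a single field.
    static long cacheBytes(long maxDocId) {
        return maxDocId * Integer.BYTES;
    }
}
```

Note that this cost is paid per sortable/filterable field, which is why a handful of such fields over a 500-million-doc space exhausts the heap so quickly.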
  • Every 100 ms improvement in response time equates to approximately 1 extra page view per visit. That ends up being hundreds of millions of extra page views per month, so even a small speed improvement can translate into a large number of page views, and hence money. Lucene is designed and architected around indices stored in large files. Its caching mechanism (other than the query cache) relies on the operating system to keep the files cached in memory. When memory is low, the OS evicts the files from memory, creating potentially serious performance degradation. There are other solutions, such as caching filters, but none we found compelling for real time.
  • Example: search by user album, file size, type, image dimensions, geo, etc
  • Modify Lucene and Solr to use HBase as the source of index and document data. We wanted to continue using the pieces of Lucene and Solr that were working well for us, such as the analyzer, the parser, and the sharding architecture. First, we wanted a scalable solution, as we have 3 million daily uploaded media items that can potentially be indexed: an ever-growing data size. The Map/Reduce layer gives us the power to analyze indexed data. Lastly, and most importantly, HBase keys are already ordered. In other words, the data is already sorted for an inverted index.
  • The first thing we had to do was come up with a way of storing the indices in HBase in a performant way. As we mentioned before, a Bigtable-like database shares some conceptual similarity with an inverted index: mainly, it has a set of ordered keys that can be quickly scanned to access ordered data. Our first table is the term/document table. It consists of a set of rows where each row specifies a term/document combination and its accompanying metadata. Variable-length portions of the key which are strings (the field and the term) are delimited by the byte sequence 0xffff. All data is encoded in the key: no column families or columns. Through experimentation we concluded that this was the fastest and most efficient implementation. The other possibility was a key of <field>0xffff<term>, a column family of ‘document’ with a column qualifier of ‘document id’, and the metadata as the column data. This would be a more ‘canonical’ use of a Bigtable-like database, but it turns out that, due to the way the HBase client works, it was not as efficient at our scale. We also tried other hybrids of the two layouts, including chunking; nothing was as good as key-only. The encoded metadata contains the “normalization”, “offsets”, and “positions” needed for scoring and finding placements; it also contains sort/filter values. The TV table has one column family with no need for a qualifier, and stores inverted-index info for a given term. The Docs table has ‘field’ and ‘allTerms’ column families; fields are specific to a user’s index, with the qualifier used to differentiate user content.
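As a rough sketch of the key-only layout described above; the `TermDocKey` class name, the two-byte 0xffff delimiter width, and the 4-byte big-endian doc id are assumptions for illustration (the real Solbase encoding lives in the linked repos):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class TermDocKey {
    // Assumed delimiter: variable-length string portions of the key
    // (field, term) are separated by the byte sequence 0xff 0xff.
    static final byte[] DELIM = {(byte) 0xff, (byte) 0xff};

    // Build <field>0xffff<term>0xffff<doc id> as a single row key.
    // All metadata lives in the key itself: no column families, no columns.
    static byte[] build(String field, String term, int docId) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(field.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(DELIM);
        out.writeBytes(term.getBytes(StandardCharsets.UTF_8));
        out.writeBytes(DELIM);
        // Big-endian doc id, so numeric order matches HBase's
        // lexicographic (unsigned byte) key order.
        out.writeBytes(new byte[] {
            (byte) (docId >>> 24), (byte) (docId >>> 16),
            (byte) (docId >>> 8),  (byte) docId});
        return out.toByteArray();
    }
}
```

The big-endian doc id is the detail that makes the layout work: HBase sorts row keys as unsigned byte strings, so all rows for one term are contiguous and already ordered by doc id.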
  • Now that we have set up our basic data layout, we’d like to query against our inverted index and fetch data out of it. Since all keys are already lexicographically ordered, if we can build begin and end keys for a given term, we can efficiently fetch data out of the HBase table. So when a term is requested by a query, an HBase scan is created for that term, using a begin key of <field>0xffff<term>0xffff0x0000 and an end key of <field>0xffff<term>0xffff0xffff. This retrieves all of the documents and associated metadata, and the HBase client allows it to be done in a cursored way. The information is then stored in a single byte array, which is used as the cached value of the query.
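A minimal sketch of building those scan bounds, assuming the same illustrative two-byte delimiter and 4-byte doc id; in practice the resulting pair would be handed to an HBase `Scan` as its start and stop rows:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TermScanBounds {
    static final byte[] DELIM = {(byte) 0xff, (byte) 0xff};

    // <field>0xffff<term>0xffff followed by a 4-byte doc-id bound,
    // filled with 0x00 for the begin key or 0xff for the end key.
    static byte[] bound(String field, String term, byte fillByte) {
        byte[] f = field.getBytes(StandardCharsets.UTF_8);
        byte[] t = term.getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[f.length + t.length + DELIM.length * 2 + 4];
        int p = 0;
        System.arraycopy(f, 0, key, p, f.length);         p += f.length;
        System.arraycopy(DELIM, 0, key, p, DELIM.length); p += DELIM.length;
        System.arraycopy(t, 0, key, p, t.length);         p += t.length;
        System.arraycopy(DELIM, 0, key, p, DELIM.length); p += DELIM.length;
        Arrays.fill(key, p, p + 4, fillByte);
        return key;
    }

    static byte[] startKey(String field, String term) {
        return bound(field, term, (byte) 0x00);
    }

    static byte[] endKey(String field, String term) {
        return bound(field, term, (byte) 0xff);
    }
}
```

Because every row for the term sorts between these two bounds, one range scan streams the whole posting list in doc-id order, with no seeks into separate index files.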
  • In Solr, each instance has to have a physical index file on the local file system. Adding a new shard in Solr means you have to split the existing indices in half and redistribute them across all of the sharding machines. In Solbase, indices are accessed via a central database, HBase, so we can easily divide the data set into smaller chunks using start and end keys. This allows us to do dynamic sharding, so to speak. For example, if there were 10 shards and the total number of docs in our indices were 100, the first shard would be responsible for documents 1-10, the second for 11-20, the third for 21-30, and so on. We are going to run 16 shards spread among 4 servers, with 4 clusters.
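The doc-id range split can be sketched as follows; the `ShardRanges` class and method are hypothetical, simply mirroring the 10-shards/100-docs example above:

```java
public class ShardRanges {
    // Split [1, totalDocs] into numShards contiguous doc-id ranges.
    // Each shard then scans only its own key range out of HBase, so
    // "adding a shard" is just recomputing ranges: no files to split.
    static int[][] ranges(int totalDocs, int numShards) {
        int[][] out = new int[numShards][2];
        int per = totalDocs / numShards;
        for (int i = 0; i < numShards; i++) {
            out[i][0] = i * per + 1;                                  // first doc id
            out[i][1] = (i == numShards - 1) ? totalDocs : (i + 1) * per; // last doc id
        }
        return out;
    }
}
```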
  • The encoded metadata already uses a single byte to embed normalization, positions, and offsets. We had 5 extra bits left and turned them into flags indicating whether a term document has sort or filter values. Each term/document data object got slightly bigger, but we have now solved Lucene’s memory constraints when using sort/filter queries. Plus it is very fast: there is no lookup of a sort value for each document, since the sorts and filters are embedded in the index.
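A sketch of that flag packing; the bit positions here are hypothetical, since the talk only says that 5 spare bits in the metadata byte were repurposed as sort/filter indicators:

```java
public class MetadataFlags {
    // Assumed bit positions within the single metadata header byte.
    static final int HAS_SORT   = 1 << 3;
    static final int HAS_FILTER = 1 << 4;

    // Set the sort/filter presence flags without disturbing the bits
    // already used for normalization, positions, and offsets.
    static byte setFlags(byte header, boolean hasSort, boolean hasFilter) {
        int h = header & 0xff;
        if (hasSort)   h |= HAS_SORT;
        if (hasFilter) h |= HAS_FILTER;
        return (byte) h;
    }

    static boolean hasSort(byte header)   { return (header & HAS_SORT) != 0; }
    static boolean hasFilter(byte header) { return (header & HAS_FILTER) != 0; }
}
```

Reading a flag is a single mask against a byte that is already in the posting data, which is why sorting and filtering need no per-document lookup into a separate field cache.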
  • Skip if everyone knows: a brief explanation of the Map/Reduce framework, a distributed-processing architecture pioneered by Google. A large data set can easily be indexed initially using Map/Reduce: HDFS distributes the data, and the Map/Reduce framework processes it in a distributed manner. For real time, new data can be sent to Solbase via the Solr update API. In our use case, we have a cron job that fetches delta data every so often, preprocesses it, and sends it over to Solbase. Solbase then has to update the cache object properly and finally store the data into the HBase index tables.
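The map step of that initial indexing can be sketched as a plain function, leaving out the Hadoop plumbing; the naive tokenizer here is a stand-in for Lucene's analyzer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class IndexMapper {
    // Map step of initial indexing: for each document, emit one
    // (term, docId) pair per token. The reduce step would group the
    // pairs by term and write term/document rows into HBase.
    static List<Map.Entry<String, Integer>> map(int docId, String text) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String tok : text.toLowerCase().split("\\W+")) {
            if (!tok.isEmpty()) out.add(Map.entry(tok, docId));
        }
        return out;
    }
}
```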
  • From running a true production load test, we got the following results. The largest term, ‘me’, has 14 million documents out of a total document space of 500 million and takes 13 seconds to load from HBase; from cache, ‘me’ has a response time of 500 ms. Average query time for native Solr/Lucene (including Squid): 169 ms. Average query time for Solbase: 109 ms, a 35% decrease. With a full production load, we are able to process approximately 300 real-time updates per second.
  • We have forked off of Solr and Lucene (off of Lucene 2.9.3) to overcome some of their inherent issues. Per data center, we have a dedicated HBase/Solbase cluster; replication between data centers syncs the data.

Transcript

  • 1. Kyungseog Oh, May 22, 2012, HBaseCon
  • 2. What is Solbase? Solbase is an open-source, real-time search platform based on Lucene, Solr, and HBase, built at Photobucket
  • 3. Search at Photobucket • 40% of total page views • 500 million ‘docs’ or images • 30 million search requests per day • 120 gigabytes in size • Previous infrastructure built on Solr/Lucene
  • 4. Why Solbase? • Memory issues • Indexing time • Speed • Capacity and scalability
  • 5. Lucene Memory Issues • Field cache: sortable and filterable fields stored in a Java array the size of the maximum document number • Example: every doc is sorted by an integer field; for 500 million documents the array is 2 GB in size
  • 6. Indexing Time • Solr indexing took 15-16 hours to rebuild the indices • We wanted to provide near real-time updates
  • 7. Speed • Every 100 ms improvement in response time equates to approximately 1 extra page view per visit • Can end up being hundreds of millions of extra page views per month
  • 8. Capacity & Scalability • Impractical to add a significant number of new docs and data (geo, EXIF, etc.) • Difficult to divide the data set to create a brand new shard • Fault tolerance is not built in
  • 9. The Concept: modify Lucene and Solr to use HBase as the source of index and document data
  • 10. Solbase Tables (term/document tables), created in the HBase shell:
    create 'TV', {NAME => 'd', COMPRESSION => 'SNAPPY', VERSIONS => 1, REPLICATION_SCOPE => 1}
    create 'Docs', {NAME => 'field', COMPRESSION => 'SNAPPY', VERSIONS => 1, REPLICATION_SCOPE => 1},
                   {NAME => 'allTerms', COMPRESSION => 'SNAPPY', VERSIONS => 1, REPLICATION_SCOPE => 1},
                   {NAME => 'timestamp', COMPRESSION => 'SNAPPY', VERSIONS => 1, REPLICATION_SCOPE => 1}
  • 11. Query Methodology: term queries are HBase range scans. Start key: <field><delimiter><term><delimiter><begin doc id: 0x00000000>. End key: <field><delimiter><term><delimiter><end doc id: 0xffffffff>
  • 12. Solbase: Distributed Processing. Solr sharding: a master fronting shards, each shard backed by its own local index file. Solbase sharding: a master fronting shards, all backed by a shared HBase cluster
  • 13. Solbase: Sorts & Filters • Extra bits in the encoded metadata • Solved Lucene’s sort/filter field-cache issue
  • 14. Solbase: Indexing Process • Initial indexing: leveraging the Map/Reduce framework • Real-time indexing: using Solr’s update API
  • 15. Results • Term ‘me’ takes 13 seconds to load from HBase, 500 ms from cache (‘me’ has ~14M docs, the largest term in our indices) • Most terms not in cache take < 200 ms • Most cached terms take < 20 ms • Average query time for native Solr/Lucene: 169 ms • Average query time for Solbase: 109 ms, a 35% decrease • ~300 real-time updates per second
  • 16. HBase Configuration/Limitations • Compatibility issue with the latest Solr • CDH3 latest build • HBase/Solbase clusters per data center
  • 17. Repos • https://github.com/Photobucket/Solbase • https://github.com/Photobucket/Solbase-Lucene • https://github.com/Photobucket/Solbase-Solr
  • 18. Q&A