Using Joomla, Zoo & SOLR to power Asia's Largest Auction House



This presentation is a walk-through of our adventures in integrating various aspects of Joomla, third-party (3PD) extensions & SOLR.

The highlight of this presentation is the use of Apache SOLR to create a responsive, filtered, sortable, searchable 'image grid' with continuous pagination. This behaves a lot like Google's image search, where you can keep scrolling to get more results.

Published in: Technology


  1. SOLR + Joomla powering the catalog of Asia's Largest Auction House
  2. Parth Lawate: CEO, Techjoomla & Tekdi Web Solutions. Open Source Software Architect; Joomla Strategic Marketing Manager; JUG Pune; Joomla Day India. Cook, bookworm, gardener, trekking & hiking, entrepreneur, Joomla freak. @parthlawate, @techjoomla
  3. Tekdi Technologies Pvt. Ltd. @tekdinet – iOS apps, Android, CRM, Magento, E-Learning, Ecommerce, Joomla, Custom Apps, CMS, HTML5, Social Networks
  4. Techjoomla. For all things Joomla. @techjoomla – jGive, People Suggest, jomLike, JTicketing, J!Bolo Broadcast, Invitex, Email Beautifier, SocialAds, J!MailAlerts, REST API, Payments API, Social API, Quick2Cart
  5. Quick Facts ● The client is a major art & auction house in India & one of the largest in Asia ● Data collated over a period of 20+ years ● Over 500,000 records with complex interrelations
  6. Quick Facts ● Complex data structure with 100+ parameters/fields on each data type ● Graphics-heavy – all artifacts have high-resolution images
  7. The Technical Challenge ● Over 100,000 records in the first phase of migration ● Extremely complex data relations ● Complex data types, with high record-parameter volume & complexity
  8. The Human Challenge ● Years of managing their knowledge base in MS Excel before we came on board ● Working with the client's research & archivist team, who had almost no knowledge of web technologies ● Getting a team of traditional archivists to adopt a modern system
  9. The Solution ● The data complexity & relations called for a CCK (Content Construction Kit) ● We chose Zoo to serve as the base for all the customisations to come ● Custom apps built on this architecture
  10. Term Glossary ● Classification – first-level categorisation, e.g. ANTQ ● Sub-classification – e.g. arm ● Artifact – an actual record ● Masterlists – records that can be used as associated records, or as a link between 2 or more records
  11. Starting Small ● 9 Classifications ● 50+ Sub-classifications ● 50,000 Artifacts
  12. The Work with Zoo ● Custom field types ● Custom association plugins to create records from relations ● Custom views
  13. The Early Search ● Custom extension for parametric search ● One table per classification ● CRON-based indexer ● MySQL-powered, with natural-language support ● Using MySQL's SOUNDEX() for the 'did you mean' feature
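A 'did you mean' feature of this kind compares phonetic codes rather than spellings. The sketch below is a plain-Python equivalent of the MySQL SOUNDEX() comparison described above; the vocabulary is illustrative, not the client's actual data.

```python
def soundex(word: str) -> str:
    """Simplified Soundex, the same idea as MySQL's SOUNDEX(): keep the first
    letter, encode following consonants as digits, collapse runs, pad to 4."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    result, prev = word[0].upper(), codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":          # h/w do not break a run of same-coded letters
            prev = code
    return (result + "000")[:4]     # pad/truncate to the usual 4 characters

def did_you_mean(query: str, vocabulary: list) -> list:
    """Return vocabulary words that sound like the (possibly misspelled) query."""
    target = soundex(query)
    return [w for w in vocabulary if soundex(w) == target]

print(did_you_mean("monit", ["monet", "mughal", "bronze"]))  # → ['monet']
```

In the production system the vocabulary would come from the indexed artifact fields, and the comparison would run in SQL (`WHERE SOUNDEX(term) = SOUNDEX(?)`) rather than in application code.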
  14. We Want Excel! ● Though we got the archivists to use web forms, they still missed the ease of Excel ● So we gave it to them! With a Handsontable-based Mass Edit view for Zoo
  15. Bulk Processing's Gotta Be There! ● Bulk Edit ● Bulk Delete ● Bulk Add ● Custom import tools with volume processing & auto-mapping
  16. The Data Today ● 12 Classifications ● 100+ Sub-classifications ● 8 Masterlists ● 200,000 Artifacts – The baby's growing up!
  17. More data called for an architecture upgrade
  18. Need for a Better Search ● 200,000 records ● Zoo's data structure isn't optimised for search ● A MySQL-based indexer would hit limits down the line: UNIONs across 9 tables (a number that could grow) would make it slower
  19. Need for a Better Search... ● One- and two-letter autosuggest not supported by MySQL (3-character minimum limit) ● Normal search was not as fast as expected (load time brought down from ~0.8 secs to 0.3 secs)
  20. Getting the Data Ready for SOLR ● The MySQL indexer from the earlier phase was modified into a data normaliser that pushes data to SOLR ● A CLI script reads records to populate the SOLR index ● Using the PHP-SOLR library
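The core job of such a normaliser is flattening nested CCK records into flat SOLR documents. This is a minimal sketch of that step, in Python for illustration (the project used the PHP-SOLR library); the record shape, field names, and the `_s`/`_t` dynamic-field suffixes are assumptions, not the client's schema.

```python
import json

def normalise(record: dict) -> dict:
    """Flatten a nested Zoo-style record into a flat SOLR document.
    Field names here are illustrative, not the client's actual schema."""
    doc = {
        "id": record["id"],
        "title": record["title"],
        "classification_s": record["classification"],        # _s: string dynamic field
        "subclassification_s": record["subclassification"],
    }
    # Promote each custom Zoo element to a free-text (_t) dynamic field
    for name, value in record.get("elements", {}).items():
        doc[name + "_t"] = value
    return doc

record = {
    "id": "ANTQ-0001",
    "title": "Bronze oil lamp",
    "classification": "ANTQ",
    "subclassification": "arm",
    "elements": {"period": "18th century", "material": "bronze"},
}

# A CLI indexer would POST this JSON batch to SOLR's update handler,
# e.g. http://localhost:8983/solr/main/update?commit=true (URL is an assumption)
payload = json.dumps([normalise(record)])
print(payload)
```

The CLI script would loop over record batches, build one such payload per batch, and commit at the end rather than per document.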
  21. Getting SOLR into the Picture (diagram: Browser → PHP-SOLR Library → SOLR Main Index + Suggestions Index [planned]) ● Custom search replaced by SOLR ● SOLR hosted on a separate Amazon instance ● Initial implementation was only for search
  22. Benefits ● Much better natural-language search ● Better relevance scoring ● Full reindex every day ● Even browsing is now SOLR-powered, which means MORE SPEED! ● Record counts per category & sub-category easily achieved using faceting ● Now using SOLR's suggester module ● Using separate 'cores' for the main index and the suggest-terms index
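The per-category record counts mentioned above boil down to a single facet request. The sketch below builds the query string for such a request; the field names and the `/solr/main/select` endpoint are assumptions for illustration.

```python
from urllib.parse import urlencode

def facet_query(q: str = "*:*") -> str:
    """Build the query string for a SOLR facet request that returns record
    counts per classification & sub-classification. rows=0 because we want
    only the counts, not the matching documents themselves."""
    params = [
        ("q", q),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", "classification_s"),      # illustrative field names
        ("facet.field", "subclassification_s"),
        ("facet.mincount", "1"),                  # hide empty categories
        ("wt", "json"),
    ]
    return urlencode(params)

# The request would go to e.g. http://localhost:8983/solr/main/select?<query string>
print(facet_query())
```

SOLR answers with a `facet_counts` section listing each classification value and its document count, which the category navigation can render directly, so no extra COUNT(*) queries against MySQL are needed.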
  23. What's Coming? ● Autosuggest working directly off SOLR (currently piped through PHP) ● Delta indexing, currently not implemented due to the multitude of relational data ● A change in a bottom-level record needs to flow through to all its associations
  24. What else is so awesome about this?
  25. HTML5 Local Storage ● HTML5 local storage is used to cache data locally & load previously used data faster ● Sets the road for offline use in the future!
  26. Google Image Search, Anyone? ● AJAX grid pagination like Google Images ● Preloading & caching of images, CDN-backed delivery
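Continuous pagination of this kind reduces to translating a scroll-driven 'page' index into SOLR's `start`/`rows` offsets. A minimal sketch (the window size of 24 is an assumed value, not the project's actual setting):

```python
def page_window(page: int, rows: int = 24) -> dict:
    """Map an infinite-scroll 'page' index to SOLR's start/rows parameters.
    The front end appends each new window of results under the previous one."""
    return {"start": page * rows, "rows": rows}

# As the user scrolls, the browser requests successive windows via AJAX:
for page in range(3):
    print(page_window(page))
# → {'start': 0, 'rows': 24}
#   {'start': 24, 'rows': 24}
#   {'start': 48, 'rows': 24}
```

Each AJAX response carries one window of image records; the grid appends them and bumps the page counter, which is what produces the 'keep scrolling for more results' behaviour.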
  27. iOS App for iPad ● Powered by RESTful web services written on top of Joomla using com_api ● Initial version developed in HTML5 + Cordova (PhoneGap) ● Supports offline use of already-viewed data
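The 'offline use of already-viewed data' pattern is cache-on-read with a network fallback. This sketch shows the pattern in Python for illustration (the real app implemented it in HTML5/Cordova against com_api); the endpoint paths and the `fetch` callable are placeholders.

```python
import json
import os
import tempfile

class CachingClient:
    """Offline-first fetch (sketch): every successful response is cached
    locally, and cached copies are served when the network is unavailable."""

    def __init__(self, cache_path=None):
        self.cache_path = cache_path or os.path.join(
            tempfile.gettempdir(), "api_cache.json")
        self.cache = {}
        if os.path.exists(self.cache_path):
            with open(self.cache_path) as fh:
                self.cache = json.load(fh)

    def get(self, url, fetch):
        try:
            body = fetch(url)               # network call (urllib/requests in practice)
        except OSError:
            return self.cache.get(url)      # offline: fall back to the cached copy
        self.cache[url] = body              # online: refresh the cache
        with open(self.cache_path, "w") as fh:
            json.dump(self.cache, fh)
        return body

def fake_fetch(url):
    return {"artifact": "ANTQ-0001"}        # stands in for a live com_api response

def offline_fetch(url):
    raise OSError("network unreachable")    # simulates losing connectivity

client = CachingClient()
print(client.get("/api/artifact/1", fake_fetch))     # fetched & cached
print(client.get("/api/artifact/1", offline_fetch))  # served from cache
```

In the Cordova app the cache store is HTML5 local storage rather than a JSON file, but the read-through/fallback logic is the same.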
  28. Even More! The project is under continuous development; the features here only cover development up to the point this presentation was made. ● Online sale of images, downloads & rights management ● Research & teaching tools ● Social network ● Subscription-based privileged access
  29. Thank You! ● Questions? ● Interested in developing something similar? Drop us an email! Twitter: @techjoomla | @parthlawate