Importing Wikipedia in Plone

Eric BREHAULT – Plone Conference 2013

Transcript

  • 1. Importing Wikipedia in Plone Eric BREHAULT – Plone Conference 2013
  • 2. ZODB is good at storing objects ● Plone contents are objects, ● we store them in the ZODB, ● everything is fine, end of the story.
  • 3. But what if ... ... we want to store non-contentish records? Like polls, statistics, mailing-list subscribers, etc., or any business-specific structured data.
  • 4. Store them as contents anyway That is a powerful solution. But there are 2 major problems...
  • 5. Problem 1: You need to manage a secondary system ● you need to deploy it, ● you need to back it up, ● you need to secure it, ● etc.
  • 6. Problem 2: I hate SQL No explanation here.
  • 7. I think I just cannot digest it...
  • 8. How to store many records in the ZODB? ● Is the ZODB strong enough? ● Is the ZCatalog strong enough?
  • 9. My grandmother often told me "If you want to become stronger, you have to eat your soup."
  • 10. Where do we find a good soup for Plone? In a super souper!!!
  • 11. souper.plone and souper ● They provide both storage and indexing. ● A record can store any persistent, picklable data. ● Created by BlueDynamics. ● Based on ZODB BTrees, node.ext.zodb, and repoze.catalog.
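    Before records can be queried, a soup needs a catalog factory defining its indexes. A minimal sketch, roughly following souper's documented pattern and assuming a soup named 'mysoup' with the 'user', 'text', and 'keywords' attributes used on the next slides:

        from zope.interface import implementer
        from repoze.catalog.catalog import Catalog
        from repoze.catalog.indexes.field import CatalogFieldIndex
        from repoze.catalog.indexes.keyword import CatalogKeywordIndex
        from repoze.catalog.indexes.text import CatalogTextIndex
        from souper.interfaces import ICatalogFactory
        from souper.soup import NodeAttributeIndexer

        @implementer(ICatalogFactory)
        class MySoupCatalogFactory(object):
            """Builds the indexes for the 'mysoup' soup."""

            def __call__(self, context=None):
                catalog = Catalog()
                # each indexer reads one attribute from a record's attrs
                catalog[u'user'] = CatalogFieldIndex(NodeAttributeIndexer('user'))
                catalog[u'text'] = CatalogTextIndex(NodeAttributeIndexer('text'))
                catalog[u'keywords'] = CatalogKeywordIndex(
                    NodeAttributeIndexer('keywords'))
                return catalog

    The factory is registered as a named utility whose name matches the soup name ('mysoup' here), so get_soup() can find it.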
  • 12. Add a record
    >>> from souper.soup import get_soup
    >>> from souper.soup import Record
    >>> soup = get_soup('mysoup', context)
    >>> record = Record()
    >>> record.attrs['user'] = 'user1'
    >>> record.attrs['text'] = u'foo bar baz'
    >>> record.attrs['keywords'] = [u'1', u'2', u'ü']
    >>> record_id = soup.add(record)
  • 13. Record in record
    >>> record['homeaddress'] = Record()
    >>> record['homeaddress'].attrs['zip'] = '6020'
    >>> record['homeaddress'].attrs['town'] = 'Innsbruck'
    >>> record['homeaddress'].attrs['country'] = 'Austria'
  • 14. Access record
    >>> from souper.soup import get_soup
    >>> soup = get_soup('mysoup', context)
    >>> record = soup.get(record_id)
  • 15. Query
    >>> from repoze.catalog.query import Eq, Contains
    >>> [r for r in soup.query(Eq('user', 'user1') & Contains('text', 'foo'))]
    [<Record object 'None' at ...>]
    or using the CQE format:
    >>> [r for r in soup.query("user == 'user1' and 'foo' in text")]
    [<Record object 'None' at ...>]
  • 16. souper ● a soup container can be moved to a specific ZODB mount point (see the sketch below), ● it can be shared across multiple independent Plone instances, ● souper works on both Plone and Pyramid.
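    For instance, mounting a separate database keeps the soup out of the main Plone storage. A hypothetical zope.conf excerpt (database name, file path, and mount path are illustrative assumptions):

        <zodb_db soups>
            # separate FileStorage holding only the soup data
            <filestorage>
                path /var/plone/filestorage/soups.fs
            </filestorage>
            mount-point /plone/soups
        </zodb_db>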
  • 17. Plomino & souper ● we use Plomino to build non-content-oriented apps easily, ● we use souper to store huge amounts of application data.
  • 18. Plomino data storage Originally, documents (= records) were ATFolders. Capacity: about 30,000.
  • 19. Plomino data storage Since 1.14, documents are pure CMF objects. Capacity: about 100,000. Usually the Plomino ZCatalog contains a lot of indexes.
  • 20. Plomino & souper With souper, documents are just soup records. Capacity: several million.
  • 21. Typical use case ● Store 500,000 addresses, ● be able to query them in full text and display the results on a map (see the sketch below). Demo
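    A minimal sketch of that scenario; the soup name ('addresses') and the 'fulltext', 'latitude', and 'longitude' attributes are assumptions, not names from the talk:

        from repoze.catalog.query import Contains
        from souper.soup import get_soup

        def search_addresses(context, term):
            # full-text query over the (assumed) 'fulltext' index,
            # returning coordinates ready to plot on a map
            soup = get_soup('addresses', context)
            return [(r.attrs['latitude'], r.attrs['longitude'])
                    for r in soup.query(Contains('fulltext', term))]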
  • 22. What is the limit? Can we import Wikipedia in souper? Demo with 400,000 records Demo with 5.5 million records
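    An import of that size is typically done in batches, committing periodically so the transaction does not grow unbounded. A hypothetical sketch (the soup name, attribute names, and batch size are assumptions; producing the (title, text) pairs from a Wikipedia dump is left to the caller):

        import transaction
        from souper.soup import get_soup, Record

        def import_articles(context, articles, batch_size=10000):
            # articles: an iterable of (title, text) tuples
            soup = get_soup('wikipedia', context)
            for count, (title, text) in enumerate(articles, 1):
                record = Record()
                record.attrs['title'] = title
                record.attrs['text'] = text
                soup.add(record)
                if count % batch_size == 0:
                    # flush the current batch to disk
                    transaction.commit()
            transaction.commit()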
  • 23. Conclusion ● Usage performance is good, ● Plone performance is not impacted. Use it!
  • 24. Thoughts ● What about a REST API on top of it (a sketch below)? ● Massive imports are slow and difficult; could they be improved?
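    One way such a REST API could look: a plain Zope browser view serializing query results as JSON. A hypothetical sketch (the view class, soup name, and 'q' parameter are assumptions, not an existing API):

        import json

        from Products.Five.browser import BrowserView
        from repoze.catalog.query import Contains
        from souper.soup import get_soup

        class SoupSearchView(BrowserView):
            """Hypothetical view returning matching soup records as JSON."""

            def __call__(self):
                term = self.request.get('q', '')
                soup = get_soup('mysoup', self.context)
                # record.attrs behaves like a mapping, so dict() copies it
                results = [dict(r.attrs) for r in
                           soup.query(Contains('text', term))]
                self.request.response.setHeader('Content-Type',
                                                'application/json')
                return json.dumps(results)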
  • 25. Makina Corpus For all questions related to this talk, please contact Éric Bréhault eric.brehault@makina-corpus.com Tel : +33 534 566 958 www.makina-corpus.com