dylan@pretaweb.com, Plone Conf 2010, Dylan Jay
FunnelWeb
Easy Content Conversions
Dylan Jay
PretaWeb

Funnelweb ploneconf2010


PloneConf 2010 talk about FunnelWeb, a framework for easy content conversion that simplifies importing any site into Plone.


Transcript of "Funnelweb ploneconf2010"

  1. FunnelWeb
     Easy Content Conversions
     Dylan Jay
     PretaWeb
  2. Content Conversions suck
     • Large existing sites
     • Static HTML or old CMS
     • Hard to quote on
     • Content audit
     • Use Plone to fix content
     • Convert Docs to Pages (coming...)
  3. History
     • 2008 - Obrien Intranet
     • 2009 - pretaweb.funnelweb (deprecated)
     • Plone UI > Actions > Import
     • 2010 - transmogrify.* release on PyPI
     • 2010 - collective.developermanual
     • Sphinx to Plone
     • 2010 - funnelweb Recipe + Script
     • Thanks - Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap
  4. Demo
  5. funnelweb.recipe
     • Add to buildout
       [funnelweb]
       recipe = funnelweb
       crawler-url=http://www.whitehouse.gov
  6. bin/funnelweb
     • Crawls
     • Caches locally
     • Filters
     • Removes template
     • Restructures
     • Determines title, hidden etc.
     • Uploads to Plone
  7. Common Options
     • crawler:site_url
     • crawler:ignore
     • ploneupload:target
     • template1:description
     • template1:text
     • *-disable
  8. Command Line
     • bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
  9. Viewing the Pipeline
     • bin/funnelweb --pipeline
  10. Custom pipeline
      • bin/funnelweb --pipeline > pipeline.cfg
      • {edit} pipeline.cfg
      • bin/funnelweb --pipeline=pipeline.cfg
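  11. Making your own blueprint

      from collective.transmogrifier.interfaces import ISection, ISectionBlueprint
      from zope.interface import classProvides, implements

      class MyBlueprint(object):
          # The class is the blueprint; its instances are pipeline sections
          classProvides(ISectionBlueprint)
          implements(ISection)

          def __init__(self, transmogrifier, name, options, previous):
              # `previous` iterates over the items from the preceding section
              self.previous = previous

          def __iter__(self):
              for item in self.previous:
                  dosomethingto(item)  # placeholder for your own processing
                  yield item

      Register it in ZCML:

      <utility component=".myblueprint.MyBlueprint" name="transmogrify.myblueprint" />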
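The chaining idea behind the blueprint slide can be tried without Plone installed. The following is a minimal sketch in plain Python, not the collective.transmogrifier implementation: the constructor is simplified to `(options, previous)`, and `DropPaths`, `SetTitle`, and `build_pipeline` are hypothetical names made up for illustration. Only the `_path` item key follows the transmogrifier convention.

```python
class DropPaths(object):
    """Hypothetical section: drops items whose path starts with a prefix."""

    def __init__(self, options, previous):
        self.previous = previous
        self.prefix = options["prefix"]

    def __iter__(self):
        for item in self.previous:
            if not item["_path"].startswith(self.prefix):
                yield item


class SetTitle(object):
    """Hypothetical section: derives a title from the last path segment."""

    def __init__(self, options, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            name = item["_path"].rsplit("/", 1)[-1]
            item["title"] = name.replace("-", " ").title()
            yield item


def build_pipeline(source, sections):
    """Chain sections so each one iterates over the one before it."""
    previous = iter(source)
    for factory, options in sections:
        previous = iter(factory(options, previous))
    return previous


# Stand-in for crawled items; a real run would get these from the crawler.
crawled = [
    {"_path": "about-us"},
    {"_path": "tmp/cache"},
    {"_path": "news/big-launch"},
]
pipeline = build_pipeline(crawled, [(DropPaths, {"prefix": "tmp/"}), (SetTitle, {})])
result = list(pipeline)
```

Because every section is a generator over the previous one, items flow through the whole chain one at a time, and a section drops an item simply by not yielding it.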
  12. transmogrify.webcrawler
      • transmogrify.webcrawler
        • Crawls site or cache for content
      • transmogrify.webcrawler.typerecognitor
        • Sets Plone content type based on mime-type
      • transmogrify.webcrawler.cache
        • Saves content to disk
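As a rough illustration of setting a content type from a mime-type, here is a sketch under stated assumptions. The mapping rules, the `recognise_type` helper, and the `_mimetype` key are assumptions for this example; the slides do not show the real blueprint's rules or options. `Document`, `Image`, and `File` are standard Plone type names.

```python
def recognise_type(mimetype):
    """Map a MIME type to a Plone content type name (hypothetical rules)."""
    if mimetype in ("text/html", "application/xhtml+xml"):
        return "Document"
    if mimetype.startswith("image/"):
        return "Image"
    # Anything else is treated as a plain downloadable file.
    return "File"


def typerecognitor(items):
    """Section-style generator: annotate each item with a `_type` key."""
    for item in items:
        item["_type"] = recognise_type(item.get("_mimetype", ""))
        yield item


crawled = [
    {"_path": "index.html", "_mimetype": "text/html"},
    {"_path": "logo.png", "_mimetype": "image/png"},
    {"_path": "report.pdf", "_mimetype": "application/pdf"},
]
typed = list(typerecognitor(crawled))
```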
  13. transmogrify.htmlcontentextractor
      • transmogrify.htmlcontentextractor
        • Provide XPath for title, description, text etc.
      • transmogrify.htmlcontentextractor.auto
        • Guesses XPaths from content
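To make the "XPath per field" idea concrete, the sketch below pulls title, description, and text out of a page using the standard library. It is an illustration only and not how transmogrify.htmlcontentextractor is implemented: `RULES`, `extract_fields`, and the sample page are hypothetical, and `xml.etree.ElementTree` supports only a subset of XPath and needs well-formed markup, which real crawled HTML often is not.

```python
import xml.etree.ElementTree as ET

# Hypothetical field -> XPath rules, one per piece of content to keep.
RULES = {
    "title": ".//h1",
    "description": ".//p[@class='summary']",
    "text": ".//div[@id='content']",
}


def extract_fields(html, rules=RULES):
    """Return a dict of field -> text for every rule that matches."""
    root = ET.fromstring(html)
    item = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            # Join all nested text so inline markup does not split the value.
            item[field] = "".join(node.itertext()).strip()
    return item


PAGE = """<html><body>
<div id="nav">Home | About</div>
<h1>Welcome</h1>
<p class="summary">A short summary.</p>
<div id="content"><p>Body text.</p></div>
</body></html>"""

fields = extract_fields(PAGE)
```

Anything the rules do not select, such as the navigation block here, is left behind, which is the sense in which the site template gets removed.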
  14. transmogrify.siteanalyser
      • transmogrify.siteanalyser.relinker
        • Moves, renames, URL tidying
      • transmogrify.siteanalyser.title
        • Guesses page titles
      • transmogrify.siteanalyser.defaultpage
        • Moves index pages into folders
      • transmogrify.siteanalyser.attach
        • Moves attachments closer to pages
  15. transmogrify.ploneremote
      • Remoteconstructor
        • Adds content to Plone via XML-RPC
      • Remoteschemaupdater
        • Updates content of existing object
      • Remotenavigationexcluder
        • Hides content not in the original site's navigation
      • Remoteworkflowupdater
        • Publishes content
      • Remoteredirector
        • Creates aliases for items that have moved
  16. Other blueprints
      • transmogrify.pathsorter
        • Puts folders before content and content in right order
      • collective.transmogrifier.sections.condition
        • Useful to drop certain content
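The "folders before content" ordering matters because a folder has to exist before anything can be uploaded into it. One simple way to get that ordering is sketched below. It assumes items carry a `_path` key, and `sort_by_path` is a hypothetical helper, not the transmogrify.pathsorter code. It also only covers parents-first ordering, not whatever "right order" means for siblings in the real blueprint.

```python
def sort_by_path(items):
    """Return items ordered so every folder precedes the content inside it."""
    # Comparing paths as tuples of segments puts a parent ("docs",) before
    # its children ("docs", "guide") and keeps each subtree together.
    return sorted(items, key=lambda item: tuple(item["_path"].split("/")))


crawled = [
    {"_path": "docs/guide/intro"},
    {"_path": "docs"},
    {"_path": "about"},
    {"_path": "docs/guide"},
]
ordered = [item["_path"] for item in sort_by_path(crawled)]
```

Unlike the streaming sections, a sorter has to read every item before it can emit the first one, so it holds the whole item list in memory.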
  17. Where to get it
      • http://github.com:djay/funnelweb.git
      • http://github.com:djay/transmogrify.*
      • PyPI release TBA
  18. #TODO
      • Extract content styles into visual editor
  19. Thanks
      • djay@pretaweb.com
      • IRC: djjay
      • Twitter: djay75