Large Files without the Trials


Sally Kleinfeldt and Aaron VanDerlip describe ore.bigfile, a minimalist solution to the problem of uploading, downloading, and versioning very large files in Plone.


Transcript of "Large Files without the Trials"

  1. Large Files Without the Trials
     Aaron VanDerlip and Sally Kleinfeldt
     Plone Symposium East 2010
     Friday, May 28, 2010
  2. Acknowledgments
     • Bioneers provides environmental education and social connectivity through conferences, radio and TV, books, and online materials
     • Engaged Jazkarta to build a file asset server based on Plone to help them organize, capture, and store multimedia and textual content with files as large as 5 GB.
  3. Acknowledgments
     • Aaron VanDerlip - Project Manager
     • Kapil Thangavelu - Developer
     Notes: Bioneers funded a project “for a file-asset server system based on Plone” that would “support the upload and retrieval of files as large as 5GB”.
  4. What is a Big File?
     • Anything that makes you wait...
  5. Plone Problems with Big Files
     1. Uploading/Downloading
     2. Versioning
  6. Uploading Big Files
     • Both the user and a Zope thread are waiting for the file transfer
  7. Notes: Typically Zope has to process the entire Request coming from Apache. This can cause Zope to block if it has to process large Request bodies.
  8. Uploading Big Files
     • Browser encodes file in multipart mime format
     • Zope must undo this encoding
     • CPU and memory intensive, and SLOW
     • Zope thread is blocked during this process
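To make the cost on slide 8 concrete, here is a minimal sketch of what "undoing" a multipart upload involves. It is a deliberately simplified single-part encoder and decoder written for illustration, not Zope's request parser; the boundary string and the function names are assumptions. The point it shows is that the decoder has to hold and scan the whole request body before it can hand back the file bytes.

```python
# Simplified illustration only: one file part, a fixed boundary, no escaping.
BOUNDARY = b"----sketchboundary"  # hypothetical; browsers generate their own


def encode_multipart(field: bytes, filename: bytes, payload: bytes) -> bytes:
    """Wrap a payload roughly the way a browser form upload would."""
    return b"\r\n".join([
        b"--" + BOUNDARY,
        b'Content-Disposition: form-data; name="%s"; filename="%s"'
        % (field, filename),
        b"Content-Type: application/octet-stream",
        b"",
        payload,
        b"--" + BOUNDARY + b"--",
        b"",
    ])


def decode_multipart(body: bytes) -> bytes:
    """Recover the payload by scanning the full body for the delimiters."""
    delimiter = b"--" + BOUNDARY
    # Skip the opening delimiter line and its trailing CRLF.
    start = body.index(delimiter) + len(delimiter) + 2
    # Part headers end at the first blank line.
    header_end = body.index(b"\r\n\r\n", start)
    payload_start = header_end + 4
    # The payload runs until the CRLF that precedes the next delimiter.
    payload_end = body.index(b"\r\n" + delimiter, payload_start)
    return body[payload_start:payload_end]
```

Both the search for the closing delimiter and the final slice touch every byte of the upload, which is why doing this inside an application thread scales poorly with file size.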
  9. Downloading Big Files
     • ...the same thing happens in reverse
  10. Learning from Rails
     • Get file encoding/decoding and read/write operations out of Plone
     • Web servers are really good at this - Apache, Nginx, and Lighttpd
     • Our implementation uses Apache
     • Apache file streaming is fast and threads are cheap
     Notes: Elizabeth Leddy mentioned the similarities between Ruby and Python web apps yesterday, and the value of adopting Rails tools where appropriate.
  11. Learning from Rails
     • Uploads: Apache plus mod_porter http://therailsway.com/tags/porter
     • Downloads: Apache plus mod_xsendfile http://john.guen.in/past/2007/4/17/send_files_faster_with_xsendfile/
     • ...and of course ZODB Blob storage
  12. Mod Porter
     • Parses the multipart mime data
     • Writes the file to disk
     • Changes the Request to contain a pointer to the temp file on disk
     • All done efficiently in C code inside your Apache process
  13. Mod Porter
     Notes: Mod Porter processes the multipart mime data quickly and writes it to disk. It then sends the modified, lighter-weight Request to Zope.
  14. Apache Config for Mod Porter
       LoadModule apreq_module /usr/lib/Apache2/modules/mod_apreq2.so
       LoadModule porter_module /usr/lib/Apache2/modules/mod_porter.so
       # Apache has a default read limit of 64MB, set it higher
       APREQ2_ReadLimit 2G
       ...
       Porter On
       # Files below this size will not be handled by mod-porter
       PorterMinSize 14M
       # Where the uploaded files are stored
       PorterDir /mnt/uploads-Apache
  15. X-Sendfile
     • HTTP header
     • Set an X-Sendfile header and the path of a file on your response
     • Apache does the rest
  16. Apache Config for X-Sendfile
       LoadModule xsendfile_module /usr/lib/Apache2/modules/mod_xsendfile.so
       ...
       EnableSendfile On
       XSendFile on
       # Config to send file resources directly from blob storage
       XSendFilePath /mnt/bioneers/var/blobstorage
  17. Using X-Sendfile from Python
       def download(self, response, file_path):
           response.setHeader("X-Sendfile", file_path)
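The one-line handler on slide 17 can be exercised outside Zope with a stand-in response object. The sketch below is an assumption-laden illustration: `FakeResponse` is a hypothetical stub that models only `setHeader`, and the extra `Content-Type` header and the example blob path are not from the slides. It shows that the application returns headers and an empty body, leaving the actual file transfer to the web server.

```python
class FakeResponse:
    """Hypothetical stand-in for a Zope response; only setHeader is modelled."""

    def __init__(self):
        self.headers = {}
        self.body = b""

    def setHeader(self, name, value):
        self.headers[name] = value


def download(response, file_path, content_type="application/octet-stream"):
    """Ask the front-end web server to stream file_path on our behalf."""
    response.setHeader("Content-Type", content_type)
    # mod_xsendfile sees this header, strips it, and sends the file itself.
    response.setHeader("X-Sendfile", file_path)
    # The body is intentionally left empty: the application never reads the file.
    return response
```

With the Apache configuration on slide 16, a path under the configured `XSendFilePath` would be served directly by Apache, so the Zope thread is released as soon as the headers are written.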
  18. Blob Storage
     • Uploads
       • Blob.consumeFile moves file from Apache’s temp area to blob storage (ZODB/blob.py)
       • Uses os.rename, file never enters Plone
     • Downloads
       • Served directly from blob storage
  19. Upload Process
     Notes: File Data is written to local disk. Blob.consumeFile is called with parameters from the Request containing the location of the file.
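The idea behind "uses os.rename, file never enters Plone" can be sketched with the standard library alone. This is not ZODB's `Blob.consumeFile` implementation; `consume_file`, its arguments, and the directory layout are hypothetical names chosen for illustration. What it shows is that the uploaded file changes location without its contents ever being read into the Python process.

```python
import os


def consume_file(src_path: str, storage_dir: str, name: str) -> str:
    """Move an already-written upload into a storage directory.

    The file's bytes are never read here; only its directory entry changes.
    """
    os.makedirs(storage_dir, exist_ok=True)
    dest = os.path.join(storage_dir, name)
    # os.rename is a metadata operation when source and destination are on
    # the same filesystem; across filesystems it raises OSError.
    os.rename(src_path, dest)
    return dest
```

One consequence of relying on a rename is that the web server's upload directory and the blob storage directory need to live on the same filesystem for the move to stay cheap.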
  20. What About Really Really Big Files?
     • Use FTP
     • Supports continuation and batching
     • Handles files too large for browser limits
     • Content editors use FTP to transfer files to an upload directory
     Notes: SFTP guarantees continuation
  21. UI
  22. Uploading with FTP
     Notes: For very large file uploads (that may run into browser limits), the file is uploaded using SFTP to support continuation. The file name is passed via Plone to Blob.consumeFile and the file is processed in a similar manner.
  23. ore.bigfile
     • Minimally intrusive, works with the grain of Plone
     • Provides Big File content type
     • IFrontendFileServer interface defines two methods that provide web server support for upload and download
     • Apache and Nginx implementations provided
  24. ore.bigfile Limitations
     • Upload directory is hardcoded
     • Possibility of error on very large images which Mod Porter intercepts
  25. Versioning Big Files
     Notes: CMFEditions has a limit on file size of 34 MB. It also makes a new file copy for every version, even if only metadata changed.
  26. Solution
     • Bypass CMFEditions - no file size limitation
     • Create a new version only when file changes (not metadata)
     • Allow old versions to be purged
     • Version information stored on Big File object using annotations
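The versioning policy on slide 26 can be sketched as a small in-memory history. Everything here is hypothetical: the class and method names are invented for illustration, a plain list stands in for the annotations on the Big File object, and comparing SHA-256 digests is an assumed way to detect that the file changed (the slides do not say how ore.bigfile detects it). Hashing an in-memory byte string is also only for brevity; a multi-gigabyte file would need a streamed digest or a different change signal.

```python
import hashlib


class VersionHistory:
    """Hypothetical sketch: version only on file change, allow purging."""

    def __init__(self):
        # Stand-in for annotation storage on the content object.
        self.versions = []

    def save(self, data: bytes, metadata: dict) -> bool:
        """Record a save; return True only if a new version was created."""
        digest = hashlib.sha256(data).hexdigest()
        if self.versions and self.versions[-1]["digest"] == digest:
            # Same file contents: update metadata in place, no new version.
            self.versions[-1]["metadata"] = dict(metadata)
            return False
        self.versions.append({"digest": digest, "metadata": dict(metadata)})
        return True

    def purge(self, keep_last: int) -> int:
        """Drop all but the newest keep_last versions; return how many went."""
        if keep_last < 1:
            raise ValueError("keep_last must be at least 1")
        removed = max(0, len(self.versions) - keep_last)
        del self.versions[:removed]
        return removed
```

A metadata-only edit leaves the version count unchanged, which is the behaviour the slide contrasts with CMFEditions copying the file for every version.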
  27. Conclusion
     • ore.bigfile solves the Big File problem for a particular use case; it is not feature complete
     • It does so by taking advantage of mature web server technology
     • The code is minimally intrusive
     • It provides a strategy for implementation we can learn from as we improve Plone’s Big File story
  28. UI
  29. Questions
     http://svn.objectrealms.net/view/public/browser/ore.bigfile
     Notes: Why not Tramline?
     - older, not blob-aware, no ftp, no versioning
     - requires modification of mod_python