Book Widget -- Embedding automated photo-document publication on the web and in mobile devices

In which the author explains how to design large-scale cloud platforms for document processing

Published in: Technology

  1. Book Widget -- Embedding automated photo-document publication on the web and in mobile devices
     E. O’Brien-Strain, A. Hunter, J. Liu, Q. Lin, D. Tretter, J. Wang, X. Zhang, and P. Wu
     Hewlett-Packard Labs
  2. In which the author ... explains how to design large-scale cloud platforms for document processing
  3. Outline
     • Motivation
     • Architectural Principles
     • Platform Implementation
     • Example Application – Facebook Photo-Books
  4. Motivation
     • We have a wealth of auto-publishing algorithms
     • Want to provide them for third parties to use
     • By building a cloud-based platform that
         • Is flexible and programmable
         • Is secure and private
         • Is infinitely scalable
         • Has high availability and responsiveness
         • Reconciles WYSIWYG and style-driven use models
  5. Architectural Principles
     • REST
     • Capability Security Authorization
     • Authentication Agnostic
     • Scaling: Elastic, Sessionless, NoSQL, Caching
     • Orthogonal: UI / Content Source / Artifacts
  6. REST
     • “Representational State Transfer”
     • Architectural pattern for creating network APIs
     • All API calls are HTTP requests to some URL
         • GET to retrieve data from a URL
         • PUT to write data to a URL
         • POST to perform some action on a URL
         • DELETE to remove the data from the URL
     • Starting from the response from an initial URL
         • client code finds other URLs to operate on
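The verb-to-operation mapping above can be sketched with a tiny in-memory stand-in for an HTTP service. This is an illustration only, not the platform's API: the class name `TinyRestService`, the `/document` path, the `links` field, and the status codes chosen are all assumptions, and sequential ids are used purely for brevity.

```python
import itertools


class TinyRestService:
    """In-memory stand-in for an HTTP service; URLs are dictionary keys."""

    def __init__(self):
        self._ids = itertools.count(1)
        # The initial URL advertises where the client can go next.
        self._resources = {"/": {"links": {"documents": "/document"}}}

    def request(self, method, url, body=None):
        """Return a (status, body) pair for one request."""
        if method == "GET":
            if url not in self._resources:
                return 404, None
            return 200, self._resources[url]
        if method == "PUT":
            # Write the given data to the URL, replacing what was there.
            self._resources[url] = body
            return 200, body
        if method == "POST" and url == "/document":
            # Action: create a new document and tell the client its URL.
            new_url = f"/document/{next(self._ids)}"
            self._resources[new_url] = body
            return 201, {"url": new_url}
        if method == "DELETE":
            if url not in self._resources:
                return 404, None
            del self._resources[url]
            return 204, None
        return 405, None
```

A client would start with `GET /`, read `links["documents"]` from the response, and `POST` there to obtain the URL of a new document, so the only URL it needs to know in advance is the initial one.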
  7. Capability Security Authorization
     • An example of a URL from our API:
     • http://foo.com/document/qY9vZObN-slqsv_RWnJB4w/content/chunks.json
     • Contains a cryptographically secure random string
     • If you do not know this URL there is no way to find it
     • Possess URL <=> Authorized to use URL
     • “Moderate” level of security
         • Still vulnerable to network snoopers
         • Can use SSL to increase security
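A minimal sketch of the capability-URL idea, assuming the random string is generated from 16 random bytes. That size is an inference from the 22-character token in the example URL, not something the slides state. The function names and the in-memory `_documents` store are hypothetical.

```python
import secrets

BASE_URL = "http://foo.com"  # host taken from the slide's example URL

_documents = {}  # token -> stored content; hypothetical in-memory store


def create_document(content):
    """Store content under a fresh unguessable token and return its URL."""
    token = secrets.token_urlsafe(16)  # 16 random bytes -> 22 URL-safe characters
    _documents[token] = content
    return f"{BASE_URL}/document/{token}/content/chunks.json"


def fetch(url):
    """Return the content if the URL names a known token, else None."""
    parts = url.split("/")
    # Expected shape: 'http:', '', host, 'document', token, 'content', 'chunks.json'
    if len(parts) != 7 or parts[3] != "document":
        return None
    return _documents.get(parts[4])
```

There is no user check anywhere in `fetch`: holding the URL is the authorization, which is why the token has to be long and drawn from a cryptographically secure source.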
  8. Authentication Agnostic
     • No concept of a “User”
         • Instead just store anonymous resources
     • Client code expected to
         • keep track of users and authenticate them
         • remember which resources belong to each user
     • Gives flexibility to client to use any authentication
         • No need for complexity of “single-sign-on”
     • Allows us to
         • avoid many security/privacy headaches
         • avoid complexity and cost of using SSL
         • have our data cached in Internet infrastructure
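The division of labor above puts all user bookkeeping on the client side. A rough sketch of what that client-side record might look like, assuming a hypothetical `ClientApp` class and string user ids (neither appears in the slides):

```python
class ClientApp:
    """Hypothetical third-party client that owns all knowledge about users."""

    def __init__(self):
        self._urls_by_user = {}  # user id -> list of resource URLs

    def remember(self, user_id, resource_url):
        """Record that a resource URL belongs to an already authenticated user."""
        self._urls_by_user.setdefault(user_id, []).append(resource_url)

    def resources_for(self, user_id):
        """Return a copy of the URLs recorded for this user."""
        return list(self._urls_by_user.get(user_id, []))
```

How `user_id` is authenticated is entirely the client's choice; the platform only ever sees requests to anonymous resource URLs.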
  9. Scaling
     • Can use elastic infrastructure cloud
         • rapid spin-up and spin-down of virtual servers
     • Sessionless
         • Bank of web servers operating in parallel
         • Sequence of HTTP requests sprays out arbitrarily over multiple servers
     • NoSQL
         • Highly-distributed no-master key-value store
     • Caching at every level
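The sessionless property can be illustrated with a bank of servers that keep no state of their own and share one key-value store. This is a single-process sketch under that assumption; the class and function names are hypothetical, and a plain dictionary stands in for the distributed store.

```python
import random


class KeyValueStore:
    """Stand-in for the shared key-value store."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class WebServer:
    """Holds no per-client state, only a reference to the shared store."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def handle(self, method, url, body=None):
        if method == "PUT":
            self._store.put(url, body)
            return body
        return self._store.get(url)


def dispatch(servers, method, url, body=None, rng=random):
    """Send one request to an arbitrarily chosen server."""
    return rng.choice(servers).handle(method, url, body)
```

Because every server answers from the same store, a `PUT` handled by one server is visible to a later `GET` handled by any other, so servers can be added or removed without migrating sessions.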
  10. Any Permutation of
     • User interface for creating documents
         • Web (HTML5 or Flash)
         • Mobile device
         • PC application
     • Where content comes from
         • Social network
         • Photo site / document storage site
         • PC folder
     • What kind of artifacts
         • Print at home, at retail, or at a print service provider (PSP)
         • View on e-Book reader, slate, or phone
  11. Platform Implementation
     • Initially Targeting Photo-Oriented Documents
     • Unified Model for Document + Template + Content
     • Content Transformation
     • Transactional Data
     • Embeddable as a Widget
     • Monetizable
  12. Initial Target
     • Platform architected to handle a wide variety of documents
     • Initially handles photo-oriented documents
         • Such as photo-books
     • Can be extended to handle more text-heavy documents
         • Such as magazines
  13. Unified “Document” Model
     • Document = content + “rendersheet”
     • A single “document” resource type for
         • User documents (content + rendersheet)
         • Collections of input content (just content)
         • Templates (just rendersheet)
     • Any document can be used as a template for a new document
     • Any document can be used as a source of content for a new document
     • Look-and-feel of one document can be applied to another
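One way to picture the unified model is a single type in which either half may be empty. The sketch below is an assumption about shape only: the field types, the `Document` class, and the `new_from` helper are hypothetical, and the slides do not say how a rendersheet is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    # Photo references or similar; empty for a pure template.
    content: tuple = ()
    # Look-and-feel description; None for a pure content collection.
    rendersheet: Optional[dict] = None


def new_from(template: Document, source: Document) -> Document:
    """Combine the rendersheet of one document with the content of another."""
    return Document(content=source.content, rendersheet=template.rendersheet)
```

Since templates, content collections, and user documents are all `Document` values, any of them can be passed as either argument: a finished photo-book can serve as the `template` for another user's photos, which is the "apply look-and-feel" case.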
  14. Algorithms
     • Auto-organizing algorithms using
         • Photo quality
         • Near-duplicate detection using structural similarity and time proximity
     • Auto-layout algorithms
         • BRIC (blocked recursive image composition)
         • START (structured layout for resizable background art)
  15. Transactional Data
     • Resources are not stored indefinitely
         • Have an expiration date
     • Two top-level types of resources
         • Documents (composed of “Chunks”)
         • Artifacts
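Expiring resources can be sketched as a store that records an expiry time with each entry and refuses to return it afterwards. The `ExpiringStore` class, the lazy purge on read, and the injectable clock are illustrative assumptions; the slides do not describe how expiration is enforced.

```python
import time


class ExpiringStore:
    """Key-value store whose entries disappear after their expiration time."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._items = {}     # url -> (expires_at, value)

    def put(self, url, value, ttl_seconds):
        self._items[url] = (self._clock() + ttl_seconds, value)

    def get(self, url):
        entry = self._items.get(url)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[url]  # purge lazily, on first access after expiry
            return None
        return value
```

For example, `store.put(url, chunks, ttl_seconds=3600)` would keep a document readable for one hour, after which `store.get(url)` returns `None`.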
  16. Embeddable as a Widget
     • Can be embedded in Web or mobile application
     • Third-party developer can
         • write their own document design user-interface
         • or they can use the Flash widget that we provide
  17. Monetizable
     • We include features that allow for a variety of different business models
         • Each client application must register with us
         • API key and “shared secret” token
         • All client requests that create or modify resources must be signed with the secret
         • All resources are marked with the client application that created them
         • All resources have a “time to live” before they are deleted
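The slides do not say how requests are signed. A common approach, assumed here, is an HMAC over the request computed with the shared secret; the message layout (method, URL, and body joined by newlines), the use of SHA-256, and the function names are all hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, method: str, url: str, body: bytes = b"") -> str:
    """Return a hex HMAC-SHA256 signature over method, URL, and body."""
    message = method.encode() + b"\n" + url.encode() + b"\n" + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, method: str, url: str, body: bytes, signature: str) -> bool:
    """Recompute the signature server-side and compare in constant time."""
    expected = sign(secret, method, url, body)
    return hmac.compare_digest(expected, signature)
```

Under this scheme the client would send its API key and the signature along with each `PUT` or `POST`; the server looks up the secret registered for that key, calls `verify`, and marks the resulting resource with the client application. Read-only requests need no signature, which keeps responses cacheable.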
  18. Example Application
     • Facebook Application
     • Built by team of outside developers
     • Uses our UI widget for creating and viewing photobooks
     • Integrates nicely into Facebook site
         • Leverages social connections of users
         • To make application more viral
  20. Summary
     • Introduction
     • Architectural Principles
     • Platform Implementation
     • Example Application – Facebook Photo-Books