Relational Won't Cut It: Architecting Content Centric Apps
 

In the past, developers have chosen to develop their own content-centric apps from scratch or by leveraging low level libraries. A content repository like Alfresco can save time and cost. Even if you don't choose Alfresco, you should still consider leveraging a standard API like CMIS as much as possible.

Usage Rights

© All Rights Reserved

  • "We're drowning in documents (or videos or images). We don't know what we have and none of it is organized. We waste so much time and money recreating stuff that probably already exists, if we could just find it."
    "We've got serious business risk caused by people using the first thing they find instead of the right thing."
    "We have a process for sending stuff around to the rest of the team for review and approval, but we have no idea what's in flight or who we're waiting on or why."
    "We have teams of people from both inside and outside the organization that need to be able to work together efficiently. They need to share files, of course, but really, it's more than that."
    "We've got business systems that generate, store, and process things like reports and images at an alarming rate."
  • May start out simple, but the system tends to morph over time. Let’s look at three “levels” of content-centric app complexity.
  • Process, Security, & Search
    Some open source search engines that are out there
    Open source process engines: JBoss jBPM, Activiti, Intalio, BonitaSoft
    Some open source libraries that may be helpful in extraction/conversion: Tika, FOP, POI, ImageMagick, JAI
  • You’ve built a system that’s pretty bad-ass, and it is customized to your specific needs, but at what cost?
  • Is it easy to extend? Does it get out of the way?
  • CMIS Alfresco extensions support features that are out of scope for CMIS 1.0, such as aspects and datalists.
  • Founded in 2005
    John Newton: founding developer of Ingres; co-founded Documentum
    John Powell: COO of Business Objects; President of Oracle UK
    Lots of engineers from Documentum, Interwoven, Vignette
    Assembled from open source components
  • Pick your stack:
    Linux / Windows OS servers: RHEL, Solaris, Ubuntu, Windows Server, …
    DBMS: MySQL, MS SQL, Oracle, PostgreSQL, DB2, …
    Application servers: Tomcat, JBoss, WebLogic, WebSphere, …
    Web browsers: Firefox, MSIE, Safari
    Identity management systems: LDAP, AD, Kerberos, …
  • Activiti first appeared in the Alfresco 3.4 E preview release and will be production ready with 4.0

Presentation Transcript

  • Relational Won't Cut It: Architecting Content-Centric Applications for Java
    Jeff Potts
    Chief Community Officer
  • Agenda
    What is a content-centric application?
    Do-it-yourself approaches
    A better way: The Platform Approach
    Content Management Interoperability Services (CMIS) Standard
    Alfresco technical overview
    Repository services
    APIs
  • What is a Content-Centric Application?
    Web application with a mix of structured and unstructured data
    Unstructured data is typically file-based
    Office documents
    Images
    Audio/Video
    Reports
    Usually collaborative
    May also include business processes
  • A Few Examples
    Expense report review & approval
    Contract negotiation, creation, & review
    Press request/fulfillment
    Research study authoring
    Sales/Marketing collateral creation & communication
    Course guide ("student packet") authoring/publishing
  • Or the business is saying
    I’ve got a ton of files
    I’ve got people that produce them, sometimes collaboratively, and people that consume them.
    I want to somehow make it easier to deal with all of this.
    Source: eqqman
  • Pains
    Inability to find important content
    Black hole process
    Re-creating the wheel
    Productivity loss
    Higher costs
    Using outdated content
    Legal/business risk
    Loss-of-life/injury
    Source: khainomore
  • Components of content-centric systems
    User Interface
    Persistence/Data Model/Metadata
    Business Processes/Workflow
    Library Services (Upload/Download, Versioning, & Check-in/Check-out)
    Security
    Search
    Transforms/Renditions/Thumbs
    Tagging/Categorization
    Authoring tool integration
    Remote API
    Scheduler
    Comments/Ratings/Activity Streams
  • Let’s Build it Ourselves!
  • DIY approach seems simple
    “What’s so special about content-centric apps?”
    Standard web app toolkit
    Favorite front-end/presentation framework
    Relational Database
    Data Model/Metadata
    Comments/Ratings
    Tagging/Categorization
    Files? Generally, a Bad Idea
  • Files: Relational may not cut it
    Relational is good at text and numbers. Binary data, YMMV
    Size limits
    Random seek (streaming)
    Search: Some relational databases can index into blobs, but not all
  • File storage options
    On disk
    Amazon S3 or an internal CAS filer
    Source code control repository
    XML database
    NoSQL document store
    Content repository
    Apache Jackrabbit
    Alfresco
    Other open source and proprietary repositories
  • Content repository
    • Content = a file + metadata
    • File system
    • Content binaries
    • Search indexes
    • Database
    • Relations (associations)
    • Metadata
    • Repository
    • Abstraction layer
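
The split described on this slide (binaries on the file system, metadata in a database, a repository layer hiding both) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `TinyRepository` and its methods are hypothetical names, an in-memory map stands in for the database, and nothing here reflects how Alfresco or Jackrabbit actually store content.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: a repository facade over a binary store and a metadata store.
final class TinyRepository {
    // Directory that stands in for the file system content store.
    private final Path binaryRoot;
    // In-memory map that stands in for the relational metadata store.
    private final Map<String, Map<String, String>> metadataById = new HashMap<>();

    TinyRepository(Path binaryRoot) {
        this.binaryRoot = binaryRoot;
    }

    static TinyRepository inTempDirectory() {
        try {
            return new TinyRepository(Files.createTempDirectory("tiny-repo"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Content = a file + metadata: callers hand over both and get one id back.
    String create(byte[] content, Map<String, String> properties) {
        String id = UUID.randomUUID().toString();
        try {
            Files.write(binaryRoot.resolve(id + ".bin"), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        metadataById.put(id, new HashMap<>(properties));
        return id;
    }

    byte[] readContent(String id) {
        requireKnown(id);
        try {
            return Files.readAllBytes(binaryRoot.resolve(id + ".bin"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Map<String, String> readProperties(String id) {
        requireKnown(id);
        return Collections.unmodifiableMap(metadataById.get(id));
    }

    private void requireKnown(String id) {
        if (!metadataById.containsKey(id)) {
            throw new IllegalArgumentException("Unknown content id: " + id);
        }
    }

    public static void main(String[] args) {
        TinyRepository repo = TinyRepository.inTempDirectory();
        Map<String, String> props = new HashMap<>();
        props.put("name", "hello.txt");
        props.put("mimetype", "text/plain");
        String id = repo.create("hello".getBytes(StandardCharsets.UTF_8), props);
        System.out.println(repo.readProperties(id).get("name"));
        System.out.println(new String(repo.readContent(id), StandardCharsets.UTF_8));
    }
}
```

Callers only ever see an id, bytes, and properties; where each piece physically lives is a detail of the abstraction layer.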
  • Once files are figured out…
    Security framework
    Search
    Business Process/Workflow Engine
    Transforms/Extractions/Renditions
    Scheduled jobs
    WebDAV, CIFS, FTP or other authoring integrations
    Versioning
    Check-in/Check-out
    Remote API
    Replication
    Social features
    Mobile access
    Custom code to integrate all of these subsystems
  • “What have we done?”
    Source: gobucks2
  • Factors that affect DIY reasonableness
    Number and size of documents
    Number and concurrency of users
    Number and nature of integration points
    Business process volatility & complexity
    Time and cost of
    Integrating all of these services/sub-systems
    Maintaining all of that code…forever
  • The Platform Approach
  • Platform approach
    Much of this has already been solved
    Content Platform = Repository + Services
    Find a platform that meets your needs
    Extend the platform with your own business logic
    Write your own front-end using whatever language or framework makes sense
    Or, customize the UI that the platform provides
  • What makes a great content platform?
    Agility
    Applicable to a broad set of solutions
    Scale up, scale down
    Fast/Friendly Development Model
    Open Source
    Troubleshooting
    Bug tracking
    Community
    Standards compliance
    Lower switching costs
    Easier integration
  • Big picture
    Web Applications
    Knowledge Portals
    Web Services
    Business Process Engine
    App Server
    CRM
    Portal Server
    Virtual File System
    High Availability
    FTP
    CIFS
    WebDAV
  • and
  • What is CMIS?
    Content Management Interoperability Services
    Language-independent, vendor-neutral API for content management
    CRUD functions for nodes
    Check-in/check-out
    Associations
    Permissions (Access Control Lists)
    Policies
    Queries
    Repository traversal
  • The Beauty of
    Presentation Tier
    REST
    SOAP
    Content Services Tier
    ?
    ?
    Enterprise Apps Tier
  • CMIS
    • CMIS API via
    • REST / Atom
    • WebServices
    • Use cases
    • Repository to repository
    • Application to repository
    • Federated repositories
    • CMIS Alfresco extensions
  • About the CMIS Spec
    OASIS standard
    Alfresco, IBM, Microsoft, Oracle, FileNet support
    Alfresco was first to production with CMIS
    Two parts
    Interoperability through standard SOAP and Atom Pub bindings
    SQL-based query language for rich content repositories
    New JSON binding coming soon
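
To give a feel for the SQL-based query language mentioned above, here is a minimal sketch that only builds a query string; it does not talk to a repository. `CmisQuerySketch` and its methods are hypothetical helpers. The `cmis:document` type and the `cmis:objectId` and `cmis:name` properties come from the CMIS 1.0 base model, and the backslash escaping is my reading of the CMIS 1.0 query rules, so check it against the spec before relying on it.

```java
// Hypothetical helper that assembles a CMIS query statement as a plain string.
final class CmisQuerySketch {

    // Assumption: string literals escape backslash and single quote with a backslash.
    static String escape(String literal) {
        return literal.replace("\\", "\\\\").replace("'", "\\'");
    }

    // Finds documents whose name matches a LIKE pattern (% and _ are wildcards).
    static String documentsByNamePattern(String namePattern) {
        return "SELECT cmis:objectId, cmis:name FROM cmis:document"
                + " WHERE cmis:name LIKE '" + escape(namePattern) + "'";
    }

    public static void main(String[] args) {
        System.out.println(documentsByNamePattern("%report%"));
    }
}
```

In a real client the resulting statement would be handed to whichever CMIS binding or client library is in use, for example Apache Chemistry OpenCMIS.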
  • Implementations Already Available…
    Providers
    Consumers
    Developed by 30+ ECM Vendors
  • Open Source implementations of CMIS
    Apache Chemistry is the umbrella project for all CMIS related projects within the ASF
    OpenCMIS (Java, client and server)
    cmislib (Python, client)
    phpclient (PHP, client)
    DotCMIS (.NET, client)
  • Alfresco Overview
    Alfresco is an open source Enterprise Content Management platform
    Can manage any kind of file, any size
    Stores the file and metadata
    All content and metadata is searchable
    Files can be secured to specific users and groups
    CMIS-compliant
  • Alfresco Overview (Cont’d)
    Provides versioning and check-in/check-out
    Has a built-in workflow engine
    Can be accessed through a browser or from desktop applications via CIFS, WebDAV, FTP, IMAP, SMTP, SharePoint
    Three editions
    Community
    Team
    Enterprise
  • High-level Architecture
    Plus:
    • IMAP
    • SharePoint
  • High-level Custom Front-End
    Drupal
  • Repository Services
  • Repository Services
    Services allow the content items within the repository to be managed :
    • Content lifecycle
    • Creation, modification, deletion, …
    • Control over the objects
    • Permissions, locks
    • Content models
    • Properties, associations
    • Workflows
    • Search
    • Rules and Automatic Actions
    • etc …
  • Rules and actions
    Actions (ActionService) :
    • Trigger actions over content items
    • Secured and transactional
    • Scheduled or on-demand
    • Can be leveraged by workflows
    • Ex: Send an email, copy or move a content item, …
    Rules (RuleService) :
    • Similar to mail client filters
    • Program event-based automatic tasks and actions
    • Run one or several actions
    • Reusable rules
    • Sortable rules
    • Easy to configure, easy to activate
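
The rule idea on this slide (an event triggers a rule, which runs one or several actions) can be sketched as a small data structure. The sketch below is not the ActionService or RuleService API; every class, enum, and method name is hypothetical, and the "actions" only append to a log instead of converting files or sending mail. It models the Word-to-PDF example used on the Transformations slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of "rule = triggering event + condition + actions".
final class RuleSketch {
    // Illustrative event names, not Alfresco constants.
    enum Event { INBOUND, OUTBOUND, UPDATE }

    static final class Item {
        final String name;
        final String mimetype;
        // Records which actions ran, in order.
        final List<String> log = new ArrayList<>();

        Item(String name, String mimetype) {
            this.name = name;
            this.mimetype = mimetype;
        }
    }

    static final class Rule {
        private final Event trigger;
        private final Predicate<Item> condition;
        private final List<Consumer<Item>> actions;

        Rule(Event trigger, Predicate<Item> condition, List<Consumer<Item>> actions) {
            this.trigger = trigger;
            this.condition = condition;
            this.actions = actions;
        }

        // Runs every action, in order, when the event and the condition both match.
        void fire(Event event, Item item) {
            if (event == trigger && condition.test(item)) {
                for (Consumer<Item> action : actions) {
                    action.accept(item);
                }
            }
        }
    }

    // "When a MS Word document is uploaded, make a PDF copy and email admin."
    static Rule wordToPdfRule() {
        Predicate<Item> isWord = item -> "application/msword".equals(item.mimetype);
        Consumer<Item> copyAsPdf = item -> item.log.add("copy-as-pdf");
        Consumer<Item> emailAdmin = item -> item.log.add("email:admin");
        return new Rule(Event.INBOUND, isWord, Arrays.asList(copyAsPdf, emailAdmin));
    }

    public static void main(String[] args) {
        Item doc = new Item("plan.doc", "application/msword");
        wordToPdfRule().fire(Event.INBOUND, doc);
        System.out.println(doc.log);
    }
}
```

Keeping actions as separate, reusable units is what lets the same action be run by a rule, a schedule, or a workflow step.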
  • Transformations
    Transformations to different file formats
    Ex : Word => PDF, Word => Flash, …
    Automatic extraction of common file metadata
    Based on open source libraries :
    Apache Tika, POI, FOP, PDFBox, pdf2swf, …
    Can be leveraged by actions and content rules
    Ex : When a MS Word document is uploaded, make a PDF copy of it, and send it by email to “admin”
  • Workflows
    Full BPM capabilities with jBPM/Activiti
    Rich features :
    Parallel or serial workflows
    Joins, forks, conditions …
    Group or individual assignees
    Actions and complex behaviors
    Implement your custom lifecycle model through workflows
    Extensible--Build your own business processes
  • Security - Authentication
    Alfresco can handle it or pass it off to others
    ActiveDirectory
    LDAP
    Kerberos
    NTLM
    SSO
    Custom
    Source: rooreynolds
  • Security - Authorization
    Spring Security Framework (ACEGI) under the covers
    Users & Groups
    Access Control Lists
    Permissions
    Hierarchical
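
One way to read "hierarchical" here is that a permission granted on a folder applies to everything beneath it. The sketch below shows that reading only; it is not how Alfresco or Spring Security implement authorization. `AclSketch`, `Node`, the `GROUP_legal` authority, and the permission names are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of access control lists inherited down a folder tree.
final class AclSketch {
    static final class Node {
        final String name;
        final Node parent;
        // Local ACL: authority (user or group) -> permissions granted on this node.
        private final Map<String, Set<String>> acl = new HashMap<>();

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        void grant(String authority, String permission) {
            acl.computeIfAbsent(authority, key -> new HashSet<>()).add(permission);
        }

        // Checks this node first, then walks up the parents until a grant is found.
        boolean hasPermission(Set<String> authorities, String permission) {
            for (Node node = this; node != null; node = node.parent) {
                for (String authority : authorities) {
                    Set<String> granted = node.acl.get(authority);
                    if (granted != null && granted.contains(permission)) {
                        return true;
                    }
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Company Home", null);
        Node contracts = new Node("Contracts", root);
        Node nda = new Node("nda.pdf", contracts);
        contracts.grant("GROUP_legal", "Write");

        // A user is represented by their own name plus the groups they belong to.
        Set<String> bob = new HashSet<>(Arrays.asList("bob", "GROUP_legal"));
        System.out.println(nda.hasPermission(bob, "Write"));
        System.out.println(root.hasPermission(bob, "Write"));
    }
}
```

A real permission model also needs deny entries, inheritance switches, and permission groups, which this sketch leaves out on purpose.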
  • Other Alfresco services
    • Search
    • Checkin/Checkout
    • Locking
    • Versioning
    • Tags & categories
    • Authentication
    • LDAP sync
    • Groups & users
    • Data dictionary
    • Browsing
    • Lifecycle
    • Rating
    • Invitations
    • Sites
    • User quotas
    • Copy – move
    • Transfer/Replication
  • APIs
  • Java & JavaScript
    Alfresco’s “foundation” API is Java
    Server-side JavaScript is also an option
    Remote APIs
    Web Services SOAP
    HTTP REST Webscripts - Java or JavaScript
    CMIS - Atom REST or SOAP
    Source: 96dpi
  • Web Script Framework
    • Model-View-Controller pattern
    Declare a URL, bind it to logic, provide one or more views
    Controller implemented in JavaScript or Java
    Views implemented in FreeMarker
    Deployed to the repository or the classpath
    Part of the Spring Surf Project: http://springsurf.org/
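
The "declare a URL, bind it to logic, provide one or more views" pattern can be illustrated without the framework itself. The sketch below is plain Java and makes no claim about the real Web Script API, where the URL lives in a descriptor, the controller is JavaScript or Java, and the view is a FreeMarker template. `WebScriptSketch`, `declare`, `handle`, and `demo` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of URL -> controller -> view dispatch.
final class WebScriptSketch {
    private static final class Script {
        final Function<Map<String, String>, Map<String, Object>> controller;
        final Function<Map<String, Object>, String> view;

        Script(Function<Map<String, String>, Map<String, Object>> controller,
               Function<Map<String, Object>, String> view) {
            this.controller = controller;
            this.view = view;
        }
    }

    private final Map<String, Script> scriptsByUrl = new HashMap<>();

    // Declare a URL and bind it to controller logic plus a view.
    void declare(String url,
                 Function<Map<String, String>, Map<String, Object>> controller,
                 Function<Map<String, Object>, String> view) {
        scriptsByUrl.put(url, new Script(controller, view));
    }

    // The controller builds a model from the request arguments; the view renders it.
    String handle(String url, Map<String, String> args) {
        Script script = scriptsByUrl.get(url);
        if (script == null) {
            return "404";
        }
        return script.view.apply(script.controller.apply(args));
    }

    static WebScriptSketch demo() {
        WebScriptSketch registry = new WebScriptSketch();
        registry.declare("/hello",
                args -> {
                    Map<String, Object> model = new HashMap<>();
                    model.put("greeting", "Hello, " + args.getOrDefault("name", "world"));
                    return model;
                },
                model -> "<p>" + model.get("greeting") + "</p>");
        return registry;
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("name", "Jeff");
        System.out.println(demo().handle("/hello", request));
    }
}
```

Because the controller only produces a model, the same logic can be paired with several views, for example HTML and JSON.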
  • Summary
    Platform = Repository + Services
    CMIS is an important standard
    Alfresco is a great CMIS server
    Even if you don’t pick Alfresco, try to leverage CMIS
    Alfresco provides the repository plus services pre-integrated and ready for your custom content-centric apps
  • For More Information…
    Alfresco Community
    http://www.alfresco.org
    Alfresco Forums
    http://forums.alfresco.com
    Alfresco Wiki
    http://wiki.alfresco.com
    Alfresco Blogroll
    http://blogs.alfresco.com
    ECM Architect Blog
    http://ecmarchitect.com
  • Email: jpotts@alfresco.com
    Twitter: @jeffpotts01
    Blog: http://ecmarchitect.com
  • Extra/Unused Slides
  • Data Modeling
    Repository is a collection of nodes
    Everything is a node, nodes are typed
    Content Model is expressed in XML
    Cold-deploy most common, hot deploy possible
    Types, aspects, properties, associations, constraints
    Hierarchical
    Types inherit from super types
  • Types, Aspects, Properties, & Associations
    Content Types
    Type “report” (metadata : subject, abstract, …)
    Aspects
    Aspect “client” (metadata : name, reference, contact, …)
    Properties
    Property “customer id” [integer]
    Associations
    Association “related documents”
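
The relationship between types, aspects, and properties can be made concrete with a small in-memory model. In Alfresco the content model is expressed in XML; the Java below is only a conceptual sketch, and every class and field name in it is hypothetical. The "content" base type is an assumption, and the client aspect's "name" property is renamed `clientName` to avoid clashing with the base type's `name`. A node accepts a property if its type (or any super type) declares it, or if an aspect attached to the node declares it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of typed nodes with inherited properties and attachable aspects.
final class ContentModelSketch {
    static final class TypeDef {
        final String name;
        final TypeDef superType;
        final Set<String> properties;

        TypeDef(String name, TypeDef superType, String... properties) {
            this.name = name;
            this.superType = superType;
            this.properties = new LinkedHashSet<>(Arrays.asList(properties));
        }

        // Types inherit from super types: collect properties up the chain.
        Set<String> allProperties() {
            Set<String> all = new LinkedHashSet<>();
            for (TypeDef t = this; t != null; t = t.superType) {
                all.addAll(t.properties);
            }
            return all;
        }
    }

    static final class AspectDef {
        final String name;
        final Set<String> properties;

        AspectDef(String name, String... properties) {
            this.name = name;
            this.properties = new LinkedHashSet<>(Arrays.asList(properties));
        }
    }

    static final class Node {
        final TypeDef type;
        private final Set<AspectDef> aspects = new HashSet<>();
        private final Map<String, Object> values = new HashMap<>();

        Node(TypeDef type) {
            this.type = type;
        }

        void addAspect(AspectDef aspect) {
            aspects.add(aspect);
        }

        boolean allows(String property) {
            if (type.allProperties().contains(property)) {
                return true;
            }
            for (AspectDef aspect : aspects) {
                if (aspect.properties.contains(property)) {
                    return true;
                }
            }
            return false;
        }

        void set(String property, Object value) {
            if (!allows(property)) {
                throw new IllegalArgumentException("Property not in model: " + property);
            }
            values.put(property, value);
        }

        Object get(String property) {
            return values.get(property);
        }
    }

    // Assumed base type; "report" and "client" follow the slide's examples.
    static final TypeDef CONTENT = new TypeDef("content", null, "name");
    static final TypeDef REPORT = new TypeDef("report", CONTENT, "subject", "abstract");
    static final AspectDef CLIENT = new AspectDef("client", "clientName", "reference", "contact");

    public static void main(String[] args) {
        Node report = new Node(REPORT);
        report.set("name", "q3.pdf");          // inherited from the base type
        report.set("subject", "Q3 results");   // declared on "report"
        report.addAspect(CLIENT);              // cross-cutting metadata
        report.set("reference", "C-42");
        System.out.println(report.get("reference"));
    }
}
```

The same `CLIENT` aspect could be attached to a contract or an email node, which is why aspects suit metadata that cuts across types.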
  • Example
    Aspects Useful for Cross-Cutting
    Type = Report
    Type = Contract
    Type = Email
    Type = Case
    Type attributes
    Subject
    Abstract
    Type attributes
    Effectivity start date
    Effectivity end date
    Type attributes
    Subject
    Sender
    Recipients
    Type attributes
    Format
    Aspect = Client
    Aspect attributes
    Client name
    Client Id
    Contact
    -> Related docs
  • Spring Framework
    Alfresco repository services are built on top of the Spring framework :
    • Public Services through APIs
    • Services implemented through flexible components
    • XML driven configuration
    • Secured and transactional
    • Extensible
  • A content rule is defined on a space level :
    1/ a triggering event (inbound / outbound / update)
    2/ a set of filtering conditions (object name, mimetype, …)
    3/ a set of actions to run (move the content item, add an aspect, send a notification email, …)
    Rules help you create smart spaces
    Drafts
    Approved
    Published
    Example
    Rules and actions
  • Transformations and metadata extractions are used by Share web interface :
    PNG thumbnail
    Flash preview
    Metadata extraction
    Transformations