Lessons Learned Building Nuxeo EP. Stefane Fermigier, PhD, Nuxeo. Presented at ICSSEA 2010, Dec. 8, 2010.
Lessons learned Building Nuxeo EP - Component-based, open source ECM platform

Stefane Fermigier shares lessons learned over the last ten years of building Nuxeo EP. Presented at the ICSSEA 2010 software engineering conference.

    1. Lessons Learned Building Nuxeo EP
       Stefane Fermigier, PhD, Nuxeo
       Presented at ICSSEA 2010, Dec. 8, 2010
    2. History and Context
    3. Who we are
       • Company started in 2000
       • 2002-2005: Zope-based CPS project
       • 2005: First Eclipse RCP-based project
       • 2006-now: Full switch to Java (Java EE 5 and OSGi)
       • 2009-now: Business model migration from service company to OSS software vendor
    4. What is ECM?
       ECM, a concept that emerged in the early 2000s, represents the integrated enterprise-wide management of all forms of non-structured (and sometimes semi-structured) content, including their metadata, across their whole lifecycle, supported by appropriate technologies and administrative infrastructure.
    5. (Diagram: a five-step content cycle)
       • 1 Capture & Create
       • 2 Share & Collaborate
       • 3 Process & Review
       • 4 Publish & Archive
       • 5 Search & Find
    6. What are CEVAs?
       • Four-letter acronym (4LA) coined by Gartner in 2006: “Content Enabled Vertical Applications”
       • “CEVAs typically help to automate complex processes that previously required workers to manually sort through paper documents and other forms of content (in effect, a way to manage down costs of exception handling) and optimize the remainder of the work.”
    7. Business Goals
       • First, create an MVP (minimal viable product) to ensure company sustainability
       • Base it on a clean, extensible architecture
       • With the end goal of enabling the creation of a rich ecosystem of extensions and application profiles
    8. Nuxeo CPS
       • Content management and portal platform
       • Developed from 2002 to 2005
       • Built on top of the Zope and CMF (Content Management Framework) open source frameworks
       • Architecture: pluggable components (“Products”) and events
    9. Switch to Java: Why?
       • Technical reasons:
          • ZODB doesn’t scale well in terms of data volume
          • Dynamic languages don’t scale well in terms of managing complexity (> 100 KLOC)
       • Business reasons:
          • Java makes it much easier to work with mainstream systems integrators
    10. A Few Numbers
       • Nuxeo EP+DM is a 400 KLOC Java project
       • Comprises ~190 independent modules (JARs)
       • Developed over the last four and a half years by a core team of 20 developers and 50 community contributors
       • Has generated ~20 MEUR of revenue for Nuxeo, ~50 MEUR for partners
    11. Business Constraints and Requirements
    12. Business Vision
       • Address the full ECM scope
          • Initial focus on Document Management
          • Architecture must be extensible and modular
       • Enable and sustain the ecosystem
          • Easy to work with, designed for participation
    13. Business Vision
       • Low barrier to entry for:
          • End users (e.g. pleasant UI)
          • Developers (e.g. clean model and API, leverage existing knowledge)
          • Sysadmins / operations
       • “Enterprise-class” software
          • Tens of thousands of users, millions of documents
    14. Our Original Roadmap
       • Don’t reinvent the wheel
          • Leverage existing standards, work on new ones (e.g. JCR2, CMIS)
          • Build on proven open source libraries (JBoss, Apache, Sun, Eclipse)
       • Use a robust software engineering process
          • Make it transparent for our community
    15. Core ECM
       • Document type definition and management
       • Storage of the documents and associated metadata
       • Document life cycle and versioning
       • Access control
       • Indexing and a query language, which must enable complex queries on both full text and metadata
    16. Higher-Level ECM Services
       • Workflow
       • Transformation and rendering
       • User management
       • User interface
       • A rich set of HTTP-based APIs exposed to third-party developers and integrators (WS-* and REST)
    17. 2.0 and 3.0 (Ongoing)
       • Tagging and folksonomies
       • Lightweight collaboration (wikis) and publishing (blogs)
       • Social networking (“friending” or “following” colleagues or business partners, user timelines)
       • Collaborative filtering
       • Mobile and disconnected access
       • Semantic content categorization and named entity extraction
    18. Products and Applications
    19. Nuxeo ECM - Our Approach (layer diagram, top to bottom)
       • Applications: Correspondence Management, Contracts Management, Invoice Processing, Marketing Asset Management
       • Business Solutions: Construction, Media, Government, Life Sciences
       • Horizontal Packages: Document Management, Records Management, Content Aggregator, Digital Asset Management, Case Management Framework
       • Platform: Nuxeo Enterprise Platform. Complete set of components covering all aspects of ECM
       • Content Infrastructure: Nuxeo Core. Lightweight, scalable, embeddable content repository
    20. Document Management
    21. DAM
    22. Case Management
    23. Web Sites
    24. The Strongest Requirement
       • Applications (horizontal, vertical or custom) must be buildable just by assembling components (packaged as Java JARs)
       • Architecture must allow behavior modification at the repository level (e.g. new document type), at the UI level (e.g. new actions), and at the service level (e.g. adding new services) without recompilation
    25. Technical Challenges
    26. Standards Choice
       • Switch to Java was motivated by the desire to be more “standards-compliant”
       • But the problem with standards is that there are too many to choose from!
       • Old vs. new or emerging
       • Open standards vs. de facto standards
       • Overlapping standards (hardest issue!)
    27. Initial Standards
       • Java EE 5, as the general structuring framework for the server-based application (but not for the core services)
       • OSGi, as a packaging model for components
       • The JCR (Java Content Repository), as the model API to manage content and metadata at the most basic level
    28. Initial Standards
       • JSF as the presentation layer (part of Java EE 5)
       • JBoss Seam, a web presentation framework that extends JSF, because we felt it would provide a much improved developer experience over the “pure Java EE 5” model
    29. Notes
       • Java EE 5 was really new and still “wet” at the time
       • Seam was not a standard, but its concepts eventually merged into one (JCDI, i.e. Contexts and Dependency Injection, JSR-299)
       • In 2006 OSGi had credibility in the embedded and rich client spaces, not yet on the server
       • We dropped JCR support in 2010
    30. Open Source Libraries
       • The open source Java ecosystem started to grow in the late 90s (Apache) and had a huge boost in the early 00s (Eclipse, JBoss, OW2, etc.)
       • As with standards, there are usually many OSS implementations to choose from
       • FYI: Nuxeo EP now embeds more than 200 external open source libraries!
    31. Choosing an OSS Library
       • License compatibility with the LGPL (this excludes proprietary and GPL licenses)
       • Compliance with a chosen standard
       • Quality, as witnessed by visual inspection of the source code
       • Confidence in the development process (e.g. are there unit tests?) and the community behind the project
    32. Benefits and Challenges of Using OSS Libraries
       • With OSS, it’s easier to evaluate options
       • Forking a library is sometimes the only way to fix a bug or add missing functionality
       • But it comes at a tremendous price, because now you have to maintain your own branch
       • Becoming a contributor is also sometimes needed, but comes at a price too
       • Risk of “JAR hell” (conflicting library requirements)
    33. Lessons Learned
       • Allow users of our platform to extend it without touching its source code
       • Or, even better, without writing code at all!
       • Keep your options open, but don’t over-engineer flexibility
    34. Architectural Solutions
    35. Architectural Solutions
       • Layered architecture
       • High-level APIs
       • Component system
       • Extension points
       • Event bus
    36. Layer Cake: Nuxeo EP Architecture (top to bottom)
       • Nuxeo UI Frameworks: flexible choice of interfaces
       • Nuxeo ECM Services: modular set of content services
       • Nuxeo Core: advanced content repository
       • Nuxeo Runtime: component and service model
    37. APIs
    38. Everything Pluggable
    39. OSGi in Theory
       • OSGi is a component system developed initially for the embedded systems industry
       • Adopted by Eclipse for Eclipse 3.0 (2004)
       • Both a module system and a service platform (but we’re currently only using the former)
       • Modules, or “bundles”, are just JARs with a special MANIFEST
    40. OSGi in Theory
       • An OSGi “container” takes care of component activation
       • Bundles describe their own imports (dependencies) and exports (exposed API)
       • The container can also take care of provisioning
       • Class loader isolation can take care of “JAR hell”
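
To make the "JAR with a special MANIFEST" point concrete, here is a minimal sketch that reads the import and export headers of a bundle manifest using only the JDK's `java.util.jar.Manifest`. The header names (`Bundle-SymbolicName`, `Import-Package`, `Export-Package`) are standard OSGi headers; the bundle and package names under `org.example` are made up for illustration and are not taken from Nuxeo. The comma split is deliberately naive and assumes headers without attributes such as version ranges.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

class BundleHeadersSketch {

    // Hypothetical manifest: all org.example names are invented for this sketch.
    static final String SAMPLE_MANIFEST =
            "Manifest-Version: 1.0\r\n"
          + "Bundle-ManifestVersion: 2\r\n"
          + "Bundle-SymbolicName: org.example.ecm.audit\r\n"
          + "Bundle-Version: 1.0.0\r\n"
          + "Import-Package: org.example.ecm.core,org.example.ecm.events\r\n"
          + "Export-Package: org.example.ecm.audit.api\r\n"
          + "\r\n";

    /** Parses manifest text with the JDK's own manifest reader. */
    static Manifest parse(String text) {
        try {
            return new Manifest(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns the packages listed in a header such as Import-Package.
     * Naive split on commas: assumes no attributes (e.g. version ranges).
     */
    static List<String> packages(Manifest manifest, String header) {
        String value = manifest.getMainAttributes().getValue(header);
        List<String> result = new ArrayList<>();
        if (value != null) {
            for (String part : value.split(",")) {
                result.add(part.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Manifest manifest = parse(SAMPLE_MANIFEST);
        System.out.println("Bundle:  "
                + manifest.getMainAttributes().getValue("Bundle-SymbolicName"));
        System.out.println("Imports: " + packages(manifest, "Import-Package"));
        System.out.println("Exports: " + packages(manifest, "Export-Package"));
    }
}
```

A container uses this kind of metadata to decide which bundles can be wired together; the sketch only reads the headers and does no resolution.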
    41. OSGi at Nuxeo
       • We package our components as OSGi bundles
       • We have our own “OSGi-like” adapter for app servers (JBoss, Jetty, Tomcat, Glassfish)
       • Most of our components can also run on Eclipse Equinox (for RCP apps)
       • We have our own service registry, but it’s currently not based on OSGi
       • We don’t provide class loader isolation
    42. OSGi at Nuxeo
       • Goal is to be able to run everything on a “real” OSGi container in 2011
       • ... and to fully leverage the OSGi service stack at the same time
          • Including service registry, hot-reload, class isolation, etc.
       • Biggest conceptual issue: overlap with Java EE
    43. Plugins and Extension Points
       • Inspired by the Eclipse architecture
       • Eclipse = a core runtime engine + a set of plugins
       • Plugin: the smallest extensible unit to contribute additional functions to the system
       • Extension points: boundaries between plugins
       • A plugin (bundle) can contribute either configuration (pure XML contribution) or code (XML + Java)
    44. Plugins and Extension Points
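
The "core + extension points + plugins" idea can be sketched in a few lines of plain Java. This is an illustration of the pattern only, not the Nuxeo Runtime API: the `Registry` and `Plugin` types, the extension point names `doctype` and `action`, and the contributed values are all hypothetical. In the real platform a contribution would typically be declared in XML; here plugins contribute directly in code to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ExtensionPointSketch {

    /** Owned by the core: it knows extension point names, not the plugins. */
    static final class Registry {
        private final Map<String, List<String>> contributions = new LinkedHashMap<>();

        /** The core (or a plugin) declares a named extension point. */
        void declare(String point) {
            contributions.putIfAbsent(point, new ArrayList<>());
        }

        /** A plugin adds a contribution to an already declared point. */
        void contribute(String point, String contribution) {
            List<String> target = contributions.get(point);
            if (target == null) {
                throw new IllegalArgumentException("Unknown extension point: " + point);
            }
            target.add(contribution);
        }

        /** Returns a copy of the contributions registered for a point. */
        List<String> get(String point) {
            List<String> found = contributions.get(point);
            return found == null ? new ArrayList<>() : new ArrayList<>(found);
        }
    }

    /** A plugin only talks to the registry, never to other plugins. */
    interface Plugin {
        void contributeTo(Registry registry);
    }

    // Two hypothetical plugins, as they might ship in separate JARs.
    static Plugin invoicePlugin() {
        return registry -> {
            registry.contribute("doctype", "Invoice");
            registry.contribute("action", "approveInvoice");
        };
    }

    static Plugin contractPlugin() {
        return registry -> registry.contribute("doctype", "Contract");
    }

    /** Builds an application by assembling plugins, without touching the core. */
    static Registry assemble(List<Plugin> plugins) {
        Registry registry = new Registry();
        registry.declare("doctype");
        registry.declare("action");
        for (Plugin plugin : plugins) {
            plugin.contributeTo(registry);
        }
        return registry;
    }

    public static void main(String[] args) {
        Registry registry = assemble(List.of(invoicePlugin(), contractPlugin()));
        System.out.println("Document types: " + registry.get("doctype"));
        System.out.println("Actions:        " + registry.get("action"));
    }
}
```

Adding or removing a plugin from the list changes the assembled application while the registry code stays untouched, which is the property the "strongest requirement" asks for.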
    45. Note
       • This “core + extensions” pattern is very common in successful open source projects
          • Linux kernel + drivers (modules)
          • Firefox + plugins
          • Emacs + Emacs LISP macros
       • It’s a key to enabling an architecture of participation
    46. Event Bus
       • EventHandlers, aka listeners
          • Synchronous / PostCommit / Asynchronous
          • Easily contributed (Java / script / MDB)
       • Great solution for
          • Gluing together independent components
          • Enforcing business rules (synchronous inline)
          • Pushing/getting data to/from external systems
    47. Event Bus
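
The three listener modes named on the Event Bus slide can be sketched as follows. This is a toy model under stated assumptions, not the Nuxeo event service: the `Bus` class, its `fire`/`commit`/`rollback` methods and the event name are hypothetical, events are plain strings, and a "transaction" is reduced to a list of pending events. The assumed semantics are that synchronous listeners run inline when the event is fired, post-commit listeners run inline once the transaction commits, and asynchronous listeners are handed to a background thread after commit.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class EventBusSketch {

    enum Mode { SYNCHRONOUS, POST_COMMIT, ASYNCHRONOUS }

    /** Toy bus meant to be driven from a single thread. */
    static final class Bus {
        private final Map<Mode, List<Consumer<String>>> listeners = new EnumMap<>(Mode.class);
        private final List<String> pending = new ArrayList<>();
        private final ExecutorService executor = Executors.newSingleThreadExecutor();

        void register(Mode mode, Consumer<String> listener) {
            listeners.computeIfAbsent(mode, m -> new ArrayList<>()).add(listener);
        }

        private List<Consumer<String>> listenersFor(Mode mode) {
            return listeners.getOrDefault(mode, new ArrayList<>());
        }

        /** Synchronous listeners run inline; the event is also kept until commit. */
        void fire(String event) {
            for (Consumer<String> listener : listenersFor(Mode.SYNCHRONOUS)) {
                listener.accept(event);
            }
            pending.add(event);
        }

        /** On commit, post-commit listeners run inline and async ones in the background. */
        void commit() {
            for (String event : pending) {
                for (Consumer<String> listener : listenersFor(Mode.POST_COMMIT)) {
                    listener.accept(event);
                }
                for (Consumer<String> listener : listenersFor(Mode.ASYNCHRONOUS)) {
                    executor.submit(() -> listener.accept(event));
                }
            }
            pending.clear();
        }

        /** On rollback, pending events are dropped and never reach deferred listeners. */
        void rollback() {
            pending.clear();
        }

        /** Stops the background thread after queued asynchronous work has run. */
        void shutdown() {
            executor.shutdown();
            try {
                executor.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        Bus bus = new Bus();
        bus.register(Mode.SYNCHRONOUS, e -> System.out.println("validate rule for " + e));
        bus.register(Mode.POST_COMMIT, e -> System.out.println("update index for " + e));
        bus.register(Mode.ASYNCHRONOUS, e -> System.out.println("notify external system of " + e));
        bus.fire("documentCreated");
        bus.commit();
        bus.shutdown();
    }
}
```

Because a synchronous listener runs inline, it is the natural place to enforce a business rule, while the deferred modes suit work such as pushing data to external systems.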
    48. Process and Community Engagement
    49. Goals
       • Must enable the participation of third-party contributors (partners, community)
       • Must improve synchronization between custom developments and OSS projects
       • Agile development practices (XP, TDD) already used at Nuxeo since 2001 or so
       • Must complement them with simple, efficient and scalable project management practices
    50. Process: Scrum & Kanban
    51. Community Engagement: PRIM
    52. P = Portal
    53. R = Repository
    54. I = Issue Tracker
    55. M = Mailing List (+ foruM)
    56. “Every successful open source project I know uses PRIM. Every closed source project I know, doesn’t. People wonder how open source projects manage to create high-quality products without managers or accountability. The answer: we’re accountable to our infrastructure. PRIM is the open source secret sauce.”
       Ted Husted, http://jroller.com/TedHusted/entry/prim
    57. Development Tools
    58. Tools
       • Mercurial (distributed SCM)
       • Maven (dependency management, build, packaging, releasing)
       • Hudson (continuous integration)
       • Jira (bug / task tracking, Scrum iteration backlogs)
    59. TDD and CI
    60. More Tools
       • IDEs (Eclipse mostly)
       • Testing (JUnit, Selenium, WebDriver)
       • Static code analysis (FindBugs, IDEA inspections, Checkstyle, Enerjy)
       • Various profilers and debuggers
    61. Outstanding Issues
       • CI at our level is very resource-intensive (10-server farm)
       • It’s hard (read: impossible) to test the UI without a browser
       • On the other hand, “plain” Selenium tests are hard to maintain
       • Some pieces (Maven repository, Selenium testing) are fragile, and introduce heisenbugs
    62. Related Work
    63. Open Source, Java-Based
       • Some Java-based open source WCM or E2.0 platforms (XWiki, Jahia...) have developed ad hoc component systems similar to ours
       • Alfresco is an ECM solution with a static architecture based on Spring, which makes it harder to customize and extend
       • Apache Sling is a framework based on JCR and OSGi, but doesn’t come with complete solutions and seems more focussed on WCM
    64. Open Source, Scripting Languages Based
       • Drupal, Joomla, WordPress (PHP) and Plone (Python) are examples of extensible content management platforms based on scripting languages, with large ecosystems
       • Plugins usually rely on callback functions / methods instead of extension points and events
       • Plugins can break due to API changes and the lack of static verification of compatibility
    65. Proprietary
       • Proprietary software is, by definition, harder to study and tear apart
       • But the general view is that big-name vendors’ products (Documentum, Open Text, FileNet) are based on mixes of old technologies patched together after various acquisitions, which makes them harder to evolve and, for modern developers, harder to program against
    66. Perspectives
    67. Ongoing Work
       • Simplifying the developer experience: faster code/test turnaround, simpler web front end development using the more recent JAX-RS standard or Google Web Toolkit (GWT)
       • Development of business-oriented RESTful APIs to allow high-level interaction with the content, and eventually business application development by non-technical users (cf. Nuxeo Studio)
    68. Nuxeo Studio
    69. Ongoing Work
       • Replication, both in LAN contexts (for scalability and fault tolerance) and WAN contexts (for replication between remote data centers, or between a server and a desktop or mobile client)
       • Social integration (using OpenSocial)
       • Further work on semantic technologies
    70. Ongoing Work
       • Porting the platform to the Cloud, and exposing it as a PaaS service
       • Work on mobile ECM, with clients for platforms such as the iPhone/iPad, Android and Blackberry operating systems
       • Bringing Nuxeo EP to Java EE 6 and full OSGi compliance
    71. Conclusion
    72. Key Findings
       • The Nuxeo EP architecture fits both the OSS “architecture of participation” vision and our business model and goals
       • Main effort has now moved from the platform to its periphery (extensions, applications, development and operation tools), as enabled by the architecture
       • Still work to do on some key standards compliance aspects (OSGi, Java EE 6, CMIS...)
    73. More Information
       • www.nuxeo.com
       • www.nuxeo.org
       • blogs.nuxeo.com