Woonsan Ko
Nov. 15, 2018
(By courtesy of USFWS Mountain-Prairie, licensed by CC BY 2.0)
● slideshare.net/woonsan
● woonsanko.blogspot.com
“What is Apache Jackrabbit?
Is it what I need to know?”
Disclaimer:
Not supported yet
by BRC.
● (repository.xml)
<Repository>
  <!-- SNIP -->
  <Workspace name="${wsp.name}">
    <FileSystem class="...">...</FileSystem>
    <PersistenceManager class="...">...</PersistenceManager>
    <SearchIndex class="...">...</SearchIndex>
    <!-- SNIP -->
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="...">...</FileSystem>
    <PersistenceManager class="...">...</PersistenceManager>
    <!-- SNIP -->
  </Versioning>
  <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">...</DataStore>
  <!-- SNIP -->
</Repository>
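To make the layout above easier to see at a glance, here is a minimal sketch that lists which pluggable component is configured where. It parses a trimmed-down stand-in for repository.xml with the JDK's DOM parser; every class name in the sample other than DbDataStore is a hypothetical placeholder, and a real repository.xml also carries a DOCTYPE and more elements that this sketch ignores.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class RepositoryXmlOutline {

    // Stand-in for repository.xml. The "example.*" class names are placeholders, not real classes.
    static final String SAMPLE =
        "<Repository>"
      + "<Workspace name=\"${wsp.name}\">"
      + "<FileSystem class=\"example.WorkspaceFileSystem\"/>"
      + "<PersistenceManager class=\"example.WorkspacePersistenceManager\"/>"
      + "<SearchIndex class=\"example.SearchIndex\"/>"
      + "</Workspace>"
      + "<Versioning rootPath=\"${rep.home}/version\">"
      + "<FileSystem class=\"example.VersionFileSystem\"/>"
      + "<PersistenceManager class=\"example.VersionPersistenceManager\"/>"
      + "</Versioning>"
      + "<DataStore class=\"org.apache.jackrabbit.core.data.db.DbDataStore\"/>"
      + "</Repository>";

    // Returns "path below <Repository>" -> "configured class", in document order.
    static Map<String, String> outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            collect(doc.getDocumentElement(), "", result);
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the configuration", e);
        }
    }

    // Walks child elements recursively and records every element that has a class attribute.
    private static void collect(Element parent, String prefix, Map<String, String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (!(node instanceof Element)) {
                continue; // skip text and comment nodes
            }
            Element element = (Element) node;
            String path = prefix.isEmpty()
                ? element.getTagName()
                : prefix + "/" + element.getTagName();
            if (element.hasAttribute("class")) {
                out.put(path, element.getAttribute("class"));
            }
            collect(element, path, out);
        }
    }

    public static void main(String[] args) {
        outline(SAMPLE).forEach((path, cls) -> System.out.println(path + " -> " + cls));
    }
}
```

The point of the outline is that the workspace, the version storage, and the DataStore are configured independently, so each can be moved to a different backend without touching the others.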
“Can’t we store huge amount of binary data in JCR?” *
* https://woonsanko.blogspot.com/2016/08/cant-we-store-huge-amount-of-binary.html
● When using DbDataStore (the default option)
> 85% : binaries
(DATASTORE table)
* More at https://woonsanko.blogspot.com/2018/11/apache-jackrabbit-database-usage.html
Apache Jackrabbit DataStore supports:
● PersistenceManager
● DataStore
●
<< DbDataStore >> << VFS DataStores >>
AbstractDataStore
DbDataStore CachingDataStore
S3DataStore VFSDataStore
LocalCache
E.g., ./datastore/
● (repository.xml)
<Repository>
  <!-- SNIP -->
  <DataStore class="org.apache.jackrabbit.aws.ext.ds.S3DataStore">
    <param name="config" value="${catalina.base}/conf/aws-s3-datastore.properties"/>
    <param name="cacheSize" value="68719476736"/> <!-- 64GB -->
    <param name="minRecordLength" value="1024"/>
    <!-- SNIP -->
    <param name="concurrentUploadsThreads" value="10"/>
    <param name="asyncUploadLimit" value="100"/>
    <param name="uploadRetries" value="3"/>
  </DataStore>
  <!-- SNIP -->
</Repository>
* Find example configs from https://github.com/woonsanko/hippo-davstore-demo/tree/master/conf/.
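The cacheSize value is given in bytes, which is easy to misread. As a quick check, the sketch below converts between binary units; the class and method names are only illustrative. It shows that 68719476736 bytes is 64 × 1024³, that is 64 GiB, whereas 64 MiB would be 67108864.

```java
final class CacheSizeBytes {

    // Converts GiB to bytes, failing loudly on overflow instead of wrapping around.
    static long gibToBytes(long gib) {
        return Math.multiplyExact(gib, 1024L * 1024L * 1024L);
    }

    // Formats a byte count with the largest binary unit that divides it exactly.
    static String humanReadable(long bytes) {
        String[] units = {"B", "KiB", "MiB", "GiB", "TiB"};
        int unit = 0;
        long value = bytes;
        while (unit < units.length - 1 && value >= 1024 && value % 1024 == 0) {
            value /= 1024;
            unit++;
        }
        return value + " " + units[unit];
    }

    public static void main(String[] args) {
        System.out.println(gibToBytes(64));               // 68719476736
        System.out.println(humanReadable(68719476736L));  // 64 GiB
        System.out.println(humanReadable(67108864L));     // 64 MiB
    }
}
```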
● (repository.xml)
<Repository>
  <!-- SNIP -->
  <DataStore class="org.apache.jackrabbit.vfs.ext.ds.VFSDataStore">
    <param name="config" value="${catalina.base}/conf/vfs2-datastore-sftp.properties" />
    <param name="cacheSize" value="68719476736"/> <!-- 64GB -->
    <param name="minRecordLength" value="1024"/>
    <!-- SNIP -->
    <param name="asyncWritePoolSize" value="10" />
  </DataStore>
  <!-- SNIP -->
</Repository>
* Find example configs from https://github.com/woonsanko/hippo-davstore-demo/tree/master/conf/.
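The minRecordLength parameter decides which binaries go to the DataStore at all: values smaller than the threshold stay inline with the node data, and only larger ones are externalized. The following is a simplified sketch of that decision, not Jackrabbit's actual code; the names are hypothetical, and the handling of a value exactly at the threshold is an assumption.

```java
final class MinRecordLengthSketch {

    // Returns true when a binary of the given length would be handed to the DataStore.
    // Assumption: a length exactly equal to minRecordLength counts as "large enough".
    static boolean storedInDataStore(long binaryLength, int minRecordLength) {
        return binaryLength >= minRecordLength;
    }

    public static void main(String[] args) {
        int minRecordLength = 1024; // the value used in the configuration above
        long[] samples = {200L, 1024L, 5L * 1024 * 1024};
        for (long length : samples) {
            String target = storedInDataStore(length, minRecordLength)
                ? "DataStore (external backend)"
                : "inline (persistence manager)";
            System.out.println(length + " bytes -> " + target);
        }
    }
}
```

With a remote backend such as SFTP or S3, a higher threshold keeps small binaries out of the network round trip, at the cost of storing them in the database.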
● Benefits:
○ Transparent
○ Almost unlimited storage for binaries
■ Amazon S3, SFTP Gateway for S3
■ SFTP file server, SFTP server with Google Cloud Platform backend
○ Cheaper storage
■ Amazon RDS vs. Amazon S3 buckets or SFTP
○ Faster backup, import, migration
■ Build new environment quickly from production data.
○ Save backup storage
■ Nightly DB backup files
○ Encryption at rest
■ Amazon S3 encryption, SFTP server with Linux file system encryption
“Can’t we keep unlimited amount of revision history?”
* https://woonsanko.blogspot.com/2018/11/externalizing-jcr-version-storage-with.html
●
[1] https://www.onehippo.org/library/administration/maintenance/cleaning-up-version-history.html
> 55% : versions
(VERSION_BUNDLE table)
● When using BundleDbPersistenceManager (the default option)
> 25% : binaries
(DATASTORE table)
* More at https://woonsanko.blogspot.com/2018/11/apache-jackrabbit-database-usage.html
● (repository.xml)
<Repository>
  <!-- SNIP -->
  <Versioning rootPath="${rep.home}/version">
    <FileSystem class="org.apache.jackrabbit.vfs.ext.fs.VFSFileSystem">
      <param name="config" value="${catalina.base}/conf/vfs2-filesystem-sftp.properties" />
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.BundleFsPersistenceManager">
    </PersistenceManager>
    <!-- SNIP -->
  </Versioning>
  <!-- SNIP -->
</Repository>
* Find example configs from https://github.com/woonsanko/hippo-davstore-demo/tree/master/conf/.
- The first project, FeedbackPlugin, started on 2008-01-11.
- boudekerk did the first SVN commit at 2008-01-15 14:20:15 CEST.
- arjecahn made the second commit on that day at 14:49 CEST.
- ...
- jeroenhoffman: “Moving to github is done!” at 2017-02-17 13:34:00 CEST.
- 32 active projects... welcoming your participation!
- ...
● “Don’t stay up. Just schedule it.”
○ The Forge Utilities project supports scheduling Groovy Updater scripts.
○ Check out https://bloomreach-forge.github.io/hippo-utilities/
● “FreeMarker, Handlebars or Thymeleaf? No problem!”
○ The Templating Support project lets you use any of those.
○ Check out https://bloomreach-forge.github.io/templating-support/
● “Content Packaging REST endpoints? Content-EXIM!”
○ It now supports built-in REST endpoints to export and import content in a flexible way.
○ Check out https://bloomreach-forge.github.io/content-export-import/builtin-rest-services.html
● “Web page flow control like Spring WebFlow in HST apps? Page-Flow!”
○ The Page-Flow project supports easy page flow control development, seamlessly integrated with
Channel Manager and Relevance.
○ Check out https://bloomreach-forge.github.io/page-flow/
● “Can we update webfiles on server at runtime?” or
“Can we use JCR API through JCR over HTTP?”
○ Jackrabbit provides SimpleWebdavServlet and JcrRemotingServlet.
○ Check out https://bloomreach-forge.github.io/hippo-jcr-over-webdav/.
○ Use cadaver or cyberduck for WebDAV access.
○ JCR-Shell (https://bloomreach-forge.github.io/jcr-shell/), a CLI tool for JCR,
using JCR over WebDAV (JCR Remoting).
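Under the hood, WebDAV clients such as cadaver or Cyberduck list a folder by sending a PROPFIND request. As a rough sketch of what that looks like, the example below builds such a request with the JDK's java.net.http API (Java 11 or later). The URL is a hypothetical placeholder; the actual servlet path depends on how the add-on is mounted in your web application.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class WebDavListing {

    // Minimal PROPFIND body asking the server for all properties of each resource.
    static final String PROPFIND_BODY =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      + "<D:propfind xmlns:D=\"DAV:\"><D:allprop/></D:propfind>";

    // Builds, but does not send, a PROPFIND request that lists a collection and its direct children.
    static HttpRequest propfind(String url) {
        return HttpRequest.newBuilder(URI.create(url))
            .method("PROPFIND", HttpRequest.BodyPublishers.ofString(PROPFIND_BODY))
            .header("Depth", "1") // 1 = the collection itself plus its immediate members
            .header("Content-Type", "application/xml; charset=utf-8")
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical URL: replace it with the WebDAV endpoint of your own installation.
        HttpRequest request = propfind("http://localhost:8080/cms/webdav/default/");
        System.out.println(request.method() + " " + request.uri());
        // To send it, pass the request to java.net.http.HttpClient#send together with
        // HttpResponse.BodyHandlers.ofString() and the credentials your server expects.
    }
}
```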
● https://github.com/woonsanko/hippo-davstore-demo
○ Follow README.md to build and run it on your laptop.
● Scenarios
○ VFS DataStore with SFTP backend
○ VFS FileSystem for Versioning
○ More if time allows...
https://bloomreach-forge.github.io/
https://github.com/bloomreach-forge
...
● jackrabbit.apache.org
● developers.bloomreach.com / www.onehippo.org
● bloomreach-forge.github.io/

Hidden gems in Apache Jackrabbit and BloomReach Forge
