Alfresco: What's New in Architecture and Scalability in Version 4

Presentation from the webinar "Alfresco, Architecture and Scalability", 14 November 2012.

It covers what is new in Alfresco 4 in terms of deployment capabilities, along with the most significant changes in architecture and scalability.

  • What we should say here: besides the big blocks (Solr and Activiti, plus the adjustments needed in Surf and Share), a lot of specific adjustments to the repository have a very positive impact on the platform and make the product much more mature. From the repository's point of view, we have not changed our architecture but have:
    - Supplemented it where that made sense: Social Services, Publishing, Caching Content Store, OpenCMIS, Solr, Peer Associations.
    - Adjusted it where necessary (which, as we have seen, works a little better): Canned Queries, Home Folder, Content Disk Driver.
    - Updated it: Share, Java-backed WebScripts and services. Activiti and Solr are left out here, as are the Share services. The Content Disk Driver has been worked out, especially in the Enterprise Edition.
    Migrating away from the AVM. AVM Deprecation Warning: The AVM is no longer being actively developed by Alfresco Engineering and Enterprise support subscriptions for the AVM are no longer being offered to new customers. New projects requiring Web Content Management features may want to consider leveraging a CMIS-based solution such as Web Quick Start instead.
    - When new content is published, the AVM creates an entire version of the web project before publishing it. This feature, while of some use, is expensive. Cetera should evaluate the usefulness of this feature and consider whether it is still necessary.
    - Transfer Services Deploy to File System: in 4.0, Transfer Services adds the capability of publishing content from the DM portion of the repository to a file system without pushing it to the AVM or using a custom script.
    - Leveraging Solr: 4.0 provides the ability to use Solr as an indexing server for a clustered environment. This can add some robustness to the indexing of the repository.
    Node Locator. The 4.0 release saw the introduction of the NodeLocatorService. The service provides a way to look up one node from another; its main use is from the Forms association control, allowing custom "startLocation" strategies to be plugged in.
  • As you might notice from the picture, the mapping of Alfresco stores (for example, Workspace, Archive) is very straightforward: each store is mapped onto a separate core in Solr and can be configured and tuned separately.
  • With Lucene-based indexing, the database and search indexes are kept in sync because indexing is part of the transaction.
  • Compare this with the transactional consistency in Alfresco 3. With Solr, the Alfresco repository can focus on performing core content management operations, delegating indexing duties to a pull-based index tracking job scheduled in the Solr tier. In other words, Alfresco 4 removes the requirement to have the database and indexes in perfect sync at any given time and relies on an index that is updated at a configurable interval (default: 15 s) by Solr itself.
  • For Solr-based queries, the ACL checks are moved from the repository tier to the index tier.
  • Alfresco 4 using the Solr index tier outperforms Alfresco using Lucene at every data point tested. Alfresco 4 with Solr also seems to cope much better with growth in the number of users, demonstrating much better scalability along the user dimension.
  • Most demand is for EMC Centera. Archive-based storage, not folder-based storage. Set the XAM:RetainUntil attribute (default value if not set). Define a new content store. An aspect is used to drive which content is archived, that is, moved to the content store. Can be used to extend XAM metadata.
  • Basic single-server scenarios with embedded Solr, Lucene and the no-index option.
  • In this scenario there are two nodes, each handling both repository services and indexing. This is a typical simple high-availability scenario.
  • Benchmarks showed that the client tier adds hardly any overhead, so the only reason to have a separate client tier is when you have a lot of repository tiers.
  • In this scenario there are two repository nodes and two indexing nodes. The repository nodes are also responsible for index tracking. In common scenarios, a separate Share client tier makes no sense because Share does not affect performance. There can be special scenarios where a separate Share client tier does make sense, for example with heavy customizations on the client tier. Likewise, for headless scenarios, having Share on the content tier makes no sense.
  • In this scenario there are two repository nodes and two indexing nodes. The indexing nodes have a local Alfresco for index tracking, which removes indexing load from the repository nodes. The repository nodes also query the Solr nodes through the load balancer, so the indexing tier can be scaled out separately from the repository tier. The indexing nodes also require access to the storage layer, because indexing is done in a pull fashion on the indexing nodes. You can put a separate load balancer between the repository nodes and the indexing nodes. In common scenarios, a separate Share client tier makes no sense because Share does not affect performance. There can be special scenarios where a separate Share client tier does make sense, for example with heavy customizations on the client tier. Likewise, for headless scenarios, having Share on the content tier makes no sense.
  • Sometimes it is beneficial to have separate, independent environments: for example, an enterprise-wide collaboration platform and a platform for the documentation department to create documentation using Componize DITA. This can be especially beneficial in large projects. Content can be replicated from one environment to the other. Similar setups can also be used for WCM scenarios.
  • A figure of 48% of operations being searches is not realistic for typical collaboration cases; 2.5% to 5% of operations is a more realistic percentage in these cases.
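
The note on eventual consistency above describes a pull model: the repository commits without touching the index, and a tracker on the Solr side periodically asks for the transactions it has not yet seen. A minimal sketch of that idea follows. The class and method names (`Repository`, `IndexTracker`, `poll`) are invented for illustration and are not Alfresco or Solr APIs; the real tracker runs as a scheduled job rather than being called by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository: commits are recorded in an ordered transaction log."""
    transactions: list = field(default_factory=list)

    def commit(self, node_id: str, text: str) -> int:
        # Committing does not touch the index at all.
        txn_id = len(self.transactions) + 1
        self.transactions.append((txn_id, node_id, text))
        return txn_id

    def transactions_after(self, txn_id: int) -> list:
        return [t for t in self.transactions if t[0] > txn_id]


@dataclass
class IndexTracker:
    """Toy index tier: pulls new transactions and indexes them."""
    repo: Repository
    last_txn_id: int = 0
    index: dict = field(default_factory=dict)

    def poll(self) -> int:
        # Would be called on a schedule (the notes mention a 15 s default).
        pending = self.repo.transactions_after(self.last_txn_id)
        for txn_id, node_id, text in pending:
            self.index[node_id] = text.lower()
            self.last_txn_id = txn_id
        return len(pending)

    def search(self, term: str) -> list:
        return sorted(n for n, text in self.index.items() if term.lower() in text)


repo = Repository()
tracker = IndexTracker(repo)
repo.commit("doc-1", "Quarterly report")
before = tracker.search("report")   # the index has not caught up yet
tracker.poll()
after = tracker.search("report")    # visible after the next tracking run
```

Between the commit and the next `poll`, the document exists in the repository but is not searchable, which is the trade-off the notes describe.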
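
The note about ACL checks moving from the repository tier to the index tier can be pictured in a similar way: if each indexed document carries the set of authorities allowed to read it, the index can filter results itself instead of handing candidates back to the repository. This is a toy model under that assumption; the field names and the `search` function are hypothetical and do not reflect how Alfresco actually stores ACLs in Solr.

```python
def search(index, term, authorities):
    """Return ids of documents matching `term` that the caller may read.

    `index` maps a document id to a dict with `text` and `readers`;
    `authorities` is the set of user and group names held by the caller.
    """
    hits = []
    for doc_id, doc in index.items():
        if term.lower() not in doc["text"].lower():
            continue
        # Permission filtering happens here, inside the index tier.
        if doc["readers"] & authorities:
            hits.append(doc_id)
    return sorted(hits)


index = {
    "doc-1": {"text": "Budget 2012", "readers": {"GROUP_FINANCE"}},
    "doc-2": {"text": "Budget summary", "readers": {"GROUP_EVERYONE"}},
}
visible = search(index, "budget", {"alice", "GROUP_EVERYONE"})
```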

Presentation Transcript

  • Architecture and scalability. José Carrasco, Senior Solution Engineer. Barcelona, 14 Nov 2012
  • What is in this webinar? 10%: overview of what is new in the platform. 50%: scalability improvements. 40%: the 1001 ways to scale Alfresco
  • ARCHITECTURAL NEWS. 10%: overview of what is new in the platform
  • The platform. APIs: CMIS, SOA web services, RESTful WebScripts, Bulk loading API. Protocols: WebDAV, FTP, CIFS, SharePoint. Languages: Java, .NET, PHP, Python, C++
  • What is new in Alfresco 4? Diagram labels: New OpenCMIS, Android app, CIFS Driver 2, WebScripts, Server Foundation, Share Services, Social Services, Publishing Services, NodeLocatorService, Home Folder 2, Peer Associations, Search Canned Queries, Index Control, Encrypted Properties, Canned Queries, Caching Content Store, ContentStore, Database
  • SCALABILITY IMPROVEMENTS. 50%: scalability improvements
  • Performance improvements
    • 10x faster queries on the User Dashboard
    • 4x faster content upload
    • 25% faster loading of the document library
    • 50% faster loading of document details
    • Significant improvements in searching and indexing
  • Architecture improvements
    • Indexing subsystem
    • Hybrid cloud
    • Transformation server
    • Clustered file systems
    • FSTR
  • Indexing subsystem
  • Indexing subsystem
    • Indexing now lives in a separate subsystem
    • Alfresco offers an optional, independent indexing tier based on Apache Solr
    • The repository now operates independently of the search services
  • The big change in 4.0
    1. Alfresco Repository → SCALABLE: alfresco.war = alfresco/
    2. Alfresco Share → SCALABLE: share.war = share/
    3. Third party applications → SCALABLE: OpenOffice, convert, pdf2swf
    4. Database → SCALABLE: JDBC supported database
    5. Content Store → SCALABLE: alf_data/contentstore and alf_data/contentstore.deleted
    6. Indexes → NOW SCALABLE in v4.0: alf_data/lucene-indexes and alf_data/backup-lucene-indexes
  • The 3 subsystem options
    • lucene: Lucene libraries embedded in the repository
    • solr: enables the Solr integration
    • noindex: no search engine is enabled
  • Push system
  • Lucene
  • Eventual consistency (Solr)
  • ACL checks
  • Advantages
    • Repository and indexing loads are distributed across separate tiers
    Diagram labels: Repository Tier (Share, Alfresco, Tomcat); Index Tier (Solr, Tomcat)
  • Advantages
    • Better horizontal and vertical scalability of the solution
    Diagram labels: Load Balancer, Share; Repository Tier (Alfresco, Tomcat); Index Tier (Solr)
  • Solr vs. Lucene
    • Solr outperforms Lucene in every scenario
    • Solr performance scales well as the number of users grows
  • Hybrid deployment
  • Alfresco Cloud
    • Hosted service
    • Multitenant
      - Private network
      - Private invitation
    • Free 10GB storage
    • Premium accounts
      - Storage
      - Admin features
    • Synchronization with on-premise
  • New ECM paradigm. Diagram labels: Silo, Central repository, Hybrid deployment. From the SILO to the CLOUD™
  • Hybrid cloud. Diagram labels: Consultant, Customer, EU Division, US Prof Services, Offshore Development, Sync, Alfresco in the cloud. This is Cloud Connected Content
  • Alfresco Enterprise Sync. Diagram labels: Alfresco Enterprise On-Premise, Alfresco in the cloud, steps 1 to 4
  • The Alfresco API & SDK www.alfresco.com/develop
  • Transformation server
  • Transformations?
    • Convert from one format to another
    • Used for previews
    • Examples:
      - Thumbnails
      - Previews
    • Triggered by rules
  • How is it done?
    • Uses a set of tools:
      - OpenOffice for office documents
      - ImageMagick for images
      - SWF Tools for Flash
    • Can be extended (it is a framework)
    • Can be chained (composition)
    • Existing transformations?
      - http://localhost:8081/alfresco/s/mimetypes
  • Transformation Server
    • Pixel-perfect transformations
    • High fidelity in complex Office transformations
    • Transformation Tier
    • Advanced handling of transformation errors
    • Around 2 to 3 times faster when transforming large (+1Mb) Office documents
    • On the roadmap: advanced video conversions
  • Software requirements
    • Microsoft Windows 2008 Server R2 SP1 x64 with the latest patches (English)
    • Microsoft Office 2010 SP1 x86 (English)
    • See http://support.alfresco.com for the latest stack
  • Clustered File Systems
  • Clustered File Systems (4.1.2)
    • For using CIFS, FTP or NFS in a clustered environment
    • Supported through the Hazelcast libraries
  • FSTR
  • FSTR: FILE SYSTEM TRANSFER RECEIVER
    • FSTR has been rewritten for version 4.0
    • FSTR now uses the Transfer Services
    • Included as part of the DM (instead of the AVM)
    • Configurable via Share
    • Lets you publish content from the DM to any file system without using a custom script or going through the AVM
    • FSTR runs in its own Tomcat instance
  • Caching Content Store
  • Caching Content Store
    • Wraps a given store to improve performance
    • A wrapper intended for slow store implementations
  • Improving what already exists
  • Storage policies
    • AKA Information Lifecycle Management (ILM)
    • Dynamic storage driven by business / lifecycle policies
    • Reduces cost without reducing the required performance
      - Backup policies
      - Security
      - Cost per document
    Diagram labels: policy rules route content to SSD ($$$), FC drives ($$) or SATA drives ($)
  • XAM Content Connector
    • A good fit for content that will not change
    • Support for XAM-compatible storage
    • Designed to work with solutions from EMC, HP, IBM, Hitachi, Sun, etc.
    • Alfresco Enterprise only
  • DEPLOYMENT MODELS. 40%: the 1001 ways to scale Alfresco
  • 1. Defining the USE case. Diagram labels: Scanning Solutions, Corporate Systems, Liferay, Drupal, Jive, SAP, PeopleSoft, Content Distribution, Web sites, Archive, Content Deployment, Records Management, Share, Project, Intranet, Team, Knowledge Repository, Department, Collaboration
  • 2. Analyzing the load
    • Concurrent users
    • Repository size
    • Document ingestion rate
    • Write / read ratio
    • Search operations
    • Users and groups
    • Protocols
    • Batch operations
    • Customizations
  • Single server. Diagram: three variants, each with Share and Alfresco on Tomcat over a storage layer: embedded Solr, Lucene, no index
  • Active-active. Diagram: load balancer; two Tomcat nodes, each with Share, Alfresco and Solr; storage layer (database cluster, SAN failover)
  • Scaling Share. Diagram: load balancer; client tier (two Tomcat nodes with Share); load balancer; repository tier (two Tomcat nodes with Alfresco and Solr); storage layer
  • Indexing tier
    • Without an Alfresco dedicated to tracking
    Diagram: load balancer; repository tier (two Tomcat nodes with Share and Alfresco); index tier (two Tomcat nodes with Solr); storage layer
  • Indexing tier
    • With an Alfresco dedicated to tracking
    Diagram: load balancer; repository tier (two Tomcat nodes with Share and Alfresco); load balancer; index tier (two Tomcat nodes with Solr and Alfresco); storage layer
  • Advantages of a dedicated Alfresco
    • Takes on the index tracking load
    • Generates less network traffic to the production Alfresco instances
    • Improves overall index tracking performance
  • DISadvantages of a dedicated Alfresco
    • The Alfresco dedicated to index tracking uses resources on the Solr server, which can affect response times
    • In those scenarios it may be necessary to move that Alfresco to another machine
  • Transformation tier. Diagram: load balancer; repository tier (two Tomcat nodes with Share and Alfresco); load balancer; index tier (two Tomcat nodes with Solr and Alfresco); transformation tier (Transformation Server on Tomcat); storage layer
  • With an ingestion server. Diagram: load balancer; repository tier (two Tomcat nodes with Share and Alfresco, plus an Alfresco Tomcat node receiving CMIS traffic); load balancer; index tier (two Tomcat nodes with Solr and Alfresco); storage layer
  • Repository only. Diagram: CMIS; load balancer; repository tier (two Tomcat nodes with Alfresco); load balancer; index tier (two Tomcat nodes with Solr and Alfresco); storage layer
  • Functional separation. Diagram labels: Load Balancer, Share, Alfresco, Solr, Tomcat, Replication Job, Storage Layer; two environments: Enterprise Collaboration and Documentation Department
  • Web Content Services. Diagram labels: Load Balancer, Drupal, CMIS, Share, Alfresco, Solr, Tomcat, Replication Job, Storage Layer
  • Hybrid deployment. Diagram labels: Load Balancer, Share, Alfresco, Solr, Tomcat, Storage Layer, SYNC, Cloud, Enterprise Collaboration
  • VERSION 4.1.1 BENCHMARKS. 3-4x faster in most operations (compared with 3.4)
  • Results
    • Two Alfresco nodes with 3 CPUs each and around 12 GB of heap, together with 2 Solr nodes, can support up to 1080 concurrent users in a collaboration scenario reaching 10 million content items without any performance degradation
    • Share is a thin client that barely affects performance
    • The repository is no longer the bottleneck
    • In a scenario with 48% searches, Solr is a critical tier
    • An Alfresco dedicated to index trackers is beneficial in a wide range of scenarios
  • Benchmark Server Architecture. Diagram labels: Client (Configuration, Reporting); ZooKeeper (Server configuration); MongoDB (Test Definitions, Test run definitions, Test Run Event Queues, Test Run Results, Data Mirror Collections); Benchmark Server 1 to Benchmark Server N (Thread Pool, Common Libraries, e.g. WebDriver); Test Target
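
The transformation slides above say that transformations can be chained (composition). One way to picture this is a breadth-first search over single-step transformers to find a route from a source MIME type to a target one. The registry contents and the `find_chain` helper below are assumptions made for illustration; they are not the API of the Alfresco transformation framework.

```python
from collections import deque

# Hypothetical single-step transformers: source MIME type -> reachable targets.
TRANSFORMERS = {
    "application/msword": ["application/pdf"],
    "application/pdf": ["application/x-shockwave-flash", "image/png"],
    "image/png": ["image/jpeg"],
}


def find_chain(source, target):
    """Return the shortest list of MIME types from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in TRANSFORMERS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


chain = find_chain("application/msword", "application/x-shockwave-flash")
```

Here a Word document reaches Flash by way of PDF, which mirrors the idea of composing an office-to-PDF step with a PDF-to-Flash step for previews.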
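
The Caching Content Store slide above describes a wrapper around a slow store. The pattern can be sketched as a read-through cache: a read is served from a local copy when one exists and fetched from the backing store otherwise. `SlowStore` and `CachingStore` are made-up names; the real component is configured in Alfresco rather than written by hand, and cache eviction is not modelled here.

```python
class SlowStore:
    """Stand-in for a slow backing store; counts how often it is read."""

    def __init__(self, blobs):
        self._blobs = dict(blobs)
        self.reads = 0

    def read(self, url):
        self.reads += 1
        return self._blobs[url]


class CachingStore:
    """Read-through wrapper: only the first read of a URL hits the backing store."""

    def __init__(self, backing):
        self._backing = backing
        self._cache = {}

    def read(self, url):
        if url not in self._cache:
            self._cache[url] = self._backing.read(url)
        return self._cache[url]


slow = SlowStore({"store://2012/11/14/a.bin": b"content"})
store = CachingStore(slow)
first = store.read("store://2012/11/14/a.bin")
second = store.read("store://2012/11/14/a.bin")
```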
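
The storage policy (ILM) slide above routes content to SSD, FC or SATA storage according to business or lifecycle rules. A minimal sketch of one such rule follows, assuming the decision depends only on how recently a document was accessed. The thresholds (30 and 365 days) and the `choose_tier` function are illustrative assumptions, not values from the presentation.

```python
from datetime import date


def choose_tier(last_accessed, today):
    """Pick a storage tier from the number of days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= 30:       # assumed threshold for frequently used content
        return "SSD"
    if age_days <= 365:      # assumed threshold for occasionally used content
        return "FC"
    return "SATA"


today = date(2012, 11, 14)
hot = choose_tier(date(2012, 11, 1), today)
warm = choose_tier(date(2012, 3, 1), today)
cold = choose_tier(date(2010, 1, 1), today)
```

A real policy would likely combine several inputs, such as the backup, security and cost-per-document criteria listed on the slide.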