John Newton, founder and CTO of Alfresco, describes how Amazon Aurora enables the Alfresco Content Management System to store, manage, and retrieve billions of documents and related information with fast and linear scalability. Using new techniques of information modeling, indexing, and processing with the recently launched Aurora database, Alfresco can support cloud-based workloads previously not possible for high-throughput insurance, banking, and case-based applications. This session addresses the challenges of scaling document repositories to this level; architectural approaches for coordinating data; search and storage technologies such as Aurora, Solr, Amazon EBS, and Amazon S3; the breadth of use cases that modern content systems need to support; and how to support user applications that require subsecond response times. The result is a solution that once would have required large data centers to support but can now be handled cost-effectively with AWS and Aurora.
2. What to expect from the session
• Challenges of scaling to billions of documents
• Architectural approaches to managing data, search, and storage with Amazon Aurora, Solr, Amazon EBS, and Amazon S3
• The breadth of use cases for content at scale
• How to support user applications that require sub-second response times
• Moving from large data centers to cost-effective management with AWS and Amazon Aurora
8. Someone is trying to store…
One Billion Documents!!!
http://www.warnerbros.com/austin-powers-international-man-mystery
9. Some have tried before … and failed!
We’ll configure 1 Million SharePoint Servers!!!
10. Digital transformation is driving huge flows of content
Gartner Nexus
PWC 6th Annual Digital IQ Survey, 2014
[Figure labels: Digital Business; Cloud, Social, Big Data, Mobile]
11. Content use cases at scale
• Enterprise Document Library
• Loans & Policies
• Claims & Case Processing
• Transaction & Logistics Records
• Research & Analysis
• Real-time Video
• Internet of Things
• Medical & Personnel Records
• Government Records & Archives
• Discovery & Litigation
14. Content vs. data vs. files vs. EFSS
[Figure columns: Data, Files, EFSS, Content and ECM]
15. Content architecture as a big data problem
[Figure: a Content Object and its Context]
• Labels: Files / Renditions, Metadata, Directory, Categories, Relationships, Indexes, Search, Activities, Security, People, APIs, Processes / Tasks, Rules, Semantics, Types
• Lifecycle: Access, Create – Manage – Distribute – Use
• Stores: Database, Distributed FS, Database, Solr / Elasticsearch
16. Content at scale in the enterprise
• Users at Scale
• Concurrency
• Content Count
• Read/Write Throughput
• Geographic Distribution
• Volume Size
17. The problem with traditional approaches
• Provisioning and Administration
• Geographic Distribution
• Lack of Agility
• Lack of Redundancy
• Lack of Elasticity
21. Content management architecture
[Figure: component diagram. Labels as shown:]
• Alfresco Share, Alfresco Repository, Alfresco SOLR, Activiti Workflow
• Protocols, APIs (CMIS), Media Mgmt
• Desktop and Mail Client, Mobile App, Cloud Sync
• Database, FS Content Store, Indexes
• Records Management, Reports & Analytics, Reports and Analytics Server, Media Transform Services
• Transforms, Authentication, Auditing, Rules/Policies, Web Scripts, Scheduled Jobs
22. Process management architecture
[Figure: component diagram. Labels as shown:]
• Stores: Database, Elasticsearch, Files, Amazon EBS
• Process Mining, Activiti Engine, Tomcat / Jetty, Process Virtual Machine
• Tasks, Processes, Jobs
• Activiti REST, App REST, Admin REST, MS Office Protocol
• Activiti Analysis (AngularJS)
• Activiti App (AngularJS)
• Activiti Admin (AngularJS)
• MS Office
• Activiti Mobile (iOS / Android)
• Activiti Designer (Eclipse)
23. Scaling in tiers
[Figure: each tier is drawn as two instances]
• Alfresco Transformation Server (x 2)
• Alfresco Solr + Alfresco Local Repo (Index Tracking) (x 2)
• Alfresco Repository (x 2)
• Alfresco Share (x 2)
• Alfresco Activiti Suite (x 2)
24. Multi-tenant Cloud Service on AWS
[Figure: deployment diagram. Labels as shown: Route53 (DNS), ELB, Layer7, haproxy, varnish, /share, /alfresco, web nodes, alfresco nodes, solr, trans, RDS, S3, Activities]
33. Do you suppose we can put it together with some string and Scotch Tape?!!
34. Provisioning VLDB repos
new architecture allows room to scale the environment to support 2013 and 2014 roadmap plans while still supporting an environment that will be reliable and robust.
Additionally, this environment supports disaster recovery capabilities, guaranteeing that in case of a severe outage, backups are stored and the environment can be restored with a quick turnaround.
Below is a screenshot of the Customer Deployment Portal in which Stanford will be able to scale the Alfresco environment seamlessly within a web-based UI.
Flexibility
This new architecture will also utilize new Cloud Ops tools that will allow increased flexibility in the administration of the Alfresco environment. This gives Stanford the flexibility to grow or shrink the different environments based on demand, pricing, or performance. While the need for flexibility of the environment might be minimal in production, this will be especially advantageous as Stanford develops on the Alfresco service and needs to rapidly spin up and down test/development environments.
Self-Service
The Customer Deployment Portal will be one of the benefits of moving to the proposed environment. The Customer Deployment Portal is a web-based administration tool that allows Stanford to self-service their environment. Stanford will be able to set up, deploy, change, and monitor the different AWS environments through a user-friendly and intuitive web interface. This includes control over the number of virtual machines, size of the virtual machines, load balancers, databases, storage sizes and types, and more.
[Figure labels: Containers, DevOps, Data as a Service, Indexing and Search as a Service, Files as a Service, Rolling Deployment, Security]
• Nginx
• HA Proxy
• Varnish
• Alfresco Share
• Alfresco Repo
• Alfresco Analytics
• Alfresco Media
• Activiti
• Solr
• ActiveMQ
• Transform
• Database
• Storage
• LDAP
• Email Server
• Logs
• Monitoring
35. Large-scale benchmarking
BM01 User scenarios
BM02 User concurrency on single node
BM03 Solr Performance
BM04 Concurrent Load and Access – multi-user
BM05 User Invite and Tenant Provisioning
BM06 Workflow service performance
BM07 Workflow API performance
BM08 High concurrency in Multi-Tenancy
https://wiki.alfresco.com/wiki/Benchmark_Testing_with_Alfresco
https://github.com/AlfrescoBenchmark/alfresco-benchmark
[Figure: benchmark framework. Labels as shown:]
• Benchmark Server: Tomcat 7, Rest API, UI, Services
• MongoDB: Config Data
• MongoDB: Test Data
• Benchmark Driver (xN): Tomcat 7, Extras (Selenium), Test Services, Rest API
• Load Balancer
• Servers / APIs
36. BM4 test execution environment: 1.2B docs
[Figure: test execution environment]
• UI Test x 20 m3.2xlarge: Simulate 500 Users (Selenium / Firefox, 1 hour constant load, 10 sec think time)
• ELB
• Alfresco x 10 c3.2xlarge: Alfresco with Share and Repo
• Solr x 20 c3.8xlarge: Sharded Solr Cloud
• Aurora x 1 db.r3.4xlarge
• Simulate AWS Import/Export (in place)

sites    folders      files            transactions   dbSize GB
10,804   1,168,206    1,168,206,000    15,475,064     3,185
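A rough back-of-envelope reading of the data set and load model above, as a sketch only. The counts, node numbers, user count, and think time come from the slide. The even spread of files across Solr nodes, decimal gigabytes, the 1 s response time, and all function and constant names are assumptions made here for illustration.

```python
# Figures taken from the BM4 slide.
SITES = 10_804
FOLDERS = 1_168_206
FILES = 1_168_206_000
DB_SIZE_GB = 3_185
SOLR_NODES = 20
USERS = 500
THINK_TIME_S = 10.0


def files_per_folder() -> float:
    """Average number of files in each folder."""
    return FILES / FOLDERS


def folders_per_site() -> float:
    """Average number of folders in each site."""
    return FOLDERS / SITES


def files_per_solr_node() -> float:
    """Files indexed per Solr node, ASSUMING an even spread across nodes."""
    return FILES / SOLR_NODES


def db_bytes_per_file() -> float:
    """Database footprint per file, ASSUMING 1 GB = 10**9 bytes.

    This divides the whole database size by the file count, so it also
    absorbs folder rows, transactions, and index overhead.
    """
    return DB_SIZE_GB * 1e9 / FILES


def actions_per_second(users: int, think_s: float, response_s: float) -> float:
    """Steady-state action rate of a closed-loop test.

    Each simulated user repeats: act, wait for the response, then think.
    """
    return users / (think_s + response_s)


if __name__ == "__main__":
    print(f"files per folder:    {files_per_folder():,.0f}")
    print(f"folders per site:    {folders_per_site():,.1f}")
    print(f"files per Solr node: {files_per_solr_node():,.0f}")
    print(f"DB bytes per file:   {db_bytes_per_file():,.0f}")
    # The 1.0 s response time is an assumed value, not a measurement.
    print(f"user actions/sec:    {actions_per_second(USERS, THINK_TIME_S, 1.0):.1f}")
```

Under these assumptions the data set works out to a uniform 1,000 files per folder, and 500 users with a 10 sec think time generate user actions in the tens per second rather than the hundreds.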
37. Benchmark results
Bulk Operations
• Document load rate 1200 per sec
  • 4.3 Million per Hour on 10 nodes!
  • Load rate consistent beyond 1B
• CPU loads:
  • Database: 70-80% in Bulk Load
  • Alfresco: ~50%
  • Solr: <60%
• CMIS API Calls (OASIS Standard)
  • Aurora indexes efficient at 3.2TB
  • NAME (=, LIKE) ~ 20ms
  • IN_FOLDER (sorted, limited) ~ 160ms

User Operations
• Sub-second login times and good, linear responses for other actions
  • Open Library: ~3s
  • Page Results: <1s
  • Navigate to Site: ~1s
  • Individual search ~1s
  • 500 concurrent search: ~3s response
• CPU loads:
  • Database: <10%
  • Alfresco: 25-30%
  • Solr: 25-30%

No size-related bottlenecks with 1.2 billion documents
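The load-rate figures above can be cross-checked with simple arithmetic, and the two timed CMIS calls can be pictured as query statements. This is a sketch, not the benchmark's own code: the helper names are hypothetical, the statements are one plausible CMIS query-language form of "NAME (=, LIKE)" and "IN_FOLDER (sorted)", and "limited" is assumed to mean paging via the query service's maxItems parameter rather than anything in the statement text.

```python
# Figures quoted on the slides.
LOAD_RATE_PER_SEC = 1200
ALFRESCO_NODES = 10
TOTAL_FILES = 1_168_206_000  # from the BM4 data set


def docs_per_hour(rate_per_sec: float) -> float:
    """Convert a per-second load rate to documents per hour."""
    return rate_per_sec * 3600


def days_to_load(total_docs: int, rate_per_sec: float) -> float:
    """Days needed to load total_docs at a constant rate."""
    return total_docs / rate_per_sec / 86_400


def name_query(name: str, like: bool = False) -> str:
    """Build a CMIS statement matching documents by cmis:name.

    With like=True the caller supplies the pattern, e.g. 'invoice%'.
    Quote escaping is deliberately left out of this sketch.
    """
    if "'" in name:
        raise ValueError("quote escaping is not handled in this sketch")
    operator = "LIKE" if like else "="
    return f"SELECT * FROM cmis:document WHERE cmis:name {operator} '{name}'"


def in_folder_query(folder_id: str) -> str:
    """Build a CMIS statement listing a folder's documents, sorted by name.

    Limiting the result size is ASSUMED to happen through paging
    parameters on the call, not in the statement itself.
    """
    return (
        f"SELECT * FROM cmis:document WHERE IN_FOLDER('{folder_id}') "
        "ORDER BY cmis:name"
    )


if __name__ == "__main__":
    print(f"per hour:           {docs_per_hour(LOAD_RATE_PER_SEC):,.0f}")
    print(f"per node, per sec:  {LOAD_RATE_PER_SEC / ALFRESCO_NODES:.0f}")
    print(f"days for full load: {days_to_load(TOTAL_FILES, LOAD_RATE_PER_SEC):.1f}")
    print(name_query("report.pdf"))
    print(name_query("invoice%", like=True))
    print(in_folder_query("hypothetical-folder-id"))
```

At 1200 documents per second the hourly figure comes to 4.32 million, which matches the 4.3 million on the slide. The full data set would take a little over 11 days at that constant rate.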
38. AWS
What a difference
• Traditional data center: 3-6 Months; Questionable Scale; Little Redundancy; Lots of $$$
• AWS: < 30 mins; 10x Faster; Elastic, Fault-Tolerant; Open, Cost Effective
[Figure, traditional stack: Load Balancer; ECM, Search, FS, HSM, and Hardware (three of each); DR Plan]
[Figure, AWS stack: ELB; Alfresco, Solr, and EC2 (three of each) across AZ1, AZ2, AZ3; Aurora; EBS; S3]
40. Well, what am I supposed to do with all this frickin’ hardware?!!