To view the webinar recording, use the following URL: http://wso2.com/library/webinars/2015/01/data-to-insight-introduction-to-wso2-business-activity-monitor/
WSO2 BAM 3.0 is an evolutionary upgrade to the current system. It will include a flexible data abstraction layer that supports both smaller systems and systems with high-end data requirements. This version will also introduce new distributed indexing and search features and a new high-performance analytics engine.
In this webinar, Anjana Fernando, senior technical lead at WSO2, will take a closer look at the following use cases:
Activity monitoring: tracking related events in a system
Log processing: indexing and searching a set of log entries from a system
Server statistics generation: processing data from server events and deriving service statistics
4. WSO2 BAM v3.0
● What’s new?
○ Faster analytics with Apache Spark, 10x - 100x speedups
○ Rich indexing support
○ Pluggable data stores, from light-weight embedded RDBMS to highly scalable HDFS
○ Embeddable architecture for inclusion with other Carbon servers
5. Data Agents
● Compatible with CEP/BAM
● Get data across to BAM
− Service monitoring feature – WSO2 AS, DSS, ESB, API Manager
− Mediation monitoring feature – BAM Mediator for WSO2 ESB
− Custom data-agents
● Asynchronous & non-blocking
● Multiple data transport support with Thrift, Kafka, JMS, MQTT and more.
● BAM analytics specific REST service for index operations, data retrieval and search
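The "asynchronous & non-blocking" behaviour of a data agent can be pictured with a small queue-based sketch. This is not the WSO2 agent API: the class name `AsyncPublisher`, the `send` callback and the sample events are all hypothetical, chosen only to show that the caller hands an event over and returns immediately while a background worker performs the actual send.

```python
import queue
import threading


class AsyncPublisher:
    """Hypothetical agent: publish() only enqueues; a worker thread does the sending."""

    def __init__(self, send, capacity=1000):
        self._send = send  # transport callback (assumption, stands in for Thrift etc.)
        self._queue = queue.Queue(maxsize=capacity)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def publish(self, event):
        """Return True if the event was queued, False if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            event = self._queue.get()
            if event is None:  # shutdown sentinel
                return
            self._send(event)

    def close(self):
        """Let the worker flush everything queued so far, then stop it."""
        self._queue.put(None)
        self._worker.join()


received = []  # stands in for the remote receiver
publisher = AsyncPublisher(received.append)
publisher.publish({"brand": "BrandA", "quantity": 2})
publisher.publish({"brand": "BrandB", "quantity": 1})
publisher.close()
```

Dropping events when the buffer is full is one possible policy; a real agent might block, retry or spill to disk instead.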
6. Data Receiver
● Receives data and stores it in the backend data store
○ Pluggable data store, ranging from light-weight databases to highly scalable big data stores
● Asynchronous & non-blocking
○ The combination of Cassandra, Thrift and the non-blocking design results in extremely fast writes
● Shared with WSO2 CEP for real time analysis
● Supports plugging in of different receiver types
7. Data Model
● Data is sent over using strongly typed, versioned data streams
{
  'name':'phone.retail.shop',
  'version':'1.0.0',
  'nickName': 'Phone_Retail_Shop',
  'description': 'Phone Sales',
  'metaData':[
    {'name':'clientType','type':'STRING'}
  ],
  'payloadData':[
    {'name':'brand','type':'STRING'},
    {'name':'quantity','type':'INT'},
    {'name':'total','type':'INT'},
    {'name':'user','type':'STRING'}
  ]
}
● BAM Analytics stores the data in a backend table structure, with optional indexing, for efficient data lookup, pagination and near-real-time search
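To make "strongly typed" concrete, the sketch below checks an event against the phone.retail.shop stream definition shown on this slide. The `validate` function and the mapping of `STRING`/`INT` to Python types are assumptions made for illustration; they are not part of BAM.

```python
STREAM = {
    "name": "phone.retail.shop",
    "version": "1.0.0",
    "metaData": [{"name": "clientType", "type": "STRING"}],
    "payloadData": [
        {"name": "brand", "type": "STRING"},
        {"name": "quantity", "type": "INT"},
        {"name": "total", "type": "INT"},
        {"name": "user", "type": "STRING"},
    ],
}

# Assumed mapping from stream attribute types to Python types.
PY_TYPES = {"STRING": str, "INT": int}


def validate(stream, meta, payload):
    """Return a list of problems; an empty list means the event fits the stream."""
    problems = []
    for section, values in (("metaData", meta), ("payloadData", payload)):
        declared = {attr["name"]: attr["type"] for attr in stream[section]}
        for name, type_name in declared.items():
            if name not in values:
                problems.append(f"{section}.{name}: missing")
                continue
            value = values[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, PY_TYPES[type_name]):
                problems.append(f"{section}.{name}: expected {type_name}")
        for name in sorted(values.keys() - declared.keys()):
            problems.append(f"{section}.{name}: not declared")
    return problems
```

An event whose `quantity` arrives as the string `"2"` is reported as a type problem rather than being stored silently, which is the point of typing the stream up front.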
8. Analytics Data Abstraction Layer
● A well-defined data source API that exposes storage for the analytics record store and the index store
○ Analytics Record Store
■ A schema-less table record store for storing individual records, with timestamps and pagination supported
○ Analytics File System
■ A file system structure implementation to be used by the data indexing implementation
○ Initial set of connectors to be supported
■ RDBMS - any RDBMS is supported through configuration-based query templates
■ HDFS
■ MongoDB
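One way to read this slide is that every connector implements the same small contract, so the rest of the platform never sees which backend is in use. The sketch below illustrates that idea with an in-memory backend. The names `RecordStore`, `InMemoryRecordStore`, `put` and `get` are hypothetical and do not reflect the actual BAM interfaces.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Record:
    timestamp: int                               # e.g. epoch milliseconds
    values: dict = field(default_factory=dict)   # schema-less columns


class RecordStore(ABC):
    """Hypothetical connector contract that each backend would implement."""

    @abstractmethod
    def put(self, table, records):
        """Append records to a table."""

    @abstractmethod
    def get(self, table, time_from, time_to, start, count):
        """Return one page of records in a time range."""


class InMemoryRecordStore(RecordStore):
    def __init__(self):
        self._tables = {}

    def put(self, table, records):
        self._tables.setdefault(table, []).extend(records)

    def get(self, table, time_from, time_to, start, count):
        # Keep records with time_from <= timestamp < time_to, oldest first,
        # then cut out the requested page.
        rows = sorted(
            (r for r in self._tables.get(table, []) if time_from <= r.timestamp < time_to),
            key=lambda r: r.timestamp,
        )
        return rows[start:start + count]
```

An RDBMS, HDFS or MongoDB connector would provide the same two operations over its own storage, which is what lets one set of analytics scripts run unchanged on any of them.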
9. The Analyzer Engine
● Powered by Apache Spark with querying through Spark SQL
● Parallel, distributed processing with optimized in-memory computing
● Outperforms Hadoop MapReduce in efficiency and speed, making smaller deployments feasible for typical analytics tasks
● Clustered using Spark-embedded Carbon servers
10. The Analyzer Engine
● Runs on a Spark engine embedded Carbon server cluster
○ Scalable analytics
○ Cluster can range from a couple of nodes to 1000s
● Analysis is carried out through an interactive query console and analytics scripts
● Queries are based on an easy-to-learn, SQL-like query language
INSERT INTO TABLE UserTable
  SELECT userName, COUNT(DISTINCT orderID), SUM(quantity)
  FROM PhoneSalesTable
  WHERE version = "1.0.0"
  GROUP BY userName;
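To see what the query above computes, the sketch below runs an equivalent statement against a few sample rows. It uses SQLite from the Python standard library as a stand-in for Spark SQL, so the syntax is adapted slightly (`INSERT INTO UserTable` rather than `INSERT INTO TABLE UserTable`, single-quoted string literal). The table columns, the `UserTable` column names and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PhoneSalesTable (userName TEXT, orderID TEXT, quantity INTEGER, version TEXT);
    CREATE TABLE UserTable (userName TEXT, orders INTEGER, totalQuantity INTEGER);
""")
conn.executemany(
    "INSERT INTO PhoneSalesTable VALUES (?, ?, ?, ?)",
    [
        ("alice", "o1", 2, "1.0.0"),
        ("alice", "o1", 1, "1.0.0"),  # second line item of the same order
        ("alice", "o2", 4, "1.0.0"),
        ("bob", "o3", 5, "1.0.0"),
        ("bob", "o4", 9, "0.9.0"),    # filtered out by the version predicate
    ],
)

# Per user: number of distinct orders and total quantity, for stream version 1.0.0.
conn.execute("""
    INSERT INTO UserTable
    SELECT userName, COUNT(DISTINCT orderID), SUM(quantity)
    FROM PhoneSalesTable
    WHERE version = '1.0.0'
    GROUP BY userName
""")

rows = conn.execute("SELECT * FROM UserTable ORDER BY userName").fetchall()
```

`COUNT(DISTINCT orderID)` counts an order once even when it spans several rows, while `SUM(quantity)` adds up every row that passes the version filter.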
11. The Analyzer Engine
● Interactive Query Console
○ Queries are entered in a console and executed one by one in the cluster, and the results are sent back asynchronously to the console
● Scripts can be scheduled
○ e.g. once a minute, every Wednesday at 4:15 p.m., or on the 30th of every month at midnight
● Connects to the standard analytics data store
○ Any type of backend data store is supported: data to be analyzed is read, and the resulting data is written back, through the same interface
○ Single set of analytics scripts / toolboxes for any type of backing data store
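The three example schedules on this slide can be expressed as simple predicates over a timestamp. The slides do not show the scheduling syntax BAM uses, so the sketch below is purely illustrative: `SCHEDULES` and `due` are hypothetical names, and "every 30th" is read as the 30th day of each month.

```python
from datetime import datetime

# Each schedule is a predicate over a timestamp truncated to the minute.
SCHEDULES = {
    "once a minute": lambda t: True,
    # datetime.weekday(): Monday is 0, so Wednesday is 2.
    "every Wednesday at 4:15 p.m.": lambda t: t.weekday() == 2 and (t.hour, t.minute) == (16, 15),
    "every 30th at 12 midnight": lambda t: t.day == 30 and (t.hour, t.minute) == (0, 0),
}


def due(now):
    """Names of the schedules that fire in the minute containing `now`."""
    minute = now.replace(second=0, microsecond=0)
    return [name for name, matches in SCHEDULES.items() if matches(minute)]
```

A scheduler loop would call `due` once per minute and submit the corresponding analytics scripts to the cluster.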
12. Distributed Data Indexing
● Data indexing support with full text indexing
● Data drill down support with facets
● Based on Apache Lucene, a high-performance, feature-rich indexing engine
● Distributed indexing support using sharded Lucene indexes
○ Horizontal scaling of index storage and indexing performance
○ Designed to be able to dynamically add more nodes later
● Near real-time data indexing and retrieval
○ Data is indexed immediately when it is received by the event receiver
○ Batched data indexing for higher performance
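The idea behind sharded indexing can be sketched without Lucene: each record is routed to one shard by a stable hash of its id, and a search is sent to every shard and the partial results are merged. The classes below are a toy inverted index written for illustration only; `Shard` and `ShardedIndex` are hypothetical names, and the hash-modulo routing is an assumption rather than the scheme BAM uses.

```python
import zlib
from collections import defaultdict


class Shard:
    """One inverted index: term -> set of record ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, record_id, text):
        for term in text.lower().split():
            self.postings[term].add(record_id)

    def search(self, term):
        # .get avoids creating empty entries for unknown terms.
        return self.postings.get(term.lower(), set())


class ShardedIndex:
    def __init__(self, shard_count):
        self.shards = [Shard() for _ in range(shard_count)]

    def _shard_for(self, record_id):
        # Stable hash, so the same id always routes to the same shard.
        return self.shards[zlib.crc32(record_id.encode()) % len(self.shards)]

    def index(self, record_id, text):
        self._shard_for(record_id).index(record_id, text)

    def search(self, term):
        # Scatter the query to every shard and merge the partial results.
        hits = set()
        for shard in self.shards:
            hits |= shard.search(term)
        return sorted(hits)
```

Writes touch a single shard, so indexing throughput grows with the number of shards. Plain modulo routing would force re-indexing when shards are added, so a design meant to grow dynamically would need a different placement scheme.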
13. The Presentation Layer
● Gadget based dashboards for visualization
○ Custom user dashboard and gadget creation using wizards
○ Data drill-down views
● Activity Dashboard for correlating activities
● Message Console - one-stop shop for managing the analytics data store
○ Full support for paginated data retrieval and timestamp based filtering
○ Data table creation, index association, row insertion, updates, deletions
○ Batch data upload using files
○ Rich search functionality for indexed tables
● Log analysis dashboard - out-of-the-box log analysis solution
18. BAM Toolboxes
● BAM Toolboxes are installable, hot-deployable artifacts used for deploying functionality to a BAM server
○ Stream definitions
○ Analytics scripts
○ Dashboards
● Toolboxes for monitoring and auditing most WSO2 products are available OOTB
● Toolboxes for custom scenarios can be created easily
19. H/A Distributed BAM deployment
● WSO2 BAM can be clustered and deployed in a distributed manner to enable high-availability and failover scenarios
○ Distributed deployment
■ All components of BAM are clustered (data receiver, storage, analyzer and presentation)
■ Hazelcast in-memory data grids are used for the clustering implementation
■ Optimized for a simpler deployment
20. Service Statistics Monitoring
● Improved set of analytics and visualization for web services and web applications
● Available with the upcoming WSO2 AS v6.0 release
● Built-in dashboard in WSO2 AS with a light-weight analytics module
○ Configurable to point to an external BAM server
29. Activity Monitoring
● Activity monitoring is for tracking events from multiple nodes in a flow to understand a specific activity
○ e.g.
■ A client initiates a web service request that travels through multiple ESBs and application servers and then returns. This flow is uniquely identified and visualized in BAM
○ Used for tracing messages and finding performance hotspots in the flow
○ Implemented using a correlation-ID-based mechanism and indexing
○ Upcoming: Mediator level tracing and profiling in WSO2 ESB 5.0
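The correlation-id mechanism can be sketched in a few lines: every event carries the id of the activity it belongs to, so events from different nodes can be grouped into one flow and ordered by time. The field names `activity_id`, `node` and `timestamp` and both function names are assumptions for illustration, not BAM's actual event attributes.

```python
from collections import defaultdict


def group_activities(events):
    """Group events by correlation id and order each group by timestamp."""
    activities = defaultdict(list)
    for event in events:
        activities[event["activity_id"]].append(event)
    return {
        activity_id: sorted(group, key=lambda e: e["timestamp"])
        for activity_id, group in activities.items()
    }


def slowest_hop(flow):
    """Return (from_node, to_node, elapsed) for the largest gap in a flow.

    The flow must contain at least two events.
    """
    hops = [
        (a["node"], b["node"], b["timestamp"] - a["timestamp"])
        for a, b in zip(flow, flow[1:])
    ]
    return max(hops, key=lambda hop: hop[2])
```

Grouping reconstructs the path a message took, and the largest gap between consecutive events points at the hop worth investigating as a performance hotspot.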
32. Log Analysis
● OOTB log analysis solution in BAM v3.0
● Log event indexing
○ Uses the new BAM v3.0 indexing support
○ Event attributes can be indexed so that events are searchable by server, cluster and log type, and the log messages themselves can be indexed for full-text search
● Custom search queries using Lucene queries and regular expressions
● Log agent client for reading and publishing log events from log files
○ No need to install a special agent inside the target servers themselves
○ Extensible mechanism for specifying log formats for parsing
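An extensible log-format mechanism could look like a registry of named-group regular expressions, where supporting a new format means adding one entry. The sketch below is a guess at the shape of such a mechanism, not the BAM log agent: the format names, the patterns and `parse_line` are hypothetical.

```python
import re

# Hypothetical format registry: add a named-group regex to support a new log format.
LOG_FORMATS = {
    "bracketed": re.compile(
        r"\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+"
        r"\{(?P<logger>[^}]+)\}\s+-\s+(?P<message>.*)"
    ),
    "simple": re.compile(r"(?P<level>[A-Z]+):\s+(?P<message>.*)"),
}


def parse_line(line, log_type, server):
    """Turn one log line into an event dict, or None if the line does not match."""
    match = LOG_FORMATS[log_type].match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Attach the attributes the events would be indexed and searched by.
    event.update(server=server, log_type=log_type)
    return event
```

Each parsed event carries the server and log type alongside the message, which corresponds to the attributes listed above as searchable.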