WSO2 Product Release Webinar: WSO2 Data Analytics Server 3.0


To view the recording of this webinar, please use the URL below:

WSO2 Data Analytics Server (WSO2 DAS) version 3.0 is the successor to WSO2 Business Activity Monitor 2.5. It is based on the latest technologies and is an evolutionary upgrade to the current system. WSO2 DAS comes with a comprehensive set of new features, including support for pluggable data sources, batch processing with Apache Spark, distributed data indexing, a new dashboard, and unified data querying with analytics REST APIs.

WSO2 DAS combines real-time, batch, interactive, and predictive (via machine learning) analysis of data into a single integrated platform. This webinar presents and demonstrates the following key features and capabilities in detail:

Pluggable data sources support with its new data abstraction layer
Batch analytics using the Apache Spark analytics engine
Interactive analysis powered by Apache Lucene
An analytics dashboard to visualize results
Activity monitoring capabilities for tracking related events in a system

Published in: Technology

  1. 1. WSO2 Data Analytics Server 3.0.0 Product Release Webinar Inosh Goonewardena Associate Technical Lead
  2. 2. WSO2 Analytics Platform
  3. 3. WSO2 Analytics Platform
  4. 4. Data Processing Pipeline
  5. 5. Introducing WSO2 Data Analytics Server ● Fully open source solution for building systems and applications that collect and analyze data and communicate the results ● Embodies the WSO2 Analytics Platform by combining batch, real-time, interactive and predictive analytics capabilities ● High-performance data capture framework ● Highly available and scalable by design
  6. 6. Advantages of DAS 3.0 over WSO2 BAM 2.5.0 ● Complete rewrite from the ground up, with performance and extensibility as core values ● Faster analytics powered by Apache Spark, 10x - 100x speedup ● Rich indexing support, with near real-time text search ● Pluggable data store support, from lightweight embedded RDBMS to highly scalable HBase/HDFS ● Revamped Analytics Dashboard with wizard-based gadget generation
  7. 7. WSO2 DAS Architecture
  8. 8. Collecting Data
  9. 9. Data Model ● Published data conforms to a strongly typed data stream { 'name': '', 'version': '1.0.0', 'nickName': 'stream nickname', 'description': 'description of the stream', 'metaData':[ {'name':'meta_data_1','type':'STRING'} ], 'correlationData':[ {'name':'correlation_data_1','type':'STRING'} ], 'payloadData':[ {'name':'payload_data_1','type':'BOOL'}, {'name':'payload_data_2','type':'LONG'} ] }
  10. 10. Data Receiver ● One API for Batch and Real-time Analytics. ● Asynchronous and non-blocking nature enables extremely fast writes. ● Supports multiple transport adapters for data collection
  11. 11. Highly Pluggable Event Receiver Architecture
  12. 12. Data Persistence ● Data Abstraction Layer to enable pluggable data connectors ○ RDBMS, Cassandra and HBase/HDFS connectors are offered. Custom connectors can be written easily ● Analytics Table ○ The data persistence entity in WSO2 Data Analytics Server ○ Provides a way of storing and retrieving data that is agnostic of the backend data source ○ Allows applications to be written so that they do not depend on a specific data source API, e.g. JDBC (RDBMS), Cassandra APIs etc. ○ WSO2 DAS provides a standard REST API for accessing the Analytics Tables
  13. 13. Data Persistence ● Analytics Record Stores ○ An Analytics Record Store houses a specific set of Analytics Tables ○ The Analytics Record Stores used for storing incoming events and for storing query processing output are configurable ○ Single Analytics Table namespace; the target record store is given only at the time of table creation ○ Useful for creating Analytics Tables whose data will be stored in multiple target databases ● Analytics File System ○ The location where the indexing data is stored ○ Multiple implementations are provided out of the box, or custom implementations can be written
  14. 14. Analyzing Data
  15. 15. Batch Analytics
  16. 16. Batch Analytics - Overview ● Powered by Apache Spark for 10x-100x higher performance than Hadoop ● Parallel and distributed, with optimized in-memory processing ● Scalable script-based analytics written in an easy-to-learn, SQL-like query language powered by Spark SQL ● Interactive built-in web interface for ad-hoc query execution ● Scheduled query script execution support with high availability and failover ● Run Spark on a single node, on a Spark-embedded Carbon server cluster, or connect to an external Spark cluster
  17. 17. Batch Analytics - Spark SQL create temporary table product_data using CarbonAnalytics options (schema …) create temporary table products using CarbonAnalytics options (schema …) insert into products select product_name from product_data group by …
  18. 18. Batch Analytics - Interactive Console
  19. 19. Batch Analytics - Spark Scripts
  20. 20. Interactive Analytics
  21. 21. Interactive Analytics ● Full-text data indexing support powered by Apache Lucene ● Drill-down search support ● Distributed data indexing ○ Designed to support scalability ● Near real-time data indexing and retrieval ○ Data indexed immediately as received
  22. 22. Interactive Analytics
  23. 23. Real-time Analytics
  24. 24. What is Real-time Analytics? Real-time Analytics in →
  25. 25. What is Real-time Analytics? Real-time Analytics in → ● Gather data from multiple sources ● Correlate data streams over time ● Find interesting occurrences ● And notify ● All in real-time
  26. 26. Predictive Analytics (upcoming)
  27. 27. What is Predictive Analytics? Predictive Analytics in →
  28. 28. What is Predictive Analytics? Predictive Analytics in → ● Extract, pre-process, and explore data ● Create models, tune algorithms and make predictions ● Integrate for better intelligence
  29. 29. Communicating Results
  30. 30. Dashboards ● “Overall idea” at a glance (e.g. car dashboard) ● Support for personalization: you can build your own dashboard ● The entry point for drill-down ● Building a custom dashboard ○ Dashboard via Google Gadgets and content via HTML5 + JavaScript ○ Leverages WSO2 User Engagement Server to build a dashboard ○ Uses charting libraries like Vega and D3.js
  31. 31. Dashboards: Gadget Generation Wizard ● Start with data in tabular format ● Map each column to a dimension in your plot, such as X, Y, color, point size, etc. ● Also supports drill-downs ● Create a chart with a few clicks
  32. 32. Alerts ● Detecting conditions can be done via CEP Queries ● “Last Mile” is key ○ Email ○ SMS ○ Push notifications to a UI ○ Pager ○ Trigger physical Alarm
  33. 33. APIs ● With mobile apps, most data is exposed and shared with end users as APIs (REST/JSON) ● Analytics results can be exposed through APIs ○ REST API ○ JavaScript API
  34. 34. What can WSO2 DAS do for you?
  35. 35. Common Use Cases of WSO2 DAS ● KPI Statistics ○ Application Statistics Monitoring ○ Network / Service Statistics ○ Sensor Data Aggregation ● Solving Optimization Problems ○ Urban Planning ○ Revenue Distribution Analysis ● Activity Monitoring ○ Tracking Message Flows ● HL7 Data Exploration ○ ESB HL7 Transport Interfaced with DAS ● Log Analysis ○ Application / System Logs ● Sports ○ Real-time Analysis of Player Performance ○ Real-time Match Analysis ● Geo-Spatial ○ Traffic Monitoring and Alerting ○ Geo-fencing ● Anomaly Detection ○ Fraud Detection ○ Network Intrusion Detection ○ Server Health Monitoring
  36. 36. API Statistics
  37. 37. API Statistics
  38. 38. HTTP Monitoring
  39. 39. Activity Monitoring Activity monitoring tracks events from multiple nodes in a flow to understand a specific activity ● Example: ○ A client initiates a web service request, which travels through multiple ESBs and application servers and then returns. This flow is uniquely identified and visualized in DAS ● Used for tracing messages and finding performance hotspots in the flow ● Implemented with a correlation-ID-based mechanism using Interactive Analytics
  40. 40. Activity Monitoring
  41. 41. Activity Monitoring
  42. 42. Activity Monitoring
  43. 43. Activity Monitoring
  44. 44. Activity Monitoring
  45. 45. Fraud Detection ● Built for detecting credit card fraud ● The rules are extensible with customized Siddhi execution plans for any type of fraud detection ● Currently leverages Real-time and Interactive Analytics features Source:
  46. 46. Log Analysis ● Distributed indexing and searching of any type of logs stored in the system ● Notifications support with Real-time event processing features ● Application / Server health prediction with Machine Learning ● Utilizes Interactive + Real-time Analytics + Machine Learning features Source:
  47. 47. Urban Route Planning
  48. 48. Urban Route Planning
  49. 49. Product Demonstration
  50. 50. Questions?
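
The Data Model slide states that published data conforms to a strongly typed data stream. As a rough illustration of what that implies, the Python sketch below checks an event against a stream definition shaped like the one on that slide. It is not WSO2 code: `validate_event`, `TYPE_MAP`, the mapping of `STRING`/`BOOL`/`LONG` to Python types, the layout of the event dictionary, and the stream name `example.stream` (the slide leaves the name empty) are all assumptions made for this example.

```python
# Hypothetical sketch, not part of any WSO2 API: check an event against a
# stream definition shaped like the one on the Data Model slide.

# Assumed mapping from the attribute types on the slide to Python types.
TYPE_MAP = {"STRING": str, "BOOL": bool, "LONG": int}

STREAM_DEFINITION = {
    "name": "example.stream",  # placeholder; the slide leaves the name empty
    "version": "1.0.0",
    "metaData": [{"name": "meta_data_1", "type": "STRING"}],
    "correlationData": [{"name": "correlation_data_1", "type": "STRING"}],
    "payloadData": [
        {"name": "payload_data_1", "type": "BOOL"},
        {"name": "payload_data_2", "type": "LONG"},
    ],
}


def validate_event(definition, event):
    """Return a list of error strings; an empty list means the event conforms."""
    errors = []
    for section in ("metaData", "correlationData", "payloadData"):
        values = event.get(section, {})
        for attr in definition.get(section, []):
            name = attr["name"]
            expected = TYPE_MAP[attr["type"]]
            if name not in values:
                errors.append(f"{section}.{name}: missing")
                continue
            value = values[name]
            # bool is a subclass of int in Python, so reject it for LONG.
            if expected is int and isinstance(value, bool):
                errors.append(f"{section}.{name}: expected LONG, got bool")
            elif not isinstance(value, expected):
                errors.append(
                    f"{section}.{name}: expected {attr['type']}, "
                    f"got {type(value).__name__}"
                )
    return errors


if __name__ == "__main__":
    event = {
        "metaData": {"meta_data_1": "node-1"},
        "correlationData": {"correlation_data_1": "abc-123"},
        "payloadData": {"payload_data_1": True, "payload_data_2": "12"},
    }
    # payload_data_2 is a string here, so one type error is reported.
    print(validate_event(STREAM_DEFINITION, event))
```

Rejecting malformed events before they are stored means that later batch and interactive queries can rely on each attribute having the declared type.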