Powering Realtime Decision Engines in Finance and Healthcare using Open Source Software
Shared by Greg Makowski
1. © 2015 ligaDATA, Inc. All Rights Reserved.
Powering Real-time Decisioning for Financial Services & Healthcare using Open Source
August 2015
Community @ http://Kamanja.org
2.
In 2014, the bank embarked on transforming how it leverages its data using Open Source & Big Data technologies.
3.
To achieve this goal with the bank, we needed to:
1. Create a framework to adopt Open Source Software
2. Find a catalyst to attract and retain talent
4.
Marissa Mayer of Yahoo won’t have to go in front of the Senate to explain why 100,000 records were lost. Barbara Desoer of Citibank would.
What is different about Financial Services?
✓ Regulatory requirements demand 100% data protection
✓ Security & Data governance
✓ Auditability
✓ Lineage
✓ ZERO data loss
✓ Integration with legacy ecosystem
✓ Skillset
Open Source in Financial Services
Good enough for Internet companies isn't good enough!
5.
A modified “Crossing the Chasm” view for OSS
OSS – Adoption Chasm
Why have Financial Services not adopted OSS more aggressively?
• Creators: Technology organizations, rich resources, solving a problem, creating a competitive advantage
• Contributors: Technology organizations, taking a risk while solving a problem
• Users: Lower technology skillset, low risk tolerance, solving a problem
6.
Establish the BOSS framework for the consumption of, and contribution to, open source software (OSS) at scale in the Bank.
Bank Open Source Software (BOSS)
Maturing Capability:
• Consumption (Bank Current Focus): Evaluation & consumption of OSS
• Contribution (Step Change): Contribution to OSS by enhancing existing open source projects: documentation, fixes, enhancements
• Creation (Pioneering Target): Initiation of a new OSS project, championing and facilitating OSS community development and consumption
BOSS optimises Consumption, enables Contribution and Creation
• Input from internal and external stakeholders influenced the BOSS framework definition
• OSS advisory board to steer and drive
• Pre-approved license types per use case (consumption and contribution)
• Invest in enabling technology: GitHub, Black Duck, Sonatype
• No new governance steps: leverage and streamline existing controls instead of creating new ones
7.
The BOSS framework is designed based on guidance and feedback received from key representatives within the Bank and from leading open source contributors and fellow banks.
BOSS – Collective Thought Process
Input sources: Internal, External
• Business Units: Retail, Investment, Cards
• Control Functions: Legal, Risk, Security, Sourcing
• Technology: Data, Design, Infra
8.
Millennial developers …
• Grew up using OSS
• Unaware of Closed Source software
• Want to engage, share and contribute
Real-time decisioning using Kamanja was selected as a capability big enough and important enough to build a Center of Excellence around.
Attracting and Retaining talent
9.
Deriving Decisions from Big Data
• REAL-TIME: Individual events; decisioning, detection; in-context and online
• BATCH: Cross section of events; analytics, MI; offline, longer cycle
10.
Customer-centric product design requires real-time decisions
END-TO-END CAPABILITY
• Tracking & Analyzing (processing) Streams of Information (real-time) About Things That Happen (events)
• Real-time Decisions: Deriving an Opportunity or Threat
• Actions: Triggers, Scoring, Notifications, Alerts, Transactional Updates
11.
LigaDATA introduced Kamanja, an open source real-time decisioning project hardened for Financial Services & Healthcare requirements and scalable to IoT-level data volumes, enabling low-latency use cases.
USE CASES
• Customer churn/retention
• Risk Analysis
• Customer Contact
• Cyber Crime
• Fraud
• Security & Compliance
• Audit & Governance
• Marketing
• Telephony Interception
• Real-Time Offer
12.
Uses of Real-Time Decisioning
Complex Event Processing (CEP)
• A few to possibly hundreds of concurrent data streams
• Apply rule logic, select, aggregate
• Decide action on elements in stream
Enterprise Applications, During …
• customer call or chat: recommendations to improve service
• card transaction: offer credit increase
• web application: pre-approval
• web transaction: recommend other product(s)
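The CEP steps listed above (select, aggregate, decide an action) can be sketched in a few lines of Python. This is an illustration only, not Kamanja code: the event fields (`type`, `card`, `ts`, `amount`), the 60-second window, and the spending threshold are all hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60   # hypothetical sliding-window length
MAX_TOTAL = 1000.0    # hypothetical per-card spending threshold


def decide(events):
    """Return (card, ts, action) tuples for a time-ordered event stream."""
    recent = defaultdict(deque)  # card -> deque of (ts, amount) inside the window
    actions = []
    for ev in events:
        # Select: only card transactions are relevant to this rule.
        if ev["type"] != "card_txn":
            continue
        window = recent[ev["card"]]
        window.append((ev["ts"], ev["amount"]))
        # Drop transactions that have fallen out of the sliding window.
        while window and window[0][0] <= ev["ts"] - WINDOW_SECONDS:
            window.popleft()
        # Aggregate: total spend for this card within the window.
        total = sum(amount for _, amount in window)
        # Decide: emit an action on the element that crossed the threshold.
        if total > MAX_TOTAL:
            actions.append((ev["card"], ev["ts"], "alert"))
    return actions
```

Each event is handled once as it arrives, and only a small per-key window is kept in memory, which is what lets this style of rule run against many concurrent streams.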
13.
Case Study of a Modeling Department
Monitor $80B of consumer bank transactions per year to detect fraud (between 1,400 banks)
PAIN POINT:
~2 months to deploy
(model group was different from deployment group)
INDUSTRY REVIEW to answer:
• How common is it to use many algorithms or tools in a project?
• What is an easier way to deploy models?
14.
http://www.kdnuggets.com/2015/06/data-mining-data-science-tools-associations.html
Independent use of tools
15.
http://www.kdnuggets.com/2015/06/data-mining-data-science-tools-associations.html
Tools used in combination
16.
PMML Diagram
Predictive Model Markup Language
• Training Project Phase: Training & test data (batch) → Data Mining Tool (PMML Producer) → File, Save As PMML → PMML File (full model specification)
• Production Scoring Project Phase: PMML File + Scoring data (real-time streaming) → Scoring Engine (Kamanja, PMML Consumer) → Output data has new score field
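To make the producer/consumer hand-off concrete, here is a toy PMML consumer in Python. It is a minimal sketch and not how Kamanja is implemented: it assumes a PMML 4.2 document containing a single linear RegressionModel, and the field names (`amount`, `age`, `score`) and coefficients are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A hand-written stand-in for the file a data mining tool would "Save As PMML".
PMML_DOC = """
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <Header description="toy score (hypothetical)"/>
  <DataDictionary numberOfFields="3">
    <DataField name="amount" optype="continuous" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="amount"/>
      <MiningField name="age"/>
      <MiningField name="score" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="amount" coefficient="0.5"/>
      <NumericPredictor name="age" coefficient="-0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"p": "http://www.dmg.org/PMML-4_2"}


def load_model(pmml_text):
    """Read the intercept and coefficients from the first RegressionTable."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find(".//p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        pred.get("name"): float(pred.get("coefficient"))
        for pred in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Return a copy of the record with a new 'score' field added."""
    intercept, coefficients = model
    scored = dict(record)
    scored["score"] = intercept + sum(
        coef * record[name] for name, coef in coefficients.items()
    )
    return scored
```

The scoring side needs only the PMML file, not the tool that trained the model, which is the decoupling the diagram describes.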
17.
Given industry fragmentation, PMML is a solution
PMML Producers (18 companies)
• R (Rattle, PMML)
• RapidMiner
• KNIME
PMML Consumers (12 companies)
• Zementis
• SAS
• IBM SPSS
• KNIME
• MicroStrategy
• Kamanja
• JPMML
• Spark (MLlib) (Open Source)
• Weka
• SAS Enterprise Miner
PREDICTIVE
Naïve Bayes
Neural Net
Regression
Rules
Scorecard
Sequence
SVM
Time Series
Trees
DESCRIPTIVE / OTHER
Association Rules
Cluster, K-Nearest Neighbors
Text Models
Model ensembles & composition (e.g. Gradient Boosting)
18.
Real Time Computing OSS Technology Stack
Integration with Kamanja
• High Level Languages / Abstractions: Kamanja (PMML/Java/Scala Consumer), MLlib* (PMML Producers)
• Compute Fabric: Cloud, EC2, Internal Cloud
• Security: Kerberos
• Real Time Streaming: Kafka, MQ, Spark*
• Data Store: HBase, Cassandra, InfluxDB, HDFS (create adaptors to integrate others)
• Resource Management: ZooKeeper, YARN*, Mesos*
19.
Performance Characteristics
Performance
• Throughput of a million messages/second
• Uses commodity hardware
Scalability
• Linear horizontal scalability
• Data partitioning support
• Runtime multi-model optimizations to support thousands of models
• Consistent performance on hundreds of models and thousands of rules
Built for IoT data volumes
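Data partitioning is what makes horizontal scaling possible: if every message for a given key lands on the same node, nodes can work independently. The sketch below assumes simple hash partitioning on a message key. The slides do not say which scheme Kamanja uses, and the `key` field and function names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index with a stable hash (CRC32)."""
    # zlib.crc32 is used instead of hash() because Python randomizes
    # string hashes per process, which would break routing across nodes.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Group messages into per-partition buckets by their 'key' field."""
    buckets = [[] for _ in range(num_partitions)]
    for msg in messages:
        buckets[partition_for(msg["key"], num_partitions)].append(msg)
    return buckets
```

With plain modulo hashing, changing the number of partitions remaps most keys; schemes such as consistent hashing reduce that movement at the cost of extra bookkeeping.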
20.
Medical Company use of Kamanja
• Clinicians (knowledge experts) develop heuristic-based rule set models
• The initial model was a COPD (Chronic Obstructive Pulmonary Disease) risk assessment
• Support for referenced Beneficiary, HL7, Inpatient Claim, and Outpatient Claim
• Models are expressed in a domain-specific language (DSL) they developed
• DSL models are transformed to PMML for Kamanja
• Models consume current + prior related messages over a “look back period”
   Save the “assertions” of a patient in the database (beyond standard PMML)
   “State” can evolve over time
• The “Medical Company” plans to integrate the DSL with their ontology data modeling effort
• Goal is to generate new models as their “medical world” ontology evolves
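The look-back idea (score the current message together with prior related messages, and persist per-patient assertions) can be sketched as follows. This is an assumption-laden illustration, not the Medical Company's DSL or a clinical rule: the 365-day window, the `copd_related` flag, the two-message threshold, and the assertion name are all hypothetical, and an in-memory dict stands in for the database.

```python
from datetime import date, timedelta

LOOK_BACK = timedelta(days=365)  # hypothetical look-back period


class PatientStore:
    """In-memory stand-in for the database that holds per-patient state."""

    def __init__(self):
        self.history = {}     # patient id -> prior messages inside the window
        self.assertions = {}  # patient id -> set of saved assertion names

    def process(self, msg):
        """Evaluate one message against the patient's look-back window."""
        pid = msg["patient"]
        cutoff = msg["date"] - LOOK_BACK
        # Keep only prior messages that still fall inside the look-back period.
        kept = [m for m in self.history.get(pid, []) if m["date"] >= cutoff]
        kept.append(msg)
        self.history[pid] = kept
        # Toy rule: two or more flagged messages in the window raise an assertion.
        flagged = sum(1 for m in kept if m["copd_related"])
        facts = self.assertions.setdefault(pid, set())
        if flagged >= 2:
            facts.add("elevated_copd_risk")
        # Assertions persist, so state can evolve as later messages arrive.
        return set(facts)
```

Because assertions are stored separately from the message window, a fact asserted earlier remains available after the messages that produced it have aged out.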
21.
Try out
Community @ http://Kamanja.org