Bigdata based Fraud Detection
김민경
daengky@naver.com
2015.04.09
2
•Outline
• Introduction
• Bigdata
• Machine Learning
• Fraud Detection
• Solutions
4
• O2O Platform
Introduction
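O2O was originally devised so that online companies could take over
the offline market, but it has since evolved into a platform where
offline and online interact with each other.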
5
• FinTech
Introduction
It started from the convenience of payment...
6
•Introduction
Market Share
FinTech and O2O platforms are fighting a war to win customers
through convenience and to capture the market first.
7
•Introduction
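Authentication
Pay easily with a single ID and password
Is it realistic to have a public key certificate issued just to shop?
Is it realistic to have an i-PIN issued just to shop?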
8
•Introduction
Trade-Off
Security
Convenience
Security and convenience are in a trade-off relationship
9
•Introduction
Success
Security
Convenience
To succeed, you have to catch both rabbits: security and convenience.
10
•Introduction
Motive
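Internet-only bank?
• A bank that provides financial services such as deposits, transfers,
loans, and fund investment over the internet and mobile
• Characteristics: runs on a low-cost structure without branches and offers
lower fees and lower loan interest rates than commercial banks.
• Industrial capital is allowed to hold an equity stake of 30% or more
• Restricted for large business groups (61): those with assets of 5 trillion won
or more that are subject to cross-shareholding limits set by the Fair Trade
Commission, such as Samsung and Hyundai Motor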
11
•Introduction
Motive
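• Inconvenient financial security devices and processes
• Security card, OTP
• Who is responsible?
• The trend is for financial security to be handled autonomously
• Broader scope of liability for financial companies
• Re-outsourcing of financial protection work is prohibited, except when permitted by the Financial Services Commission
• Punitive surcharge: up to 5 billion won
• Stronger penalties: up to 10 years in prison, fine of up to 100 million won
• Administrative fine (newly introduced): up to 50 million won for failing to meet the obligation to ensure stability
• Mandatory reporting: the CISO reports the results of monthly information security checks.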
12
•Introduction
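Financial Services Commission Chairman Shin Je-yoon urged all financial
institutions to complete the build-out of a fraud detection system (FDS)
for financial security.
"The precondition for pursuing the FinTech promotion plan is the
importance of security," he said, warning that "a service without secured
information security will end up as a castle built on sand."
On the FinTech (Fintech) promotion plan, he said, "We will reform the
offline-centered financial system so that FinTech technology can be
naturally integrated into finance," and "we will redesign the regulation
of the electronic financial business."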
Motive
14
•
Bigdata Ecosystem
Bigdata
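• What big data means
Large-scale parallel computing technology, devised because data has become not
only massive but also too complex for traditional data processing to handle
• Big data processing technology
Technology that processes such complex, massive data efficiently through
parallel processing
• Big data processing flow
Collect - Store - Process - Analyze - Present
Collect - Process - Analyze - Present - Store
• What big data analytics means
Analysis techniques that apply machine learning or probabilistic and statistical
methods to complex, massive data on top of large-scale parallel computing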
15
•
Bigdata Ecosystem
Open Source Bigdata Ecosystem
• Query (NoSQL) : Cassandra, HBase, MongoDB and more
• Query (SQL) : Hive, Stinger, Impala, Presto, Shark
• Advanced Analytic : Hadoop, Spark, H2O
• Real time : Storm, Samza, S4, Spark Streaming
Bigdata
17
•
Veracity
Bigdata Problems
Bigdata
Value Meaning
18
•
Bigdata Streaming
Bigdata
19
•
Analytics Problems
Bigdata
Stream pipeline: Data acquisition → Data analysis → Data storage → Result
20
•
Stream Problems
Bigdata
Big (Volume)
Complex (Variety)
Speed (Velocity)
21
•
Lambda Architecture
Bigdata
28
•
Lambda Architecture
Bigdata
•All data entering the system is dispatched to both the batch layer and the speed layer for processing.
•The batch layer has two functions: (i) managing the master dataset
(an immutable, append-only set of raw data), and (ii) pre-computing the batch views.
•The serving layer indexes the batch views so that they can be queried in a low-latency, ad-hoc way.
•The speed layer compensates for the high latency of updates to the serving layer
and deals with recent data only.
•Any incoming query can be answered by merging results from batch views and real-time views.
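The three layers described above can be sketched as a toy in-memory counter. This is only an illustration under stated assumptions: the class name LambdaCounter, its methods, and the choice of "transactions per card" as the view are hypothetical and are not part of the slides or of any real Lambda framework.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda architecture that counts transactions per card (hypothetical example)."""

    def __init__(self):
        self.master = []             # master dataset: immutable, append-only raw data
        self.batch_view = Counter()  # pre-computed by the batch layer, served for queries
        self.speed_view = Counter()  # real-time view over data not yet in the batch view

    def ingest(self, card_id):
        # All incoming data is dispatched to both the batch layer and the speed layer.
        self.master.append(card_id)
        self.speed_view[card_id] += 1

    def run_batch(self):
        # The batch layer recomputes its view from the whole master dataset.
        # The real-time view is then discarded because the batch view covers it.
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, card_id):
        # A query merges the batch view with the real-time view.
        return self.batch_view[card_id] + self.speed_view[card_id]
```

In this sketch a query stays correct between batch runs because the speed view holds exactly the records that arrived after the last batch recomputation.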
30
•
In-Stream Processing
Bigdata
31
•
Seldon Infrastructure
Bigdata
•Real-Time Layer : responsible for handling the live predictive API requests.
•Storage Layer : various types of storage used by other components.
•Near-Time/Offline Layer : components that run compute-intensive or non-real-time jobs.
•Stats Layer : components to monitor and analyze the running system.
32
•
Pulsar Architecture
Bigdata
Pulsar Deployment Architecture
34
•
Pulsar Architecture
Bigdata
The Pulsar pipeline includes the following components:
• Collector: Ingests events through a REST endpoint
• Sessionizer: Sessionizes the events, maintaining the session state and
generating marker events
• Distributor: Filters and mutates events to different consumers;
acts as an event router
• Metrics calculator: Calculates metrics by various dimensions and
persists them in the metrics store
• Replay: Replays the failed events on other stages
• ConfigApp: Configures dynamic provisioning for the whole pipeline
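The Sessionizer stage listed above can be illustrated with a small sketch. This is not Pulsar's API: the function name sessionize, the (user_id, timestamp) event shape, and the 1800-second inactivity gap are all assumptions chosen for the example.

```python
def sessionize(events, gap=1800):
    """Assign a per-user session number to each event.

    events: iterable of (user_id, timestamp_in_seconds), sorted by timestamp.
    gap:    inactivity (seconds) after which a new session starts (assumed value).
    Returns a list of (user_id, timestamp, session_number).
    """
    last_seen = {}    # session state: last event time per user
    session_no = {}   # session state: current session number per user
    out = []
    for user, ts in events:
        # Start a new session for a new user or after a long silence.
        if user not in last_seen or ts - last_seen[user] > gap:
            session_no[user] = session_no.get(user, 0) + 1
        last_seen[user] = ts
        out.append((user, ts, session_no[user]))
    return out
```

A downstream stage, such as the metrics calculator, could then aggregate by (user_id, session_number) instead of by raw event.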
35
•
Pulsar Architecture
Bigdata
• Complex Event Processing: SQL on stream data
• Custom sub-stream creation: Filtering and Mutation
• In Memory Aggregation: Multi Dimensional counting
37
•
Realtime Ecosystem
Bigdata
40
•
What is ?
Machine Learning
It starts from data...
• Machine + Learning
• Teach a computer how to learn from data.
Data is therefore the textbook.
41
•
ML Types
Machine Learning
• Supervised learning
• When the classes of the data are known (category, labeled)
• e.g. spam mail
• Unsupervised learning
• When the classes are unknown but you want to find patterns
• e.g. SNS, Twitter
• Semi-supervised learning : supervised + unsupervised learning
• Reinforcement learning
• Mistakes are fed back into learning
• Evolutionary learning (GA, AIS)
• Meta learning : landmarks of data for classifiers
42
•
Lifecycle on Realtime
Machine Learning
ML Modeling
ML Deploy
ML Optimizer
New Data
Decision Making
Alert
Anomaly Store
Hadoop DFS/NoSQL/Hive
43
•
Genetic algorithm
Abnormal Behavior
Machine Learning
44
• Association Rule
Mining
Machine Learning
45
• Finite State Automata
(FSA)
Since the tests can be grouped, a state can represent several tests
being performed at the same time. For example, T34 means that
T3 and T4 can be done simultaneously.
Machine Learning
46
• Clustering
Machine Learning
47
• Hidden Markov
Sequence Based Algorithm
•Certain fraudulent activities may not be detectable with instance-based
algorithms
•For example, when each transaction involves only a small amount of money,
instance-based algorithms will fail to detect the fraud
Machine Learning
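The idea of scoring a sequence rather than a single transaction can be sketched as follows. Note that this example swaps the hidden Markov model for a simpler first-order Markov chain over observed states; the state labels ("S" small, "N" normal, "L" large), the function names, and the add-one smoothing are assumptions made for illustration.

```python
import math


def fit_transitions(sequences, states, smoothing=1.0):
    """Estimate transition probabilities P(next | current) from normal sequences."""
    counts = {s: {t: smoothing for t in states} for s in states}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: counts[s][t] / total for t in states}
    return probs


def avg_log_likelihood(seq, probs):
    """Average log-probability per transition; lower means more unusual.

    The sequence must contain at least two states.
    """
    steps = list(zip(seq, seq[1:]))
    return sum(math.log(probs[a][b]) for a, b in steps) / len(steps)
```

A run of repeated small withdrawals, each unremarkable on its own, scores lower than a typical history because the transition "small followed by small" is rare in the training sequences.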
48
•
Decision Tree
Profiling?
Machine Learning
49
• Support Vector
Machine
Machine Learning
50
• Neural Network
Single Layer Feed Forward Model
Machine Learning
51
• anti-k nearest
neighbor
Outlier Detection
Machine Learning
52
• Comparison
of Three Algorithms
Machine Learning
54
• Banking
• Traffic
Fraud Detection
Transactions per day: 20,000,000
Log records per day: 200,000,000
Period      Total log records    Transactions
7 days      1,400,000,000        140,000,000
21 days     4,200,000,000        420,000,000
30 days     6,000,000,000        600,000,000
60 days     12,000,000,000       1,200,000,000
90 days     18,000,000,000       1,800,000,000
120 days    24,000,000,000       2,400,000,000
150 days    30,000,000,000       3,000,000,000
180 days    36,000,000,000       3,600,000,000
360 days    72,000,000,000       7,200,000,000
55
•Fraud Detection
Credit card data (70-80 variables per transaction):
• Transaction ID
• Transaction type
• Date and time of transaction
(to nearest second)
• Amount
• Currency
• Local currency amount
• Merchant category
• Card issuer ID
• ATM ID
• POS type
• Cheque account prefix
• Savings account prefix
• Acquiring institution ID
• Transaction authorisation code
• Online authorisation performed
• New card
• Transaction exceeds floor limit
• Number of times chip has been accessed
• Merchant city name
• Chip terminal capability
• Chip card verification result
Card
56
• Fraud Detection
Basics
Fraud Detection
Speed is the key!
- many transactions - billions - algorithms must be efficient
- mixed variable types (generally not text, image)
- large number of variables
- incomprehensible variables, irrelevant variables
- different misclassification costs
- many ways of committing fraud
- unbalanced class sizes (c. 0.1% transactions fraudulent)
- delay in labelling
- mislabelled classes
- random transaction arrival times
- (reactive) population drift
- Maintain a sliding buffer of the last billion transactions in RAM
(fast memory)
- Organize the transactions in such a way that some queries
could be executed very fast
- Develop some clever algorithms that operate on this data
structure
- Will it work??? Yes, it will !!! Yes, it does …
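The sliding-buffer idea above can be sketched with a bounded queue plus a per-card index, so that "recent transactions for this card" is answered without scanning the whole buffer. The class name SlidingBuffer, its methods, and the (card_id, amount) record shape are hypothetical; a production system holding a billion transactions would need a far more compact layout than Python objects.

```python
from collections import deque, defaultdict


class SlidingBuffer:
    """Keep the last `capacity` transactions in memory, indexed by card (illustrative)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.buffer = deque()               # (card_id, amount), oldest first
        self.by_card = defaultdict(deque)   # card_id -> amounts, oldest first

    def add(self, card_id, amount):
        # Evict the oldest transaction once the buffer is full. It is also the
        # oldest entry in its card's index, because both use arrival order.
        if len(self.buffer) == self.capacity:
            old_card, _ = self.buffer.popleft()
            self.by_card[old_card].popleft()
            if not self.by_card[old_card]:
                del self.by_card[old_card]
        self.buffer.append((card_id, amount))
        self.by_card[card_id].append(amount)

    def recent_amounts(self, card_id):
        # Fast query: only this card's entries are touched.
        return list(self.by_card.get(card_id, ()))
```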
57
• Fraud Detection
Basics
Fraud Detection
Challenge: real-time detection!
• Monitor in real time all POS/ATM transactions
• Detect unusual patterns and block compromised cards as quickly as
possible
• Ideally: block compromised cards before fraud is discovered!
• A big question: can we do it ???
• Some numbers:
• 3,000,000,000 transactions per year
• up to 15,000,000 transactions per day
• up to 400 transactions per second (peak hours)
• 100,000,000 cards
58
• Fraud Detection
Basics
Fraud Detection
•Self Healing
•Multi datacenter failovers
•State management
•Shutdown Orchestration
•Dynamic Partitioning
•Elastic Clusters
•Dynamic Flow Routing
•Dynamic Topology Changes
59
• Fraud Detection
Basics
Fraud Detection
–SQL like language for specifying processing rules
–Analysis over rolling and tumbling windows of time
–Filtering and Joining streams
–Grouping and Ordering output
–For routing events between stages and between clusters
–Event Mutation
–Correlation
–Patterns
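The difference between tumbling and rolling windows mentioned above can be shown with a short sketch. This is plain Python rather than an SQL-like rule language; the function names and the (timestamp, amount) event shape are assumptions for the example.

```python
from collections import deque, defaultdict


def tumbling_sums(events, width):
    """Sum amounts in fixed, non-overlapping windows.

    events: iterable of (timestamp, amount); width must be positive.
    Returns {window_start: sum}.
    """
    sums = defaultdict(float)
    for ts, amount in events:
        sums[(ts // width) * width] += amount
    return dict(sums)


def rolling_sums(events, width):
    """For each event (in time order), sum amounts with timestamp in (ts - width, ts]."""
    window = deque()
    total = 0.0
    out = []
    for ts, amount in events:
        window.append((ts, amount))
        total += amount
        # Drop events that have slid out of the window.
        while window[0][0] <= ts - width:
            _, old_amount = window.popleft()
            total -= old_amount
        out.append((ts, total))
    return out
```

A tumbling window emits one result per fixed interval, while a rolling window yields an up-to-date aggregate at every event, which is the form a per-transaction fraud check would typically consume.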
60
• Fraud Detection
Basics
Fraud Detection
•Rolling window aggregation over long time windows
(hours or days)
•Session store scaling to 1 million insert/update per sec
•Dynamic Joins with graphs and RDBMS tables
•Auto scaling based on load sensing
•Hot deployment of Java source code
61
• Fraud Detection
Basics
• Outlier Detection
• detecting data points that don’t follow the trends and
patterns in the data
• rule-based detection
• anomaly detection
• Two approaches for treating input
• focus on a single instance of a data point
• focus on a sequence of data points
• Three kinds of algorithms
• building a model out of data
• using data directly
• immune system based on temporal data
• Real time fraud detection
• feasible with model based approach
• A model is built with batch processing of training data
• A real time stream processor uses the model and
makes predictions in real time
Fraud Detection
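The split between batch model building and real-time prediction can be sketched as below. The model here is deliberately trivial (mean and standard deviation of transaction amounts with a z-score threshold); it stands in for whatever model the batch job would really train, and the function names and the threshold of 3.0 are assumptions.

```python
import statistics


def fit_batch(amounts):
    """Batch step: summarize training amounts as a mean and standard deviation."""
    return {"mean": statistics.fmean(amounts), "std": statistics.pstdev(amounts)}


def score_stream(model, stream, threshold=3.0):
    """Streaming step: yield (amount, is_outlier) using the pre-built model."""
    for amount in stream:
        if model["std"] > 0:
            z = abs(amount - model["mean"]) / model["std"]
        else:
            z = 0.0
        yield amount, z > threshold
```

The stream processor only evaluates the model, which is cheap; retraining happens in the batch job and the new model is swapped in afterwards.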
62
•
Economy Imperative
• Not worth spending $200m to stop $20m fraud
• The Pareto principle
• the first 50% of fraud is easy to stop
• next 25% takes the same effort
• next 12.5% takes the same effort
• Resources available for fraud detection are always limited
• around 3% of police resources go on fraud ?
• this will not significantly increase
• If we cannot outspend the fraudsters we must out-think them
Fraud Detection
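One way to read the 50% / 25% / 12.5% list above is as a geometric series in which every equal unit of effort halves the remaining fraud. That reading is an interpretation, not a statement from the slides, and the function names are hypothetical.

```python
def fraud_stopped(units_of_effort):
    """Cumulative fraction of fraud stopped after n equal units of effort."""
    return 1 - 0.5 ** units_of_effort


def marginal_gain(unit):
    """Additional fraction of fraud stopped by the n-th unit of effort."""
    return 0.5 ** unit
```

Under this assumption the third unit of effort costs as much as the first but stops only a quarter as much fraud, which is why spending alone cannot close the gap.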
63
•
Types of Anomaly
Fraud Detection
64
•Fraud Detection
AIS are adaptive systems inspired by theoretical immunology and
observed immune functions, principles and models, which
are applied to complex problem domains
•Immune system needs to be able to differentiate between
self and non-self cells
•may result in cell death therefore
• Some kind of positive selection(Clonal Selection)
• Some kind of negative selection
Artificial
Immune Systems
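The negative selection idea named above can be sketched as follows: random candidate detectors are kept only if they do not match any "self" sample, and anything a surviving detector matches is treated as non-self. This is a minimal real-valued variant under stated assumptions: points live in the unit hypercube, matching means Euclidean distance within a fixed radius, and all function and parameter names are hypothetical.

```python
import math
import random


def generate_detectors(self_samples, n_detectors, radius, dim, rng, max_tries=10000):
    """Negative selection: keep random candidates that match no self sample."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = [rng.random() for _ in range(dim)]
        # A candidate that lies within `radius` of any self sample is discarded.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_nonself(sample, detectors, radius):
    """A sample is flagged as non-self if any detector matches it."""
    return any(math.dist(sample, d) <= radius for d in detectors)
```

By construction no detector matches a training self sample, so false alarms can only come from normal behavior that was missing from the self set.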
65
•Fraud Detection
A type of agranulocyte (non-granular white blood cell) that takes part in
immune function and accounts for about 30% of all white blood cells.
•T cell
•Helper T cell
•Killer (cytotoxic) T cell
•Suppressor T cell
•B cell
•Natural killer cell (NK cell)
Lymphocyte
66
•Fraud Detection
B cells are the lymphocytes that produce antibodies
B cell
67
•Fraud Detection
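T cells, or T lymphocytes, are one of the lymphocytes responsible for
antigen-specific adaptive immunity. They are called T cells after the thymus,
where they mature. About three quarters of all lymphocytes are T cells.
T cells are classified into naive T cells, which have not yet met an antigen,
effector T cells, which have matured after meeting an antigen (helper T cells,
cytotoxic T cells, natural killer T cells), and memory T cells.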
T cell
68
•Fraud Detection
each antibody can recognize
a single antigen
Antibody, Antigen
69
•Fraud Detection Biological Immune
System
70
• Danger Theory
•Proposed by Polly Matzinger, around 1995
•Traditional self/non-self theory doesn’t always match
observations
•Immune system always responds to non-self
•Immune system always tolerates self
•Antigen-presenting cell(APC):T-cell activation by APCs
•Danger theory relates innate and adaptive immune systems
•Tissues induce tolerance towards themselves
•Tissues protect themselves and select class of response
Fraud Detection
71
•
•Tissues induce tolerance by
•Lymphocytes receive 2 signals
•antigen/lymphocyte binding
•antigen is properly presented by APC
•Signal 1 WITHOUT signal 2 : lymphocyte death
•Tissues protect themselves
•Alarm Signals activate APCs
•Alarm signals come from
•Cells that die unnaturally
•Cells under stress
•APCs activate lymphocytes
•Tissues dictate response type
•Alarm signals may convey information
Danger Theory
Fraud Detection
72
• Danger Theory
Fraud Detection
73
•
Artificial Immune Systems
Fraud Detection
•Vectors
Ab = {Ab1, Ab2, ..., AbL}
Ag = {Ag1, Ag2, ..., AgL}
•Real-valued shape-space
•Integer shape-space
•Binary shape-space
•Symbolic shape-space
D = \sqrt{\sum_{i=1}^{L} (Ab_i - Ag_i)^2}
Artificial Immune
System
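The affinity measure above can be computed directly from the two attribute vectors. The sketch below shows the Euclidean distance from the slide for a real-valued shape-space and, as an added assumption not taken from the slide, the Hamming distance that is commonly used for a binary shape-space. Function names are hypothetical.

```python
import math


def euclidean_distance(ab, ag):
    """D = sqrt(sum_i (Ab_i - Ag_i)^2) for real-valued or integer shape-space."""
    if len(ab) != len(ag):
        raise ValueError("antibody and antigen must have the same length L")
    return math.sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def hamming_distance(ab, ag):
    """Number of differing positions, a common choice for binary shape-space."""
    if len(ab) != len(ag):
        raise ValueError("antibody and antigen must have the same length L")
    return sum(a != g for a, g in zip(ab, ag))
```

For example, euclidean_distance([0, 0], [3, 4]) evaluates to 5.0, and hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]) evaluates to 2.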
74
•Fraud Detection
Meta-Frameworks
Artificial Immune
System
75
•Fraud Detection Hybrid Immune
Learning
76
•Fraud Detection
In the natural immune system, all cells of the body are
categorized into two types, self and non-self. The
immune process is to detect non-self among the cells.
The Positive Selection Algorithm (PSA) is used to
perform non-self detection for recognizing
malicious executables.
Non-self Detection
Principle
77
•Fraud Detection Network Security
78
•Fraud Detection Network Security
Architecture of anomaly detection system.
79
•Fraud Detection Intrusion Detection
Systems
81
• Neural Stream Architecture
Solutions
Modeling
Fork
New data stream
Alert
82
• Neural Stream Architecture
Solutions
Fork
New data stream
Batch
Modeling
Online
Compute
Online
Learning
Convergence
Alert
83
• Neural Stream Architecture
Solutions
Agent
TimeReducer
Daily
Weekly
Monthly
TwoMonth
ThreeMonth
FourMonth
FiveMonth
SixMonth
SevenMonth
EightMonth
NineMonth
TenMonth
TwoWeek
ThreeWeek
TwoDay
ThreeDay
FourDay
FiveDay
SixDay
FourWeek
MetaParser
TimeStore
Long Transaction Memory
BlackList
SeccueCode
POSEntry
84
• Neural Stream Architecture
Solutions
Velocity
Volume
Variety
Veracity
Neural
Stream
Big Data
On-line Learning Neural Architecture
Machine Learning Platform
85
• Neural Stream FDS
Solutions
86
•Solutions
• Storage
• Hadoop
• HDFS: Distributed File System(DFS)
• MapReduce : parallel processing
• Algorithms
• on-line learning (Immune System and Genetic Algorithms)
• batch model
• direct data
• Stream
• Neural stream
• Decentralize decision process
• Cell base detection
• Network for Artificial Immune Systems
• Lambda architecture and Pulsar cannot use on-line learning
Neural Stream
87
• Classical rule-based
approach
• Always “too late”:
• New fraud pattern is “invented” by criminals
• Cardholders lose money and complain
• Banks investigate complaints and try to understand
the new pattern
• A new rule is implemented a few weeks later
• Expensive to build (knowledge intensive)
• Difficult to maintain:
• Many rules
• The situation changes dynamically, so rules frequently
have to be added, modified, or removed …
Solutions
88
•Solutions
• Every bank user gets a vector of parameters that describe his/her
behavior: an “average-behavior” profile
• The system constantly compares this “long-term” profile with the
recent behavior of the cardholder
• Transactions that do not fit into the bank user’s profile are flagged as
suspicious (or are blocked)
• Profiles are updated with every single transaction, so the system
constantly adapts to (slow and small) changes in the bank user’s behavior
A system based on
profiles
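A profile that is updated with every transaction can be sketched with an exponentially weighted mean and variance of the transaction amount. This is a one-parameter illustration, not the system described in the slides: the class name Profile, the smoothing factor alpha, and the three-standard-deviation threshold are all assumptions, and a real profile would hold a vector of many parameters.

```python
class Profile:
    """Per-user "average behavior" profile over transaction amounts (illustrative)."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha   # how quickly the profile follows new behavior (assumed)
        self.mean = None     # long-term average amount
        self.var = 0.0       # long-term variance of the amount

    def is_suspicious(self, amount, threshold=3.0):
        # Compare the new transaction with the long-term profile.
        if self.mean is None or self.var == 0.0:
            return False     # not enough history to judge
        return abs(amount - self.mean) > threshold * self.var ** 0.5

    def update(self, amount):
        # Exponentially weighted update, so slow and small changes are absorbed.
        if self.mean is None:
            self.mean = float(amount)
            return
        diff = amount - self.mean
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
```

In use, each transaction would first be checked with is_suspicious and then passed to update; whether flagged transactions should also update the profile is a design decision the sketch leaves open.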
89
•Solutions
Solve the problems
Q&A
Thanks

More Related Content

What's hot

Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data AnalyticsVijay Rao
 
Neo4j Aura Enterprise
Neo4j Aura EnterpriseNeo4j Aura Enterprise
Neo4j Aura EnterpriseNeo4j
 
Satyam open analytics nyc
Satyam open analytics nycSatyam open analytics nyc
Satyam open analytics nycOpen Analytics
 
ParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream Inc.
 
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...GetInData
 
Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise deteo
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]ercan5
 
Big data analytic market opportunity
Big data analytic market opportunityBig data analytic market opportunity
Big data analytic market opportunityStanley Wang
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best PracticesYellowfin
 
How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?Thanakrit Lersmethasakul
 
Big Data : Risks and Opportunities
Big Data : Risks and OpportunitiesBig Data : Risks and Opportunities
Big Data : Risks and OpportunitiesKenny Huang Ph.D.
 
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...DataWorks Summit
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Denodo
 
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...DATAVERSITY
 

What's hot (20)

Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
 
Neo4j Aura Enterprise
Neo4j Aura EnterpriseNeo4j Aura Enterprise
Neo4j Aura Enterprise
 
5 Big Data Use Cases for 2013
5 Big Data Use Cases for 20135 Big Data Use Cases for 2013
5 Big Data Use Cases for 2013
 
Satyam open analytics nyc
Satyam open analytics nycSatyam open analytics nyc
Satyam open analytics nyc
 
ParStream - Big Data for Business Users
ParStream - Big Data for Business UsersParStream - Big Data for Business Users
ParStream - Big Data for Business Users
 
Bigdata (1) converted
Bigdata (1) convertedBigdata (1) converted
Bigdata (1) converted
 
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
Understanding Big Data Analytics - solutions for growing businesses - Rafał M...
 
Importance of Big Data Analytics
Importance of Big Data AnalyticsImportance of Big Data Analytics
Importance of Big Data Analytics
 
Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise Deteo. Data science, Big Data expertise
Deteo. Data science, Big Data expertise
 
Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]Tiger graph 2021 corporate overview [read only]
Tiger graph 2021 corporate overview [read only]
 
Big data analytic market opportunity
Big data analytic market opportunityBig data analytic market opportunity
Big data analytic market opportunity
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Big Data and BI Best Practices
Big Data and BI Best PracticesBig Data and BI Best Practices
Big Data and BI Best Practices
 
How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?
 
BIG DATA and USE CASES
BIG DATA and USE CASESBIG DATA and USE CASES
BIG DATA and USE CASES
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Big Data : Risks and Opportunities
Big Data : Risks and OpportunitiesBig Data : Risks and Opportunities
Big Data : Risks and Opportunities
 
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud ...
 
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
Product Keynote: Denodo 8.0 - A Logical Data Fabric for the Intelligent Enter...
 
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
 

Viewers also liked

Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentationHernan Huwyler
 
ACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationScott Mongeau
 
Online Fraud Detection Using Big Data Analytics Webinar
Online Fraud Detection Using Big Data Analytics WebinarOnline Fraud Detection Using Big Data Analytics Webinar
Online Fraud Detection Using Big Data Analytics WebinarDatameer
 
Real-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsReal-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsChristian Gügi
 
Presentation on fraud prevention, detection & control
Presentation on fraud prevention, detection & controlPresentation on fraud prevention, detection & control
Presentation on fraud prevention, detection & controlDominic Sroda Korkoryi
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014Sri Ambati
 
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)Amazon Web Services
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopDataWorks Summit
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detectionkalpesh1908
 
Masters thesis - Fraud & Big Data
Masters thesis - Fraud & Big DataMasters thesis - Fraud & Big Data
Masters thesis - Fraud & Big DataStephanie Canovas
 
Architecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud DetectionArchitecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud Detectionhadooparchbook
 
Outlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopOutlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopPranab Ghosh
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learningwgyn
 
A visual approach to fraud detection and investigation - Giuseppe Francavilla
A visual approach to fraud detection and investigation - Giuseppe FrancavillaA visual approach to fraud detection and investigation - Giuseppe Francavilla
A visual approach to fraud detection and investigation - Giuseppe FrancavillaData Driven Innovation
 

Viewers also liked (20)

Big Data Application Architectures - Fraud Detection
Big Data Application Architectures - Fraud DetectionBig Data Application Architectures - Fraud Detection
Big Data Application Architectures - Fraud Detection
 
Fraud Detection presentation
Fraud Detection presentationFraud Detection presentation
Fraud Detection presentation
 
ACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and MitigationACFE Presentation on Analytics for Fraud Detection and Mitigation
ACFE Presentation on Analytics for Fraud Detection and Mitigation
 
Fraud Detection Architecture
Fraud Detection ArchitectureFraud Detection Architecture
Fraud Detection Architecture
 
Online Fraud Detection Using Big Data Analytics Webinar
Online Fraud Detection Using Big Data Analytics WebinarOnline Fraud Detection Using Big Data Analytics Webinar
Online Fraud Detection Using Big Data Analytics Webinar
 
Real-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment TransactionsReal-Time Fraud Detection in Payment Transactions
Real-Time Fraud Detection in Payment Transactions
 
Presentation on fraud prevention, detection & control
Presentation on fraud prevention, detection & controlPresentation on fraud prevention, detection & control
Presentation on fraud prevention, detection & control
 
PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014PayPal's Fraud Detection with Deep Learning in H2O World 2014
PayPal's Fraud Detection with Deep Learning in H2O World 2014
 
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
AWS re:Invent 2016: Fraud Detection with Amazon Machine Learning on AWS (FIN301)
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with Hadoop
 
Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
Fraud detection
Fraud detectionFraud detection
Fraud detection
 
What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Credit card fraud detection
Credit card fraud detectionCredit card fraud detection
Credit card fraud detection
 
Masters thesis - Fraud & Big Data
Masters thesis - Fraud & Big DataMasters thesis - Fraud & Big Data
Masters thesis - Fraud & Big Data
 
Architecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud DetectionArchitecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud Detection
 
Outlier and fraud detection using Hadoop
Outlier and fraud detection using HadoopOutlier and fraud detection using Hadoop
Outlier and fraud detection using Hadoop
 
Detecting fraud with Python and machine learning
Detecting fraud with Python and machine learningDetecting fraud with Python and machine learning
Detecting fraud with Python and machine learning
 
A visual approach to fraud detection and investigation - Giuseppe Francavilla
A visual approach to fraud detection and investigation - Giuseppe FrancavillaA visual approach to fraud detection and investigation - Giuseppe Francavilla
A visual approach to fraud detection and investigation - Giuseppe Francavilla
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 

Similar to Bigdata and Machine Learning based Real-Time Fraud Detection

Fraud prevention is better with TigerGraph inside
Fraud prevention is better with  TigerGraph insideFraud prevention is better with  TigerGraph inside
Fraud prevention is better with TigerGraph insideTigerGraph
 
Analytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital EnterpriseAnalytics Patterns for Your Digital Enterprise
Analytics Patterns for Your Digital EnterpriseSriskandarajah Suhothayan
 
WSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital EnterpriseWSO2Con USA 2017: Analytics Patterns for Your Digital Enterprise
WSO2Con USA 2017: Analytics Patterns for Your Digital EnterpriseWSO2
 
20160000 Cloud Discovery Event - Cloud Access Security Brokers






Bigdata and Machine Learning based Real-Time Fraud Detection

  • 1. Bigdata based Fraud Detection 김민경 daengky@naver.com 2015.04.09
  • 2. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
  • 3. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
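  • 4. Introduction: O2O Platform. Originally devised so that online companies could take over offline markets, it is now evolving into a platform where offline and online interact with each other.
  • 6. Introduction: Market Share. FinTech and O2O platforms are fighting a war to win customers and capture the market first by securing convenience.
  • 7. Introduction: Authentication. Simple payment with a single ID and password. Is it feasible to shop by having a public key certificate (공인인증서) issued? Is it feasible to shop by having an i-PIN issued?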
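  • 10. Introduction: Motive. Internet-only bank? • A bank that provides financial services such as deposit-taking, transfers, loans, and fund investment over the internet and mobile • Characteristics: operates with a low-cost structure without branches, offering lower fees and lower loan interest rates than commercial banks • Industrial capital allowed to hold equity stakes of 30% or more • Restriction on large business groups (61): groups with assets of 5 trillion won or more that are subject to cross-shareholding restrictions by the Fair Trade Commission, such as Samsung and Hyundai Motor
  • 11. Introduction: Motive • Inconvenient financial security devices and processes • Security card, OTP • Who is responsible? • The trend is for financial security to be handled autonomously • Expanded scope of liability for financial companies • Re-outsourcing of financial protection work prohibited, except when permitted by the Financial Services Commission • Punitive surcharge: up to 5 billion won • Stronger penalties: up to 10 years' imprisonment or a fine of up to 100 million won • Administrative fine (newly established): up to 50 million won for failing to meet the obligation to ensure safety • Mandatory reporting: the CISO reports the results of the monthly information security inspection
  • 12. Introduction: Motive. Financial Services Commission Chairman Shin Je-yoon urged the entire financial sector to finish building fraud detection systems (FDS) for financial security. "The importance of security is a prerequisite for pursuing measures to promote FinTech," he said, warning that "a service without information security will end up a house built on sand." On the FinTech promotion plan, he said, "We will support FinTech technology being naturally integrated into finance by reforming the offline-centered financial system," adding, "We will redesign the regulations for the electronic financial business sector."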
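  • 13. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
  • 14. Bigdata: Bigdata Ecosystem • Significance of big data: a large-scale parallel computing technology devised because data has become not only vast but also complex, making it hard to handle with traditional data processing • Big data processing technology: technology that efficiently processes such complex, vast data through parallel processing • Big data processing flow: collect → store → process → analyze → present, or collect → process → analyze → present → store • Significance of big data analysis: analysis technology that applies machine learning or probabilistic and statistical techniques to complex, vast data on top of large-scale parallel computing
  • 15. Bigdata: Open Source Bigdata Ecosystem • Query (NoSQL): Cassandra, HBase, MongoDB and more • Query (SQL): Hive, Stinger, Impala, Presto, Shark • Advanced Analytics: Hadoop, Spark, H2O • Real time: Storm, Samza, S4, Spark Streaming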
  • 28. Bigdata: Lambda Architecture • All data entering the system is dispatched to both the batch layer and the speed layer for processing. • The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) pre-computing the batch views. • The serving layer indexes the batch views so that they can be queried in a low-latency, ad-hoc way. • The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only. • Any incoming query can be answered by merging results from batch views and real-time views.
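A minimal Python sketch of the query path in the Lambda Architecture slide, assuming a toy workload that counts transactions per card. The class `LambdaCounter` and its method names are hypothetical and only illustrate how a query merges a batch view with a real-time view.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture: per-card transaction counts (illustrative only)."""

    def __init__(self):
        self.master = []                 # append-only master dataset (batch layer)
        self.batch_view = Counter()      # pre-computed view served with low latency
        self.realtime_view = Counter()   # speed layer: data not yet in the batch view

    def ingest(self, card_id):
        # Every incoming record is dispatched to both layers.
        self.master.append(card_id)
        self.realtime_view[card_id] += 1

    def run_batch(self):
        # Recompute the batch view from the whole master dataset, then drop
        # the real-time increments that the new batch view now covers.
        self.batch_view = Counter(self.master)
        self.realtime_view = Counter()

    def query(self, card_id):
        # Answer by merging the batch view and the real-time view.
        return self.batch_view[card_id] + self.realtime_view[card_id]


counter = LambdaCounter()
for card in ["a", "a", "b"]:
    counter.ingest(card)
counter.run_batch()
counter.ingest("a")
print(counter.query("a"))
```

In this sketch the real-time view is cleared whenever the batch view is rebuilt, which is a simplification: a production system would have to account for records that arrive while the batch job is running.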
  • 31. Bigdata: Seldon Infrastructure • Real-time layer: responsible for handling the live predictive API requests. • Storage layer: various types of storage used by other components. • Near-time/offline layer: components that run compute-intensive or non-realtime jobs. • Stats layer: components to monitor and analyze the running system.
  • 34. Bigdata: Pulsar Architecture. The Pulsar pipeline includes the following components: • Collector: ingests events through a REST endpoint • Sessionizer: sessionizes the events, maintaining the session state and generating marker events • Distributor: filters and mutates events for different consumers; acts as an event router • Metrics calculator: calculates metrics by various dimensions and persists them in the metrics store • Replay: replays the failed events on other stages • ConfigApp: configures dynamic provisioning for the whole pipeline
  • 35. Bigdata: Pulsar Architecture • Complex event processing: SQL on stream data • Custom sub-stream creation: filtering and mutation • In-memory aggregation: multidimensional counting
  • 39. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
  • 40. Machine Learning: What is it? It starts from data. • Machine + Learning • Teaching a machine (computer) how to learn from data. Data is therefore the textbook.
  • 41. Machine Learning: ML Types • Supervised learning: when the category of the data is known (categorized, labeled); e.g. spam mail • Unsupervised learning: when the category of the data is unknown but you want to find its patterns; e.g. SNS, Twitter • Semi-supervised learning: supervised plus unsupervised learning • Reinforcement learning: mistakes are fed back into learning • Evolutionary learning: GA, AIS • Meta learning: landmark of data for classifier
  • 42. Machine Learning: Lifecycle on Realtime • ML Modeling • ML Deploy • ML Optimizer • New Data • Decision Making • Alert • Anomaly • Store: Hadoop DFS / NoSQL / Hive
  • 45. Machine Learning: Finite State Automata (FSA). Since the tests can be grouped, a state can represent several tests being performed at the same time. For example, T34 means that T3 and T4 can be done simultaneously.
  • 47. Machine Learning: Hidden Markov, Sequence-Based Algorithm • Certain fraudulent activities may not be detectable with instance-based algorithms • When each transaction involves only a small amount of money, instance-based algorithms will fail to detect the fraud
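One way to score a whole sequence rather than a single transaction is the forward algorithm of a hidden Markov model. The sketch below is illustrative only: the two hidden states, the three amount buckets (0 = low, 1 = medium, 2 = high), and every probability are made-up assumptions, not values from the slides. A sequence that the cardholder's model finds very unlikely would be treated as suspicious.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under an HMM (forward algorithm)."""
    n_states = len(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical model of one cardholder's normal spending behaviour.
START = [0.8, 0.2]                       # initial state distribution
TRANS = [[0.9, 0.1], [0.3, 0.7]]         # state transition probabilities
EMIT = [[0.7, 0.25, 0.05],               # P(amount bucket | state 0)
        [0.2, 0.3, 0.5]]                 # P(amount bucket | state 1)

usual = [0, 0, 1, 0]      # mostly low amounts
unusual = [2, 2, 2, 2]    # a run of high amounts

print(forward_likelihood(usual, START, TRANS, EMIT))
print(forward_likelihood(unusual, START, TRANS, EMIT))
```

For long sequences the raw product underflows, so a practical version would work with log-probabilities or rescale `alpha` at each step.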
  • 50. Machine Learning: Neural Network, Single-Layer Feed-Forward Model
  • 51. Machine Learning: Anti-k Nearest Neighbor Outlier Detection
  • 52. Machine Learning: Comparison of Three Algorithms
  • 53. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
  • 54. Fraud Detection: Banking traffic. Transactions per day: 20,000,000. Log records per day: 200,000,000.
        Period      Total records     Transactions
        7 days      1,400,000,000      140,000,000
        21 days     4,200,000,000      420,000,000
        30 days     6,000,000,000      600,000,000
        60 days    12,000,000,000    1,200,000,000
        90 days    18,000,000,000    1,800,000,000
        120 days   24,000,000,000    2,400,000,000
        150 days   30,000,000,000    3,000,000,000
        180 days   36,000,000,000    3,600,000,000
        360 days   72,000,000,000    7,200,000,000
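The figures on the banking traffic slide are the two daily volumes multiplied by the number of days. A small Python check, with hypothetical constant and function names, reproduces them:

```python
DAILY_TRANSACTIONS = 20_000_000    # transactions per day, from the slide
DAILY_LOG_RECORDS = 200_000_000    # log records per day, from the slide


def traffic_volume(days):
    """Return (total log records, total transactions) accumulated over `days`."""
    return days * DAILY_LOG_RECORDS, days * DAILY_TRANSACTIONS


for days in (7, 30, 360):
    records, transactions = traffic_volume(days)
    print(f"{days:>3} days: {records:>14,} records, {transactions:>13,} transactions")
```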
  • 55. Fraud Detection: Card. Credit card data (70-80 variables per transaction): • Transaction ID • Transaction type • Date and time of transaction (to the nearest second) • Amount • Currency • Local currency amount • Merchant category • Card issuer ID • ATM ID • POS type • Cheque account prefix • Savings account prefix • Acquiring institution ID • Transaction authorisation code • Online authorisation performed • New card • Transaction exceeds floor limit • Number of times chip has been accessed • Merchant city name • Chip terminal capability • Chip card verification result
  • 56. Fraud Detection: Fraud Detection Basics. Speed is the key! Challenges: • many transactions (billions) • algorithms must be efficient • mixed variable types (generally not text or images) • large number of variables • incomprehensible variables, irrelevant variables • different misclassification costs • many ways of committing fraud • unbalanced class sizes (c. 0.1% of transactions fraudulent) • delay in labelling • mislabelled classes • random transaction arrival times • (reactive) population drift. Approach: • Maintain a sliding buffer of the last billion transactions in RAM (fast memory) • Organize the transactions in such a way that some queries can be executed very fast • Develop some clever algorithms that operate on this data structure. Will it work? Yes, it will. Yes, it does.
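The sliding-buffer idea can be sketched in Python as a bounded queue plus a per-card index, so that "recent transactions for this card" is answered without scanning the buffer. The class `SlidingBuffer` and its fields are hypothetical, and the capacity here is tiny; the slide talks about a buffer of a billion transactions.

```python
from collections import defaultdict, deque


class SlidingBuffer:
    """Keeps the last `capacity` transactions, indexed by card (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()                # (card_id, amount) in arrival order
        self.by_card = defaultdict(deque)    # card_id -> amounts in arrival order

    def add(self, card_id, amount):
        if len(self.buffer) == self.capacity:
            # Evict the globally oldest transaction; it is also the oldest
            # entry in that card's own queue.
            old_card, _ = self.buffer.popleft()
            self.by_card[old_card].popleft()
            if not self.by_card[old_card]:
                del self.by_card[old_card]
        self.buffer.append((card_id, amount))
        self.by_card[card_id].append(amount)

    def recent_amounts(self, card_id):
        # Lookup cost depends on this card's history, not on the buffer size.
        return list(self.by_card.get(card_id, ()))


buf = SlidingBuffer(capacity=3)
for card, amount in [("a", 10), ("b", 20), ("a", 30), ("a", 40)]:
    buf.add(card, amount)
print(buf.recent_amounts("a"))
```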
  • 57. Fraud Detection: Fraud Detection Basics. Challenge: real-time detection! • Monitor all POS/ATM transactions in real time • Detect unusual patterns and block compromised cards as quickly as possible • Ideally: block compromised cards before fraud is discovered! • A big question: can we do it? • Some numbers: • 3,000,000,000 transactions per year • up to 15,000,000 transactions per day • up to 400 transactions per second (peak hours) • 100,000,000 cards
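A back-of-the-envelope calculation in Python puts the quoted numbers side by side. The per-transaction time budget assumes a single sequential scorer, which is an assumption made here for illustration and not something the slide states.

```python
TX_PER_YEAR = 3_000_000_000        # from the slide
PEAK_TX_PER_DAY = 15_000_000       # from the slide
PEAK_TX_PER_SECOND = 400           # from the slide
SECONDS_PER_DAY = 24 * 60 * 60

avg_tx_per_day = TX_PER_YEAR / 365
avg_tx_per_second_on_peak_day = PEAK_TX_PER_DAY / SECONDS_PER_DAY
# Time available per transaction if one scorer handles the peak rate alone.
budget_ms_per_tx = 1000 / PEAK_TX_PER_SECOND

print(f"average per day:           {avg_tx_per_day:,.0f}")
print(f"average per second (peak): {avg_tx_per_second_on_peak_day:.1f}")
print(f"budget per transaction:    {budget_ms_per_tx} ms")
```

The peak-hour rate of 400 per second is a little more than twice the average rate on the busiest day, so capacity has to be planned for the peak rather than the mean.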
  • 58. Fraud Detection: Fraud Detection Basics • Self healing • Multi-datacenter failovers • State management • Shutdown orchestration • Dynamic partitioning • Elastic clusters • Dynamic flow routing • Dynamic topology changes
  • 59. Fraud Detection: Fraud Detection Basics • SQL-like language for specifying processing rules • Analysis over rolling and tumbling windows of time • Filtering and joining streams • Grouping and ordering output • Routing events between stages and between clusters • Event mutation • Correlation • Patterns
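The difference between tumbling and rolling windows can be shown with a short Python sketch that counts events per window. The function names and the sample timestamps are hypothetical; a stream processor would express the same thing in its SQL-like rule language.

```python
from collections import Counter, deque


def tumbling_counts(timestamps, width):
    """Count events in fixed, non-overlapping windows [k*width, (k+1)*width)."""
    counts = Counter(int(t // width) for t in timestamps)
    return dict(sorted(counts.items()))


def rolling_counts(timestamps, width):
    """For each event, count events in the trailing window (t - width, t]."""
    window = deque()
    result = []
    for t in sorted(timestamps):
        window.append(t)
        # Drop events that fell out of the trailing window.
        while window[0] <= t - width:
            window.popleft()
        result.append(len(window))
    return result


events = [0, 1, 5, 11, 12]    # event times in seconds (made-up data)
print(tumbling_counts(events, width=10))
print(rolling_counts(events, width=10))
```

A tumbling window emits one result per fixed interval, while a rolling window is re-evaluated as each event arrives, which suits velocity rules such as "more than N transactions in the last ten minutes".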
  • 60. Fraud Detection: Fraud Detection Basics • Rolling window aggregation over long time windows (hours or days) • Session store scaling to 1 million inserts/updates per second • Dynamic joins with graphs and RDBMS tables • Auto scaling based on load sensing • Hot deployment of Java source code
  • 61. Fraud Detection: Fraud Detection Basics • Outlier detection • detecting data points that don't follow the trends and patterns in the data • rule-based detection • anomaly detection • Two approaches for treating input • focus on an instance of a data point • focus on a sequence of data points • Three kinds of algorithms • building a model out of data • using data directly • immune system based on temporal data • Real-time fraud detection • feasible with a model-based approach • A model is built with batch processing of training data • A real-time stream processor uses the model and makes predictions in real time
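The model-based approach can be sketched in Python with the simplest possible model, a mean and standard deviation of transaction amounts. The z-score model, the threshold of 3, and the sample data are assumptions chosen for illustration; the slides do not name a specific model.

```python
import statistics


def fit_model(amounts):
    """Batch step: summarise training data as (mean, sample standard deviation)."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def is_outlier(model, amount, threshold=3.0):
    """Streaming step: flag an amount more than `threshold` deviations from the mean."""
    mean, std = model
    if std == 0:
        return amount != mean
    return abs(amount - mean) / std > threshold


# The model is built offline from historical data (made-up amounts).
model = fit_model([10, 12, 11, 9, 13, 10, 11, 12])

# The stream processor then only evaluates the model per event.
for amount in (11.5, 500):
    print(amount, is_outlier(model, amount))
```

Keeping the expensive fitting in the batch job and the cheap scoring in the stream is what makes the model-based approach feasible in real time.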
  • 62. Fraud Detection: Economy Imperative • Not worth spending $200m to stop $20m of fraud • The Pareto principle • the first 50% of fraud is easy to stop • the next 25% takes the same effort • the next 12.5% takes the same effort • Resources available for fraud detection are always limited • around 3% of police resources go on fraud? • this will not significantly increase • If we cannot outspend the fraudsters, we must out-think them
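Read as a geometric series, the 50%, 25%, 12.5% progression means that n equal units of effort stop a fraction 1 - (1/2)^n of fraud. That reading is an interpretation of the slide, and the Python function names below are hypothetical.

```python
import math


def stopped_fraction(units):
    """Fraction of fraud stopped after `units` equal units of effort."""
    return 1 - 0.5 ** units


def units_needed(target):
    """Smallest whole number of effort units that stops at least `target` of fraud."""
    return math.ceil(math.log2(1 / (1 - target)))


print([stopped_fraction(n) for n in (1, 2, 3)])
print(units_needed(0.99))
```

Each extra unit of effort buys half as much as the previous one, which is why the slide argues for out-thinking fraudsters rather than outspending them.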
  • 64. Fraud Detection: Artificial Immune Systems. AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to complex problem domains. • The immune system needs to be able to differentiate between self and non-self cells • This may result in cell death, therefore there must be: • some kind of positive selection (clonal selection) • some kind of negative selection
  • 65. Fraud Detection: Lymphocyte. A type of agranulocyte (nongranular white blood cell) involved in immune function; lymphocytes account for 30% of all white blood cells. • T cell • Helper T cell • Cytotoxic (killer) T cell • Suppressor T cell • B cell • Natural killer (NK) cell
  • 66. Fraud Detection: B cell. B cells are the lymphocytes that produce antibodies.
  • 67. Fraud Detection: T cell. T cells, or T lymphocytes, are one of the lymphocytes that direct antigen-specific adaptive immunity. They take their name from the first letter of the thymus, where they mature. About three quarters of all lymphocytes are T cells. T cells are classified into naive T cells that have not yet encountered an antigen, effector T cells that have matured after encountering an antigen (helper T cells, cytotoxic T cells, natural killer T cells), and memory T cells.
  • 68. Fraud Detection: Antibody, Antigen. Each antibody can recognize a single antigen.
  • 70. Fraud Detection: Danger Theory • Proposed by Polly Matzinger, around 1995 • Traditional self/non-self theory doesn't always match observations • Immune system always responds to non-self • Immune system always tolerates self • Antigen-presenting cell (APC): T-cell activation by APCs • Danger theory relates innate and adaptive immune systems • Tissues induce tolerance towards themselves • Tissues protect themselves and select class of response
  • 71. Fraud Detection: Danger Theory • Tissues induce tolerance • Lymphocytes receive 2 signals • signal 1: antigen/lymphocyte binding • signal 2: antigen is properly presented by APC • Signal 1 WITHOUT signal 2: lymphocyte death • Tissues protect themselves • Alarm signals activate APCs • Alarm signals come from • cells that die unnaturally • cells under stress • APCs activate lymphocytes • Tissues dictate response type • Alarm signals may convey information
  • 73. Fraud Detection: Artificial Immune Systems • Vectors: Ab = {Ab_1, Ab_2, ..., Ab_L}, Ag = {Ag_1, Ag_2, ..., Ag_L} • Real-valued shape-space • Integer shape-space • Binary shape-space • Symbolic shape-space • Distance: $D = \sqrt{\sum_{i=1}^{L} (Ab_i - Ag_i)^2}$
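The distance between an antibody vector Ab and an antigen vector Ag of length L can be computed directly. The Python sketch below shows the Euclidean distance for a real-valued shape-space and, as a commonly used counterpart for a binary shape-space, the Hamming distance; the Hamming variant and the function names are additions for illustration and are not taken from the slide.

```python
import math


def euclidean_distance(ab, ag):
    """D = sqrt(sum_i (Ab_i - Ag_i)^2) for real-valued or integer shape-spaces."""
    if len(ab) != len(ag):
        raise ValueError("antibody and antigen must have the same length L")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ab, ag)))


def hamming_distance(ab, ag):
    """Number of differing positions, a usual choice for binary shape-spaces."""
    if len(ab) != len(ag):
        raise ValueError("antibody and antigen must have the same length L")
    return sum(x != y for x, y in zip(ab, ag))


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
print(hamming_distance([1, 0, 1], [1, 1, 0]))
```

A smaller distance means a closer match between the two vectors.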
  • 75. Fraud Detection: Hybrid Immune Learning
  • 76. Fraud Detection: Non-self Detection Principle. In the natural immune system, all cells of the body are categorized as one of two types, self and non-self. The immune process is to detect the non-self cells. The Positive Selection Algorithm (PSA) is used to perform non-self detection for recognizing malicious executables.
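A minimal Python sketch of positive-selection style detection, under the assumption that "self" is a set of known-good bit patterns and that a sample counts as self when it lies within a Hamming radius of any of them. The patterns, the radius, and the function names are hypothetical and do not reproduce the PSA referenced on the slide.

```python
def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def is_non_self(sample, self_patterns, radius=1):
    """Flag the sample as non-self if no self pattern matches within `radius`."""
    return all(hamming(sample, s) > radius for s in self_patterns)


# Made-up "self" set, e.g. encoded features of known-good behaviour.
SELF_PATTERNS = ["110010", "110011", "100010"]

print(is_non_self("110110", SELF_PATTERNS))   # one bit away from a self pattern
print(is_non_self("001101", SELF_PATTERNS))   # far from every self pattern
```

Negative selection works the other way round: it generates detectors that match nothing in the self set and raises an alert when one of those detectors matches a sample.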
  • 78. Fraud Detection: Network Security. Architecture of an anomaly detection system.
  • 79. Fraud Detection: Intrusion Detection Systems
  • 80. Outline: Introduction • Bigdata • Machine Learning • Fraud Detection • Solutions
  • 81. Solutions: Neural Stream Architecture • New data stream • Fork • Modeling • Alert
  • 82. Solutions: Neural Stream Architecture • New data stream • Fork • Batch Modeling • Online Compute • Online Learning • Convergence • Alert
  • 83. Solutions: Neural Stream Architecture • Agent • MetaParser • TimeReducer: TwoDay, ThreeDay, FourDay, FiveDay, SixDay, Daily, Weekly, TwoWeek, ThreeWeek, FourWeek, Monthly, TwoMonth, ThreeMonth, FourMonth, FiveMonth, SixMonth, SevenMonth, EightMonth, NineMonth, TenMonth • TimeStore • Long Transaction Memory • BlackList • SecureCode • POSEntry
  • 84. Solutions: Neural Stream Architecture • Velocity • Volume • Variety • Veracity • Neural Stream • Big Data • On-line Learning • Neural Architecture • Machine Learning Platform
  • 85. Solutions: Neural Stream FDS
  • 86. Solutions: Neural Stream • Storage • Hadoop • HDFS: Distributed File System (DFS) • MapReduce: parallel processing • Algorithms • on-line learning (Immune System and Genetic Algorithms) • batch model • direct data • Stream • Neural Stream • Decentralized decision process • Cell-based detection • Network for Artificial Immune Systems • Lambda architecture and Pulsar can't use on-line learning
  • 87. Solutions: Classical rule-based approach • Always "too late": • A new fraud pattern is "invented" by criminals • Cardholders lose money and complain • Banks investigate complaints and try to understand the new pattern • A new rule is implemented a few weeks later • Expensive to build (knowledge intensive) • Difficult to maintain: • Many rules • The situation changes dynamically, so rules frequently have to be added, modified, or removed
  • 88. Solutions: A system based on profiles • Every bank user gets a vector of parameters that describe his/her behavior: an "average-behavior" profile • The system constantly compares this "long-term" profile with the recent behavior of the cardholder • Transactions that do not fit into the bank user's profile are flagged as suspicious (or are blocked) • Profiles are updated with every single transaction, so the system constantly adapts to (slow and small) changes in the bank user's behavior
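A profile-based check can be sketched in Python with a one-parameter profile: an exponentially weighted mean and variance of the transaction amount. A real profile would be a vector of many such parameters; the class `BehaviorProfile`, the smoothing factor, and the threshold are all assumptions made for illustration.

```python
import math


class BehaviorProfile:
    """Long-term "average behavior" of one user, updated on every transaction."""

    def __init__(self, alpha=0.1, threshold=3.0):
        self.alpha = alpha            # how quickly the profile adapts
        self.threshold = threshold    # deviations from the mean that count as suspicious
        self.mean = None
        self.var = 0.0

    def is_suspicious(self, amount):
        # Without enough history there is nothing to compare against.
        if self.mean is None or self.var == 0.0:
            return False
        return abs(amount - self.mean) / math.sqrt(self.var) > self.threshold

    def update(self, amount):
        # Exponentially weighted moving mean and variance.
        if self.mean is None:
            self.mean = float(amount)
            return
        diff = amount - self.mean
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1 - self.alpha) * (self.var + diff * incr)


profile = BehaviorProfile()
for i in range(20):
    profile.update(10 if i % 2 == 0 else 12)   # made-up routine spending

print(profile.is_suspicious(11))
print(profile.is_suspicious(1000))
```

Because old observations decay geometrically, the profile follows slow drift in behavior, while a sudden jump still stands out against the recent variance. In practice a flagged transaction would be scored before the profile is updated with it.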