When Your SUM/COUNT queries are not enough any more
Tarmo Vikat – SR IT Service Engineer (fraud related area)
Nov 2016
Agenda
Skype Fraud 101
[Diagram: Alex and Bob; price labels 0.10 € and 0.06 €]
How do we battle the situation?
Find suspicious elements
Investigate
Find linked elements
Block the fraudsters
Data points
Account Order Service
Device ID
Credit Cards
Time and Date
Login Data
Payment Information
IP address
Call Destination
Billing Address
Shopper IP
Client IP
Order IP
Linked Accounts
Issuer Country
E-mail
Caller ID
Usage Summary
Profile Information
Contact List
What kind of data do we use?
Statistics for call by IP
IP             Events  Cost   Duration  First                Last                 Users/Blocked users
194.101.26.28  10 500  125,7  10:28:56  2016-09-13 00:15:08  2016-10-11 10:20:00  1178 / 101

Linked users by call IP
Username          IP             Events  Cost  Duration  First                Last
fraudulent_user1  27.11.105.23   273     10,8  02:15:00  2016-09-13 00:15:08  2016-10-11 10:20:00
fraudulent_user1  194.101.26.28  15      2,7   00:45:00  2015-12-01 00:00:08  2015-12-13 00:15:00
So what’s the problem?
Big DBs
Batch Jobs SQL
User interface
IP             User Creations  Callers
194.101.26.28  160/50          35/10
Plain SQL vs aggregated data
Why not use DWH or materialized views?
Bob
Powerful DWH
USER
IP
PSTN
PostgreSQL materialized views
Aggregation types
Whole history
• Total stats over the whole history
• Never expires
• Long-term storage only for non-personalised data
X days back from the current moment
• Stats for the last X days/hours in real time (e.g. 30 days)
• Expiration (deaggregation) of old events
• Ability to implement velocity checks / data flow rules
Time series
• Hourly/daily statistics
• Destruction of old data
• Ability to monitor data trends
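The time-series type can be illustrated with a minimal sketch, assuming events arrive as (timestamp, price) pairs. The function names `hour_bucket`, `aggregate_hourly` and `destroy_old` are hypothetical and not taken from the talk; they only show hourly bucketing and the destruction of old buckets.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_hourly(events):
    """Sum count and price per hourly bucket (hypothetical helper)."""
    buckets = defaultdict(lambda: {"count": 0, "price": 0.0})
    for ts, price in events:
        bucket = buckets[hour_bucket(ts)]
        bucket["count"] += 1
        bucket["price"] += price
    return dict(buckets)


def destroy_old(buckets, now, keep):
    """Drop buckets that start before now - keep."""
    cutoff = now - keep
    return {start: b for start, b in buckets.items() if start >= cutoff}
```

Because each bucket is independent, old data can be dropped without touching the totals of newer buckets, which is what makes trend monitoring cheap in this layout.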
Data structures
LINK
aggr_object
linked_object
count
price
duration
first_event
last_event
PK: aggr_object, linked_object
IDX: linked_object
STATS
aggr_object
count
price
duration
first_event
last_event
linked_object_count
blocked_object_count
PK: aggr_object
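As a rough sketch, the two structures could be declared as follows. SQLite (via Python's standard `sqlite3` module) stands in for PostgreSQL here only so the example is self-contained; the column types, the defaults and the index name `link_linked_object_idx` are assumptions, since the slide lists only the column names and keys.

```python
import sqlite3

# Column names follow the slide; types and defaults are assumed.
SCHEMA = """
CREATE TABLE link (
    aggr_object   TEXT NOT NULL,
    linked_object TEXT NOT NULL,
    "count"       INTEGER NOT NULL DEFAULT 0,
    price         REAL NOT NULL DEFAULT 0,
    duration      INTEGER NOT NULL DEFAULT 0,  -- seconds (assumed unit)
    first_event   TEXT,
    last_event    TEXT,
    PRIMARY KEY (aggr_object, linked_object)
);
CREATE INDEX link_linked_object_idx ON link (linked_object);

CREATE TABLE stats (
    aggr_object          TEXT PRIMARY KEY,
    "count"              INTEGER NOT NULL DEFAULT 0,
    price                REAL NOT NULL DEFAULT 0,
    duration             INTEGER NOT NULL DEFAULT 0,
    first_event          TEXT,
    last_event           TEXT,
    linked_object_count  INTEGER NOT NULL DEFAULT 0,
    blocked_object_count INTEGER NOT NULL DEFAULT 0
);
"""


def create_schema(conn):
    """Create the LINK and STATS tables in the given connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

The index on `linked_object` lets the LINK table be searched from either side, for example all IPs of a user as well as all users of an IP.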
Basics of PostgreSQL based architecture
½ Data ½ Data
Basics of aggregation (IP based)
CALL (raw data)
User       IP            Duration  Cost
testuser1  10.12.35.103  00:10:00  0.27
testuser2  10.12.35.103  00:30:00  1.00
testuser2  20.22.15.100  00:20     2.00

LINK
Object        User       Count
10.12.35.103  testuser1  1
10.12.35.103  testuser2  1
20.22.15.100  testuser2  1

STATS
Object        User count
10.12.35.103  2

STATS
Object        User count
20.22.15.100  1
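The step from raw calls to LINK and STATS can be reproduced in a few lines. This is a minimal in-memory sketch using the sample rows from the slide; the function name `aggregate_by_ip` and the use of plain dictionaries instead of tables are assumptions made for illustration.

```python
from collections import defaultdict

# Sample rows from the slide: (user, ip, cost).
CALLS = [
    ("testuser1", "10.12.35.103", 0.27),
    ("testuser2", "10.12.35.103", 1.00),
    ("testuser2", "20.22.15.100", 2.00),
]


def aggregate_by_ip(calls):
    """Build LINK (ip, user) -> call count and STATS ip -> user count."""
    link = defaultdict(int)
    stats = defaultdict(int)
    for user, ip, _cost in calls:
        if (ip, user) not in link:
            # First call of this user from this IP: one more linked user.
            stats[ip] += 1
        link[(ip, user)] += 1
    return dict(link), dict(stats)


link, stats = aggregate_by_ip(CALLS)
```

The user count in STATS changes only when a new LINK row appears, so it never has to be recomputed with a COUNT(DISTINCT ...) over the raw data.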
Basics of queueing (pgq component)
PostgreSQL
Insert
Update
Delete
EVENT QUEUE
Event type (ins/upd/del)
Event data
Select Script
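The flow on this slide (data changes become typed events that a script later reads) can be sketched with a plain in-memory queue. This is not the PgQ API; the class `EventQueue` and its methods are hypothetical and only mimic the producer/consumer pattern.

```python
from collections import deque


class EventQueue:
    """Toy stand-in for an event queue; not the real pgq interface."""

    EVENT_TYPES = ("ins", "upd", "del")

    def __init__(self):
        self._events = deque()

    def push(self, ev_type, data):
        """Record one change event with its type and row data."""
        if ev_type not in self.EVENT_TYPES:
            raise ValueError(f"unknown event type: {ev_type}")
        self._events.append((ev_type, data))

    def next_batch(self, max_events=100):
        """Remove and return up to max_events events, oldest first."""
        batch = []
        while self._events and len(batch) < max_events:
            batch.append(self._events.popleft())
        return batch
```

A consumer script would call `next_batch()` in a loop and hand each event to the aggregator, keeping the aggregation work out of the transaction that wrote the raw row.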
Schematics – whole history aggregation / time series
QUEUE
Aggregator
Worker 1
Config
STATS
LINK
STATS
Aggregator
Worker 2
Config
QUEUE
LINK
Dynamic configuration
CALL
User       IP            Length    Cost
testuser1  10.12.35.103  00:10:00  0.27
testuser2  10.12.35.103  00:30:00  1.00
testuser2  20.22.15.100  00:20     2.00

STATS
Object  Linked object  Duration  Price
user    ip             length    cost
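The mapping in the STATS row above reads as a configuration: each aggregation role is bound to a column of the source table. A minimal sketch, assuming the configuration is a simple role-to-column dictionary (the names `USER_CONFIG`, `IP_CONFIG` and `map_event` are hypothetical):

```python
# Role -> source column, as in the slide's STATS mapping.
USER_CONFIG = {
    "object": "user",
    "linked_object": "ip",
    "duration": "length",
    "price": "cost",
}

# Swapping the first two roles aggregates the same events by IP instead.
IP_CONFIG = {**USER_CONFIG, "object": "ip", "linked_object": "user"}


def map_event(row, config):
    """Translate a raw row into aggregation roles using the config."""
    return {role: row[column] for role, column in config.items()}


row = {"user": "testuser1", "ip": "10.12.35.103", "length": "00:10:00", "cost": 0.27}
by_user = map_event(row, USER_CONFIG)
by_ip = map_event(row, IP_CONFIG)
```

With this shape, a new aggregation is added by writing a new configuration entry rather than new worker code.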
Schematics – sliding window aggregation
TRACKER (aggr fields, date)
Aggregator worker
STATS, LINK
Data destruction worker
QUEUE
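A sliding window needs a record of what was added and when, so that old events can later be subtracted again. The sketch below assumes the tracker holds (date, object, price) entries in arrival order and that events arrive in time order; the class `SlidingWindowStats` is hypothetical and keeps everything in memory.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class SlidingWindowStats:
    """Stats over the last `window` of time, with deaggregation."""

    def __init__(self, window):
        self.window = window
        self.tracker = deque()  # (timestamp, aggr_object, price), oldest first
        self.stats = defaultdict(lambda: {"count": 0, "price": 0.0})

    def add(self, ts, aggr_object, price):
        """Aggregator worker: add the event and remember it in the tracker."""
        self.tracker.append((ts, aggr_object, price))
        entry = self.stats[aggr_object]
        entry["count"] += 1
        entry["price"] += price

    def expire(self, now):
        """Destruction worker: subtract events older than now - window."""
        cutoff = now - self.window
        while self.tracker and self.tracker[0][0] < cutoff:
            _ts, aggr_object, price = self.tracker.popleft()
            entry = self.stats[aggr_object]
            entry["count"] -= 1
            entry["price"] -= price
            if entry["count"] == 0:
                del self.stats[aggr_object]
```

Velocity checks then become lookups in `stats`, for example "more than N calls from this IP in the last 30 days".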
Simplified examples of used functions
new_user_count = 1;
user_count = user_count + <new_user_count>
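The increment above can be sketched at the database level: a new LINK row means `new_user_count = 1`, otherwise it is 0, and STATS is updated by that amount. SQLite stands in for PostgreSQL so the example runs on its own; the reduced table layout, the column name `event_count` and the function `record_call` are assumptions.

```python
import sqlite3


def make_db():
    """Create a reduced LINK/STATS schema (assumed, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE link (aggr_object TEXT, linked_object TEXT,
                           event_count INTEGER,
                           PRIMARY KEY (aggr_object, linked_object));
        CREATE TABLE stats (aggr_object TEXT PRIMARY KEY,
                            event_count INTEGER, user_count INTEGER);
    """)
    return conn


def record_call(conn, ip, user):
    """Apply one call event to LINK and STATS incrementally."""
    cur = conn.execute(
        "UPDATE link SET event_count = event_count + 1 "
        "WHERE aggr_object = ? AND linked_object = ?", (ip, user))
    new_user_count = 0
    if cur.rowcount == 0:
        # No LINK row yet: this user is new for this IP.
        conn.execute("INSERT INTO link VALUES (?, ?, 1)", (ip, user))
        new_user_count = 1
    cur = conn.execute(
        "UPDATE stats SET event_count = event_count + 1, "
        "user_count = user_count + ? WHERE aggr_object = ?",
        (new_user_count, ip))
    if cur.rowcount == 0:
        conn.execute("INSERT INTO stats VALUES (?, 1, ?)", (ip, new_user_count))
```

A production version would also need to handle concurrent workers touching the same row, which this single-connection sketch ignores.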
Is it really that simple?
Poor man’s SQL streaming (users who bought same product)
PURCHASE_DETAIL
Purchase_ID  Product_name
1            Call packet 60 min
2            100 x SMS

QUEUE

PURCHASE
Username   IP            Cost
Testuser1  10.12.35.103  10.50

DENORMALISED DATA
Product_name        Username   IP            Cost
Call packet 60 min  Testuser1  10.12.35.103  10.50
100 x SMS           Testuser1  10.12.35.103  10.50
Aggregator
worker
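The denormalisation step can be sketched as a lookup join performed per queued event. The slide does not show which key ties a detail row to its purchase, so the `order_id` field below is a hypothetical join key, as are the names `PURCHASES`, `DETAIL_EVENTS` and `denormalise`.

```python
# Parent rows keyed by a hypothetical order_id (not shown on the slide).
PURCHASES = {
    100: {"username": "Testuser1", "ip": "10.12.35.103", "cost": 10.50},
}

# Queued detail events, each pointing at its parent purchase.
DETAIL_EVENTS = [
    {"order_id": 100, "product_name": "Call packet 60 min"},
    {"order_id": 100, "product_name": "100 x SMS"},
]


def denormalise(detail_events, purchases):
    """Join each detail event with its purchase into one flat row."""
    rows = []
    for event in detail_events:
        purchase = purchases.get(event["order_id"])
        if purchase is None:
            # Parent row not available; a real worker might retry later.
            continue
        rows.append({"product_name": event["product_name"], **purchase})
    return rows


rows = denormalise(DETAIL_EVENTS, PURCHASES)
```

Feeding these flat rows to the aggregator with the product name as the aggregated object and the username as the linked object would give "users who bought the same product".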
Poor man’s SQL streaming
Table 1
Table 2
De-normalised
Some caveats
Some numbers about performance
USER
IP
PSTN
Use cases
[Chart: Phone number fraudulency; series Phone1, Phone2, Phone3; value axis 0 to 80]
Q&A
Tarmo Vikat
tavikat@microsoft.com
Skype: piruelain

PgConf_2016_EU.pptx

Editor's Notes

  1. Fraudsters are working fast, we have to be faster.
  3. Key data points that are analyzed in the first place. Data is in real time. Orders, calls, and SMS are visible thanks to the Engineering team's work.