Tech view on Regulatory Compliance
MarkLogic User Group Benelux Meetup December 2016
Speaker: Alexander L. de Goeij
About me
• Architect / Consultant
• Financial Services: Core Trading
• Regulations: EMIR, MiFID II
• Architecture: Enterprise / Solution / Project Architect
• Consulting: IT Strategy, implementations, vendor selection, etc.
• Business degree, Tech addiction.
“Regulations really make my life more fun!”
As said by no-one, ever.
“Regulations really make my life more exciting!”
As said by everyone who gets to use cool databases!
The challenge we think we are facing:
[Diagram: Source Data → Extract, Transform, Load → Some Application → Extract, Load, Send → Happy Regulator]
The actual challenge we are facing:
[Diagram: Source Data is loaded into and extracted from a chain of systems (DB 1, Tool 2, up to Thing N, plus a "database you didn’t know still existed"), exchanged via Email, FTP, REST and SOAP, before reaching several Happy Regulators]
Current solution:
Doesn’t work anymore:
• Auditability / Process checks included in Regulations.
• Obligation to re-report.
• More complex Ad-Hoc requests from the Regulator.
• Not suited for Real-Time reporting.
• Waste of money…
What do we need?
• Auditability: keep original data in original format to prove results, keep track of ‘who-did-what’ with the data.
• Consistency: real-time requirement from regulator demands more than eventual consistency.
• Forward Flexibility: we know we don’t know what we will have to report tomorrow.
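The auditability requirement can be illustrated with a minimal in-memory sketch. It assumes a hypothetical `AuditStore` class (not part of any product discussed in this deck) that keeps the original bytes untouched, fingerprints them with SHA-256, and appends a who-did-what entry for every action. The sample trade XML is made up for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    user: str
    action: str
    doc_id: str
    at: datetime


@dataclass
class AuditStore:
    """Hypothetical store: raw payloads plus an append-only audit trail."""
    _raw: dict = field(default_factory=dict)
    _trail: list = field(default_factory=list)

    def ingest(self, user: str, payload: bytes) -> str:
        # The document id is the SHA-256 of the original bytes, so any later
        # change to the stored payload can be detected.
        doc_id = hashlib.sha256(payload).hexdigest()
        self._raw.setdefault(doc_id, payload)
        self._log(user, "ingest", doc_id)
        return doc_id

    def read(self, user: str, doc_id: str) -> bytes:
        self._log(user, "read", doc_id)
        return self._raw[doc_id]

    def verify(self, doc_id: str) -> bool:
        # Recompute the hash to show the payload is still the original.
        return hashlib.sha256(self._raw[doc_id]).hexdigest() == doc_id

    def trail(self, doc_id: str) -> list:
        return [e for e in self._trail if e.doc_id == doc_id]

    def _log(self, user: str, action: str, doc_id: str) -> None:
        entry = AuditEntry(user, action, doc_id, datetime.now(timezone.utc))
        self._trail.append(entry)


store = AuditStore()
trade_xml = b"<trade id='T1'><notional ccy='EUR'>1000000</notional></trade>"
doc_id = store.ingest("alice", trade_xml)
original = store.read("bob", doc_id)
```

A real system would persist both the payloads and the trail durably; the point here is only that the source document is never transformed in place and every access leaves a trace.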
Looking to technology for a better answer!
Your favorite RDBMS
• ACID, consistent, and blazing fast if you buy Exadata
• Normalize your way out, and fail.
• Not fit for processing/reporting across different data objects: e.g. Trades and Mortgages
• Try to do NoSQL with SQL (innovative, but terribly slow and impossible to maintain)
Example of what not to do:
[Figure: SQL examples]
MongoDB
• Free! Open Source! GridFS!
• Have to transform data on ingest (to JSON) as most data is XML
• Eventual consistency (AKA data loss) means not real-time.
• Good at homogeneous data.
• Still master-slave, and scaling issues
• Brilliant for RAD / prototyping!
Where things go wrong:
Source: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
Cassandra (DataStax)
• Favors data duplication over normalization
• Very fast (if you duplicate well) but does not do JOINs
• Used by ING as main component of their Risk grid (YouTube)
• Excellent for time series data
Source: https://academy.datastax.com/resources/getting-started-time-series-data-modeling
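The "duplication over normalization" and time series points can be sketched in plain Python. This is not the Cassandra driver API: `TimeSeriesTable`, `partition_key` and `record_trade` are hypothetical names, and the instruments, desks and prices are made up. The sketch assumes one table per query pattern, with rows bucketed by key and calendar day and kept sorted by timestamp inside each bucket.

```python
from collections import defaultdict
from datetime import datetime


def partition_key(key: str, ts: datetime) -> tuple:
    # Bucket by key and calendar day so a partition stays bounded in size.
    return (key, ts.strftime("%Y-%m-%d"))


class TimeSeriesTable:
    """Hypothetical stand-in for one query-specific table."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, key: str, ts: datetime, price: float) -> None:
        rows = self._partitions[partition_key(key, ts)]
        rows.append((ts, price))
        rows.sort()  # keep rows ordered by timestamp within the partition

    def read_day(self, key: str, day: datetime) -> list:
        # A read touches exactly one partition; no join is needed.
        return list(self._partitions.get(partition_key(key, day), []))


by_instrument = TimeSeriesTable()
by_desk = TimeSeriesTable()


def record_trade(instrument: str, desk: str, ts: datetime, price: float) -> None:
    # The same fact is written twice, once per query pattern.
    by_instrument.write(instrument, ts, price)
    by_desk.write(desk, ts, price)


record_trade("BOND-A", "desk-1", datetime(2016, 12, 1, 9, 30), 101.5)
record_trade("BOND-A", "desk-1", datetime(2016, 12, 1, 9, 0), 101.2)
record_trade("BOND-A", "desk-2", datetime(2016, 12, 2, 9, 0), 101.9)
```

The trade-off is visible in `record_trade`: writes are duplicated so that each read is a single-partition lookup.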
Hadoop
Source: http://hortonworks.com/products/data-center/hdp/
MarkLogic
• Focused on heterogeneously structured data
• Bitemporal, if you dare
• Semantics / RDF Triples
• ACID, Consistent, stores original file
• ABAC & redaction in enterprise version
• Rules, Workflows, Alerts, Triggers
• Not a COTS!
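Why bitemporal data matters for re-reporting can be shown with a small sketch. It is plain Python, not MarkLogic's API: `BitemporalRecord` and its methods are hypothetical, the figures are made up, and the model is simplified to a start of validity plus a recording date (no end of validity). Each version carries the date the fact became true and the date the system learned about it, so a past report can be reproduced exactly as it was known at the time.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    value: float
    valid_from: date   # when the fact became true in the real world
    recorded_on: date  # when the system learned about it


class BitemporalRecord:
    """Hypothetical append-only history for a single reported value."""

    def __init__(self):
        self._versions = []

    def record(self, value: float, valid_from: date, recorded_on: date) -> None:
        # Corrections are appended; nothing is overwritten or deleted.
        self._versions.append(Version(value, valid_from, recorded_on))

    def as_of(self, valid_at: date, known_at: date) -> Optional[float]:
        # Keep versions already recorded at `known_at` and already valid at `valid_at`.
        candidates = [
            v for v in self._versions
            if v.recorded_on <= known_at and v.valid_from <= valid_at
        ]
        if not candidates:
            return None
        # Prefer the latest valid_from, then the latest recording.
        best = max(candidates, key=lambda v: (v.valid_from, v.recorded_on))
        return best.value


notional = BitemporalRecord()
notional.record(1_000_000.0, valid_from=date(2016, 11, 1), recorded_on=date(2016, 11, 1))
# A correction arrives on 15 November: the notional was 1.2m since 1 November.
notional.record(1_200_000.0, valid_from=date(2016, 11, 1), recorded_on=date(2016, 11, 15))

# What was reported on 10 November, and what a re-report in December should say.
reported = notional.as_of(date(2016, 11, 5), known_at=date(2016, 11, 10))
corrected = notional.as_of(date(2016, 11, 5), known_at=date(2016, 12, 1))
```

Both answers stay available: the value originally reported and the value a re-report must contain.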
Ok, so now what?
Two approaches to a solution
Infra approach:
• Build everything yourself, use open source components
E.g.:
• Hadoop
• Cassandra + Kafka
Platform approach:
• Focus on application and business logic, not on infra
E.g.:
• MarkLogic
• Spark (without Hadoop)
Infra approach (SMACK example)
• Used (and designed) by Netflix, LinkedIn, Uber, Twitter
• Massive amounts of event processing (IoT)
• HA and Geo distributed
• Scala, Python, R, Java(Script)
• Asynchronous everywhere
• Near impossible to destroy: reactive, self-healing, back-pressure.
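Back-pressure, as listed above, can be illustrated with a bounded buffer between a producer and a consumer. This is a simple threaded stand-in using the Python standard library, not Akka or Kafka code; `run_pipeline` and `BUFFER_SIZE` are hypothetical names and the capacity of 5 is an arbitrary assumption.

```python
import queue
import threading

BUFFER_SIZE = 5  # assumed capacity of the buffer between the two stages


def run_pipeline(n_events: int) -> tuple:
    buffer = queue.Queue(maxsize=BUFFER_SIZE)
    consumed = []
    max_depth = 0

    def producer():
        for i in range(n_events):
            # put() blocks while the buffer is full: the producer is slowed
            # down to the consumer's pace instead of flooding it.
            buffer.put(i)
        buffer.put(None)  # sentinel: no more events

    def consumer():
        nonlocal max_depth
        while True:
            max_depth = max(max_depth, buffer.qsize())
            event = buffer.get()
            if event is None:
                break
            consumed.append(event)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed, max_depth


consumed, max_depth = run_pipeline(100)
```

However fast the producer runs, the buffer never holds more than `BUFFER_SIZE` events and no event is dropped, which is the property that keeps a slow downstream stage from being overwhelmed.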
[Diagram: SMACK stack. Kafka, Akka Actors, Play REST APIs, Spark and Cassandra run on Mesos OS (with Marathon and Zookeeper) across several bare-metal nodes]
Platform approach
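[Diagram: Source Data (qualitative and quantitative) passes through Data Flows, Data Stores, Analytics and a Feedback Loop, using MarkLogic, a time series database of choice ("Insert Time Series Database here") and Spark, ending at a Happy Regulator]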
• Schema transformations
• Business Rules
• Workflow
• Rights management
Main take-aways
• There are no one-stop solutions
• Don’t pick bleeding edge stuff if you need it to work
• Focus on Business benefit of investment in Regulatory Compliance
• Separate the platform from the project!
• Start small, think big
Thank you for listening!
Alexander L. de Goeij
alexander@aldg.nl
References
• https://academy.datastax.com/resources/getting-started-time-series-data-modeling
• http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
• http://hortonworks.com/products/data-center/hdp/
• https://www.linkedin.com/pulse/data-hubs-marklogic-vs-hadoop-kurt-cagle
• https://engineering.linkedin.com/blog/2016/04/kafka-ecosystem-at-linkedin
• http://www.datanami.com/2015/10/05/how-uber-uses-spark-and-hadoop
• https://blog.twitter.com/2015/handling-five-billion-sessions-a-day-in-real-time
• http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflix.html

Aftab Hussain
 

Tech view on Regulatory Compliance

  • 1. Tech view on Regulatory Compliance MarkLogic User Group Benelux Meetup December 2016 Speaker: Alexander L. de Goeij
  • 2. About me • Architect / Consultant • Financial Services: Core Trading • Regulations: EMIR, MiFID II • Architecture: Enterprise / Solution / Project Architect • Consulting: IT Strategy, implementations, vendor selection, etc. • Business degree, Tech addiction.
  • 3. “Regulations really make my life more fun!” As said by no-one, ever.
  • 4. “Regulations really make my life more fun!” As said by no-one, ever. [Slide overlay: “exciting”; “everyone who gets to use cool databases!”]
  • 5. The challenge we think we are facing: [Diagram labels: Source Data; Extract, Transform, Load, Send; Some Application (extract, load); Happy Regulator]
  • 6. The actual challenge we are facing: [Diagram labels: Source Data; DB 1 (Load, Extract); Tool 2 (Load, Extract); Thing N (Load, Extract); Database you didn’t know still existed; Email, FTP, REST, SOAP; Happy Regulators]
  • 7. Current solution: Doesn’t work anymore: • Auditability / Process checks included in Regulations. • Obligation to re-report. • More complex Ad-Hoc requests from the Regulator. • Not suited for Real-Time reporting. • Waste of money…
  • 8. What do we need? • Auditability: keep original data in original format to prove results, keep track of ‘who-did-what’ with the data. • Consistency: real-time requirement from regulator demands more than eventual consistency. • Forward Flexibility: we know we don’t know what we will have to report tomorrow.
  • 9. Looking to technology for a better answer!
  • 10. Your favorite RDBMS • ACID, consistent, and blazing fast if you buy Exadata • Normalize your way out, and fail. • Not fit for processing/reporting across different data objects: e.g. Trades and Mortgages • Try to do NoSQL with SQL (innovative, but terribly slow and impossible to maintain) Example of what not to do: [SQL example, not preserved in the transcript]
  • 11. MongoDB • Free! Open Source! GridFS! • Have to transform data on ingest (to JSON) as most data is XML • Eventual consistency (AKA data loss) means not real-time. • Good at homogeneous data. • Still master-slave, and scaling issues • Brilliant for RAD / prototyping! Where things go wrong: Source: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
  • 12. Cassandra (DataStax) • Favors data duplication over normalization • Very fast (if you duplicate well) but does not do JOINs • Used by ING as main component of their Risk grid (YouTube) • Excellent for time series data Source: https://academy.datastax.com/resources/getting-started-time-series-data-modeling
  • 14. MarkLogic • Focused on heterogeneously structured data • Bitemporal, if you dare • Semantics / RDF Triples • ACID, Consistent, stores original file • ABAC & redaction in enterprise version • Rules, Workflows, Alerts, Triggers • Not a COTS!
  • 15. Ok, so now what?
  • 16. Two approaches to a solution Infra approach: • Build everything yourself, use open source components E.g.: • Hadoop • Cassandra + Kafka Platform approach: • Focus on application and business logic, not on infra E.g.: • MarkLogic • Spark (without Hadoop)
  • 17. Infra approach (SMACK example) • Used (and designed) by Netflix, LinkedIn, Uber, Twitter • Massive amounts of event processing (IoT) • HA and Geo distributed • Scala, Python, R, Java(Script) • Asynchronous everywhere • Near impossible to destroy: reactive, self-healing, back-pressure. [Diagram labels: Bare Metal; OS; Mesos; Marathon; Zookeeper; Spark; Cassandra; Kafka; Akka Actors; Play REST APIs]
  • 18.
  • 19. Platform approach • Schema transformations • Business Rules • Workflow • Rights management [Diagram labels: Source Data; Qualitative; Quantitative; Data Flows; Data Stores; Analytics; Feedback Loop; MarkLogic; Insert Time Series Database here; Spark; Happy Regulator]
  • 20. Main take-aways • There are no one-stop solutions • Don’t pick bleeding edge stuff if you need it to work • Focus on Business benefit of investment in Regulatory Compliance • Separate the platform from the project! • Start small, think big
  • 21. Thank you for listening! Alexander L. de Goeij alexander@aldg.nl
  • 22. References • https://academy.datastax.com/resources/getting-started-time-series-data-modeling • http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/ • http://hortonworks.com/products/data-center/hdp/ • https://www.linkedin.com/pulse/data-hubs-marklogic-vs-hadoop-kurt-cagle • https://engineering.linkedin.com/blog/2016/04/kafka-ecosystem-at-linkedin • http://www.datanami.com/2015/10/05/how-uber-uses-spark-and-hadoop • https://blog.twitter.com/2015/handling-five-billion-sessions-a-day-in-real-time • http://techblog.netflix.com/2013/12/announcing-suro-backbone-of-netflix.html
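
The auditability requirement from slide 8 (keep the original data in its original format, and keep track of who did what with it) can be illustrated with a small sketch. This is not MarkLogic code or any product's API: the names AuditedStore, AuditEvent, ingest and add_derived are hypothetical, and the in-memory dictionaries stand in for whatever database is actually used. The assumption is that the original payload is stored byte-for-byte under its SHA-256 hash, derived reports are stored separately, and every action is appended to a log.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One 'who-did-what' entry (hypothetical structure)."""
    actor: str      # who
    action: str     # did what
    doc_id: str     # to which original document
    at: datetime    # when (UTC)


@dataclass
class AuditedStore:
    """In-memory stand-in for a store that never rewrites original payloads."""
    _originals: dict = field(default_factory=dict)  # doc_id -> original bytes
    _derived: dict = field(default_factory=dict)    # doc_id -> [(name, value)]
    log: list = field(default_factory=list)         # append-only audit trail

    def _record(self, actor: str, action: str, doc_id: str) -> None:
        self.log.append(
            AuditEvent(actor, action, doc_id, datetime.now(timezone.utc))
        )

    def ingest(self, actor: str, payload: bytes) -> str:
        # The ID is the hash of the untouched payload, so later tampering
        # with the stored bytes is detectable.
        doc_id = hashlib.sha256(payload).hexdigest()
        self._originals.setdefault(doc_id, payload)
        self._record(actor, "ingest", doc_id)
        return doc_id

    def add_derived(self, actor: str, doc_id: str, name: str, value) -> None:
        # Derived data (e.g. a report) is stored next to the original,
        # never in place of it.
        if doc_id not in self._originals:
            raise KeyError(doc_id)
        self._derived.setdefault(doc_id, []).append((name, value))
        self._record(actor, f"derive:{name}", doc_id)

    def original(self, doc_id: str) -> bytes:
        return self._originals[doc_id]

    def verify(self, doc_id: str) -> bool:
        # Re-hash the stored bytes and compare with the ID.
        return hashlib.sha256(self._originals[doc_id]).hexdigest() == doc_id


# Usage with a made-up XML trade message.
store = AuditedStore()
trade_xml = b"<trade><id>T1</id><notional>1000000</notional></trade>"
doc_id = store.ingest("feed-loader", trade_xml)
store.add_derived("report-job", doc_id, "emir_report", {"trade_id": "T1"})
```

Because the original bytes stay available and verifiable, a report can be regenerated from them when there is an obligation to re-report, and the log shows which actor produced which derived result.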