Big Data LDN 2017: Matching and De-duping Big Data in the Cloud – in Minutes – Can It Be Done?
Amr Hassan
amr@match2lists.com
Accurately Match, Merge and De-dupe
Millions of records in minutes
Before we created Match2Lists
We ran a B2B consulting firm providing segmentation & data visualisation
We needed to match millions of records of our customers and 3rd party data
So we tried most fuzzy logic software
Too many false positives and 30%-40% missed matches
Fuzzy match (why?): Phoenix Ltd / Fenix
Fuzzy non-match (why not?): GSK PLC / GlaxoSmithKline / Beecham (met at conference)
Fuzzy logic was just… too fuzzy
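The effect can be reproduced with any generic character-level similarity score. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in; it is not the software the slides refer to, and the function name `char_similarity` is only illustrative.

```python
from difflib import SequenceMatcher

def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# The pair listed on the slide as a fuzzy match.
looks_alike = char_similarity("Phoenix Ltd", "Fenix")
# Two of the names listed on the slide as a fuzzy non-match.
same_group = char_similarity("GSK PLC", "GlaxoSmithKline")

print(f"Phoenix Ltd vs Fenix: {looks_alike:.2f}")
print(f"GSK PLC vs GlaxoSmithKline: {same_group:.2f}")
```

A score based only on shared characters rates the look-alike pair higher than the abbreviation and its full name, which is the kind of result that calls for synonym and context handling on top of plain fuzzy logic.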
Match2Lists has 3 clear objectives
Highest match results
The least amount of time
Visually simple to use
DATA INTO Information?
How do you blend
INTERNAL DATA
EXTERNAL DATA
Connect data
Despite very DIFFERENT:
Company names
Company types
Abbreviations
People Names
Addresses
Word Orders
We developed more advanced data matching algorithms & approaches
Corroborative matching
Iterative matching
Contextual fuzzy logic
Probabilistic logic
word order permutations
Noise word elimination
character transformations
Synonym analysis
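A minimal sketch of three of the ideas in this list: noise word elimination, character transformations and word order handling. The noise-word list and the function name `match_key` are assumptions made for illustration, and sorting the words is a shortcut that stands in for testing word order permutations; the slides do not describe the actual implementation.

```python
import re
import unicodedata

# Hypothetical noise-word list; a production list would be much longer.
NOISE_WORDS = {"ltd", "limited", "plc", "inc", "gmbh", "company", "co", "the"}

def match_key(name: str) -> str:
    """Reduce a company name to a comparison key."""
    # Character transformations: strip accents, lower-case, expand '&'.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower().replace("&", " and ")
    # Any remaining punctuation acts as a word separator.
    words = re.findall(r"[a-z0-9]+", text)
    # Noise word elimination.
    words = [w for w in words if w not in NOISE_WORDS]
    # Word order: sorting gives every permutation the same key.
    return " ".join(sorted(words))

print(match_key("ASDA Stores Limited"))
print(match_key("Stores, ASDA Ltd."))
```

Both calls print the same key, so the two spellings can be compared exactly before any fuzzy logic is applied.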
Need for SPEED
To run these algorithms on each field for multi-million-record datasets = billions of permutations
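A rough worked example of how quickly the count grows when every source record is compared with every target record on every field. The list sizes and field count are hypothetical and not taken from the slides.

```python
def comparison_count(source_rows: int, target_rows: int, fields: int) -> int:
    """Field-level comparisons when every source row meets every target row."""
    return source_rows * target_rows * fields

# Hypothetical sizes: 100,000 source rows, 2,000,000 target rows, 4 fields.
total = comparison_count(100_000, 2_000_000, 4)
print(f"{total:,} comparisons")  # 800,000,000,000 comparisons
```

Even with a modest source list the brute-force count reaches hundreds of billions, before word order permutations and synonyms are considered.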
Match2Lists is capable of matching your data: 200 million records in 30 seconds
SPEED
On
We recognised matching is not just a science
It is also an art
Match2Lists algorithms + user knowledge of their data
Both work together using our easy visual interface: the Match Visualiser
1 - Apply Match Settings in Seconds
2 - Assess Match Quality in Minutes
3 - Approve and Download Match Results
ASSESS
✔
30 SECONDS
match visualiser
Iterative matching with different criteria = highest match rates in minutes
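One way to picture iterative matching is a series of passes, each with its own criteria, where a later pass only looks at records the earlier passes left unmatched. The sketch below assumes the criteria are simply similarity thresholds and uses `difflib` as a placeholder scorer; the function names and sample source names are illustrative, not the product's.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def iterative_match(source, target, thresholds):
    """Run one pass per threshold over the still-unmatched source names."""
    matches = {}
    remaining = list(source)
    for threshold in thresholds:
        still_unmatched = []
        for name in remaining:
            best = max(target, key=lambda t: similarity(name, t))
            if similarity(name, best) >= threshold:
                matches[name] = (best, threshold)  # remember which pass matched
            else:
                still_unmatched.append(name)
        remaining = still_unmatched
    return matches, remaining

source = ["Unilever Plc", "Unilever PLC.", "General Electric Co", "Acme Widgets"]
target = ["Unilever Plc", "General Electric Company"]
matches, unmatched = iterative_match(source, target, thresholds=[1.0, 0.8])
print(matches)
print(unmatched)
```

The strict first pass takes the exact match, the looser second pass picks up the two near matches, and the unrelated name stays unmatched for review.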
DE-duplicate data easily
Unilever Beteiligungs Gmbh
Ge Medical Systems Private Limited
General Electric Company
Stichting Administratiekantoor
Unilever N.V.
Unilever Plc
De-Dupe
Unilever Plc
General Electric Company
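A toy sketch of de-duplication as grouping followed by picking one survivor per group, using some of the names from the slide. The grouping key (first word after a one-entry synonym table) and the survivor rule (shortest name) are placeholders chosen for illustration; the slides do not say how Match2Lists groups records or picks the surviving one.

```python
from collections import defaultdict

# Hypothetical synonym table: abbreviation -> expanded first word.
SYNONYMS = {"ge": "general"}

def family_key(name: str) -> str:
    """Group key: first word of the name after synonym expansion."""
    first = name.lower().split()[0]
    return SYNONYMS.get(first, first)

def dedupe(names):
    groups = defaultdict(list)
    for name in names:
        groups[family_key(name)].append(name)
    # Placeholder survivor rule: keep the shortest name in each group.
    return {key: min(members, key=len) for key, members in groups.items()}

names = [
    "Unilever Beteiligungs Gmbh",
    "Ge Medical Systems Private Limited",
    "General Electric Company",
    "Unilever N.V.",
    "Unilever Plc",
]
survivors = dedupe(names)
print(survivors)
```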
Blend data from different sources
Customer Data, Wallet Size Data, Dun & Bradstreet Data
Your CRM Data
STEP 3: D&B Data
Wallet Size Data
Merge, Match2DnB, Match
No technical skills required
Anyone can use it
Strategy
Analysts
Sales & Marketing
Finance & Operations
Disk-Memory Data Exchange
Despite the data compression, the data
exchange between disk and memory is both
efficient and rapid.
Scripting Functionality
Excellent scripting feature allows us to write our
own User Defined Functions that run at high
speed.
Less MEMORY = Less Cost
EXASOL required only 10% to 20% of the memory
configuration of our previous solution when we ran
both solutions in parallel during the transition phase
5 Minute Reboot Time !
The ability to reboot Match2Lists in 5 minutes to
perform system upgrades means practically no
disruption for our customers
Speed and Performance
Data matching is 3X faster than our prior
solution: 10 seconds to match 5 million records,
30 seconds to match 200 million records
Data Compression is Excellent
Data compression is impressive which translates to
lower hardware requirements. As customers and
their data continue to grow, this is a key benefit.
Match2Lists’ Experience with EXASOL
Outstanding Support and Great Teamwork
Fast, Smooth and Faultless Transition to EXASOL
Match, merge, de-dupe, Match2DnB your data in minutes
5 Simple Steps
1. Upload your Data
2. Select Project
3. Preprocessing
4. Review Matches
5. Download Results
CRM Data
02 Aug’16 | USA | SalesForce – CRM Account | active | 168,287
Subscriber Data
11 Jul’16 | USA | Addressable Market – Top 4000 Companies | active | 11,827
20 Mar’16 | USA | MarTech – San Francisco Registrants | active | 928
Reference Data
05 Jun’16 | *G* | Forbes – 2000 & Worldwide Subsidiaries | active | 434,230
20 Aug’16 | *G* | Segmetrix Top 2500 by Wallet Size | active | 2500
01 May’16 | *G* | Our Global Segment 500 Accounts | active | 500
Partner Data
23 Jun’16 | DEU | Channel Partner 1 – Sales Out | active | 18,231
15 Jun’16 | DEU | Channel Partner 2 – Sales Out | active | 34,109
Contact Lists
01 Sep’16 | UK | Rhetorik UK – 25K Sites | active | 23,800
18 Aug’16 | UK | D&B Top Companies – Tech & Finance | active | 890
Check auto-detected field types
Manually select field types from menu
Company ID
Address
Address
Address
Corroborative matching
Iterative matching
Fuzzy logic only when applicable
Probabilistic logic
All word order permutations
Noise word elimination
Special character transformations
Synonym analysis
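A sketch of what "corroborative matching" and "fuzzy logic only when applicable" could look like in code: an exact name match is accepted as is, while a fuzzy name match is accepted only when a second field agrees. The record layout, the 0.7 threshold and the choice of postcode as the corroborating field are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def clean_postcode(value: str) -> str:
    return value.replace(" ", "").upper()

def corroborated_match(source: dict, candidate: dict,
                       name_threshold: float = 0.7) -> bool:
    """Accept a fuzzy name match only if the postcode backs it up."""
    if source["name"].lower() == candidate["name"].lower():
        return True  # exact match: no fuzzy logic needed
    name_ok = similarity(source["name"], candidate["name"]) >= name_threshold
    postcode_ok = clean_postcode(source["postcode"]) == clean_postcode(candidate["postcode"])
    return name_ok and postcode_ok

source = {"name": "ASDA Stores Ltd", "postcode": "LS11 5AD"}
same_site = {"name": "ASDA Stores Limited", "postcode": "ls115ad"}
other_site = {"name": "ASDA Stores Limited", "postcode": "WF6 1TN"}

print(corroborated_match(source, same_site))   # True
print(corroborated_match(source, other_site))  # False
```

Requiring agreement from a second field is one way to keep the extra matches that fuzzy logic finds without also accepting its false positives.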
THE MATCH VISUALISER
Objective: Maximise Match Rate
1st match setting
✔ Select fields to use
★ Set similarity strengths
UNDER 30 SECONDS
Click any or each score band to assess results
APPROVE ENTIRE score ranges IF results look good, down to this level
56%
RUN 2nd match setting
Approve results
You’ve approved 93%
DOWNLOAD results
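The score-band workflow can be sketched as bucketing each match by score and approving whole bands down to a chosen level. The band width, the 0-100 score scale and the sample scores are hypothetical; the slides only show that results are grouped into score bands and approved by range.

```python
from collections import Counter

def score_band(score: float, width: int = 10) -> int:
    """Lower edge of the band a 0-100 score falls into (100 joins the top band)."""
    return min(int(score) // width * width, 100 - width)

def approve_down_to(scored_pairs, lowest_band: int):
    """Approve every pair whose band is at or above lowest_band."""
    return [pair for pair, score in scored_pairs if score_band(score) >= lowest_band]

# Hypothetical (pair id, score) results from one match setting.
scored = [("A1", 98.0), ("A2", 91.5), ("A3", 84.0), ("A4", 72.0), ("A5", 55.0)]

bands = Counter(score_band(score) for _, score in scored)
approved = approve_down_to(scored, lowest_band=80)
rate = len(approved) / len(scored)

print(dict(bands))
print(approved, f"{rate:.0%}")
```

Pairs left below the approved level would go into the next match setting, which is how the approved share can rise from one run to the next.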
That’s it, all DONE!
Select which fields you want to download from each list
Company Name | Address Fields (1, 2 and 3) | City | Post / Zip Code | Country
Kantar Media | 26-30 Uxbridge Road | London | W5 2AU | UK
Coley Porter Bell | 121-141 Westbourne Terrace | London | W2 6JR | UK
Ogilvy Group (UK) | 10 Cabot Square, Canary Wharf | London | E14 4QB | UK
J Walter Thompson | 1 Knightsbridge Green | London | SW1X 7NW | UK
GE Healthcare | Maynard Centre, Forest Farm | Wales | CF14 7YT | UK
Whatman plc | Springfield Mill J Whatman Way | | ME14 2LE | UK
Amphenol Limited | Crown Industrial Estate, Priorswood Road | S West | TA2 8QY | UK
ASDA Stores Limited | Asda House, Southbank, Great Wilson St | N East | LS11 5AD | UK
International Procurement & Logistics | Unit 1, Foxbridge Way | N East | WF6 1TN | UK
Stationery Office (UK ltd) | St Crispins, Duke Street | East | NR3 1PD | UK
DHL Supply Chain | Witwood Common Lane, Witwood | N East | WF10 5QL | UK
Company Name | Address Fields (1, 2 and 3) | City | Post / Zip Code | Country | Global Ultimate Company | HQ Country | WW Emp | SIC Code
Kantar Media | 26-30 Uxbridge Road | London | W5 2AU | UK | WPP PLC | UK | 120376 | 2839
Coley Porter Bell | 121-141 Westbourne Terrace | London | W2 6JR | UK | WPP PLC | UK | 120376 | 2839
Ogilvy Group (UK) | 10 Cabot Square, Canary Wharf | London | E14 4QB | UK | WPP PLC | UK | 120376 | 2839
J Walter Thompson | 1 Knightsbridge Green | London | SW1X 7NW | UK | WPP PLC | UK | 120376 | 2839
GE Healthcare | Maynard Centre, Forest Farm | Wales | CF14 7YT | UK | General Electric Company | USA | 5929 | 5578
Whatman plc | Springfield Mill J Whatman Way | | ME14 2LE | UK | General Electric Company | USA | 5929 | 5578
Amphenol Limited | Crown Industrial Estate, Priorswood Road | S West | TA2 8QY | UK | General Electric Company | USA | 5929 | 5578
ASDA Stores Limited | Asda House, Southbank, Great Wilson St | N East | LS11 5AD | UK | Wal-Mart Stores, Inc. | USA | 180339 | 8079
International Procurement & Logistics | Unit 1, Foxbridge Way | N East | WF6 1TN | UK | Wal-Mart Stores, Inc. | USA | 180339 | 8079
Stationery Office (UK ltd) | St Crispins, Duke Street | East | NR3 1PD | UK | Deutsche Post AG | Germany | 6313 | 4669
DHL Supply Chain | Witwood Common Lane, Witwood | N East | WF10 5QL | UK | Deutsche Post AG | Germany | 6313 | 4669
Design your output file
SOURCE LIST: select what fields you want from your source list
MATCH LIST: select the fields of the matched records
Global Ultimate ID
Global Ultimate Parent Name
WW Emp
Site Name
Site Address 1
Site Address 2
Site Address
Site State / County
Site Post Code
SIC Code
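Designing the output file amounts to choosing columns from each side of a matched pair and writing them out together. The sketch below assumes each matched pair is held as two dictionaries; the function name `build_output` and this record layout are illustrative, while the field names and values are taken from the slides.

```python
import csv
import io

def build_output(matched_rows, source_fields, match_fields):
    """Write the chosen source-list and match-list fields as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(source_fields + match_fields)
    for source, match in matched_rows:
        writer.writerow([source.get(f, "") for f in source_fields] +
                        [match.get(f, "") for f in match_fields])
    return buffer.getvalue()

rows = [(
    {"Company Name": "Kantar Media", "City": "London", "Post / Zip Code": "W5 2AU"},
    {"Global Ultimate Parent Name": "WPP PLC", "WW Emp": "120376", "SIC Code": "2839"},
)]
output = build_output(rows,
                      source_fields=["Company Name", "Post / Zip Code"],
                      match_fields=["Global Ultimate Parent Name", "WW Emp"])
print(output)
```

Fields that were not selected, such as City and SIC Code here, are simply left out of the download.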
Data visualization | Data warehouse | CRM
Amr Hassan
amr@match2lists.com
Thank you
Accurately match, merge and de-dupe
Millions of records in minutes
Match your DATA to create a custom solution for high volume matching
[Diagram: Your servers (workflows, solutions) | Match Requests | Match2Lists servers | Processed Matches | matched CSV]
[Diagram: Automated workflow | Your customers | Your servers | Match Requests | Match2Lists servers | Processed Matches | matched CSV]
[Diagram: + Self-serve access | Your customers | Your servers | Match Requests | Match2Lists servers | Processed Matches | matched CSV]

Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 

Recently uploaded (20)

The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 

Big Data LDN 2017: Matching and De-duping Big Data in the Cloud – in Minutes – Can It Be Done?

  • 2. Accurately Match, Merge and De-dupe Millions of records in minutes
  • 3. Before we created Match2Lists: We ran a B2B consulting firm providing segmentation & data visualisation. We needed to match millions of records of our customers and 3rd party data.
  • 4. So we tried most fuzzy logic software: too many false positives and 30%-40% missed matches. Fuzzy match: Phoenix Ltd, Fenix. Why? Fuzzy non-match: GSK PLC, GlaxoSmithKline, Beecham (met at conference). Why not?
  • 6. Has 3 clear objectives: highest match results; the least amount of time; visually simple to use
  • 9. We developed more advanced data matching algorithms & approaches: corroborative matching; iterative matching; contextual fuzzy logic; probabilistic logic; word order permutations; noise word elimination; character transformations; synonym analysis
  • 10. We developed more advanced data matching algorithms & approaches
  • 11. Need for SPEED: to run these algorithms on each field for multi-million-record datasets = billions of permutations
  • 13. We recognised matching is not just a science. It is also an art.
  • 15. Both work together using our easy visual interface
  • 17. Match visualiser: 1 - Apply match settings in seconds; 2 - Assess match quality in minutes; 3 - Approve and download match results. Assess ✔ 30 seconds
  • 18. Iterative Matching with different criteria = Highest Match Rates in Minutes
  • 19. De-duplicate data easily: Unilever Beteiligungs Gmbh; Ge Medical Systems Private Limited; General Electric Company; Stichting Administratiekantoor Unilever N.V.; Unilever Plc
  • 20. De-duplicate data easily: De-Dupe (same list as slide 19)
  • 21. De-duplicate data easily: De-Dupe: Unilever Plc; General Electric Company
  • 22. Blend data from different sources: Customer Data (your CRM data); Dun & Bradstreet Data (D&B data); Wallet Size Data. Match, Match2DnB, Merge. STEP 3
  • 23. No technical skills required. Anyone can use it: Strategy; Analysts; Sales & Marketing; Finance & Operations
  • 24. Match2Lists’ Experience with EXASOL. Outstanding Support and Great Teamwork: fast, smooth and faultless transition to EXASOL. Speed and Performance: data matching is 3X faster than our prior solution (10 seconds to match 5 million records; 30 seconds to match 200 million records). Data Compression is Excellent: data compression is impressive, which translates to lower hardware requirements; as customers and their data continue to grow, this is a key benefit. Disk-Memory Data Exchange: despite the data compression, the data exchange between disk and memory is both efficient and rapid. Scripting Functionality: an excellent scripting feature allows us to write our own User Defined Functions that run at high speed. Less Memory = Less Cost: EXASOL required only 10% to 20% of the memory configuration of our previous solution when we ran both solutions in parallel during the transition phase. 5 Minute Reboot Time: the ability to reboot Match2Lists in 5 minutes to perform system upgrades means practically no disruption for our customers.
  • 25. Match, merge, de-dupe, Match2DnB your data in minutes
  • 26. 5 Simple Steps: 1. Upload your Data; 2. Select Project; 3. Preprocessing; 4. Review Matches; 5. Download Results
  • 27. 1. Upload your Data; 2. Select Project; 3. Preprocessing; 4. Review Matches; 5. Download Results
  • 28. Project list, by data type:
      CRM Data: 02 Aug’16, USA, SalesForce – CRM Account, active, 168,287
      Subscriber Data: 11 Jul’16, USA, Addressable Market – Top 4000 Companies, active, 11,827
      Subscriber Data: 20 Mar’16, USA, MarTech – San Francisco Registrants, active, 928
      Reference Data: 05 Jun’16, *G*, Forbes – 2000 & Worldwide Subsidiaries, active, 434,230
      Reference Data: 20 Aug’16, *G*, Segmetrix Top 2500 by Wallet Size, active, 2500
      Reference Data: 01 May’16, *G*, Our Global Segment 500 Accounts, active, 500
      Partner Data: 23 Jun’16, DEU, Channel Partner 1 – Sales Out, active, 18,231
      Partner Data: 15 Jun’16, DEU, Channel Partner 2 – Sales Out, active, 34,109
      Contact Lists: 01 Sep’16, UK, Rhetorik UK – 25K Sites, active, 23,800
      Contact Lists: 18 Aug’16, UK, D&B Top Companies – Tech & Finance, active, 890
  • 29. Check auto-detected field types (Company, ID, Address, Address, Address). Manually select field types from the field types menu.
  • 31. Corroborative matching; iterative matching; fuzzy logic only when applicable; probabilistic logic; all word order permutations; noise word elimination; special character transformations; synonym analysis
  • 33. The Match Visualiser
  • 34. The Match Visualiser. Objective: maximise match rate. 1st match setting: ✔ select fields to use; ★ set similarity strengths; under 30 seconds. Click any or each score band to assess results. Approve entire score ranges if results look good, down to this level: 56%. Run 2nd match setting. Approve results: you’ve approved 93%. Download results.
  • 35. Select which fields you want to download from each list. That’s it, all done!
  • 36. Design your output file: select what fields you want from your source list, and select the fields of the matched records.
      SOURCE LIST columns: Company Name; Address Fields (1, 2 and 3); City; Post / Zip Code; Country
      MATCH LIST columns shown: Global Ultimate Company; HQ Country; WW Emp; SIC Code
      MATCH LIST field menu: Global Ultimate ID; Global Ultimate Parent Name; WW Emp; Site Name; Site Address 1; Site Address 2; Site Address; Site State / County; Site Post Code; SIC Code
      Matched to WPP PLC (UK; 120376; 2839): Kantar Media, 26-30 Uxbridge Road, London, W5 2AU, UK; Coley Porter Bell, 121-141 Westbourne Terrace, London, W2 6JR, UK; Ogilvy Group (UK), 10 Cabot Square, Canary Wharf, London, E14 4QB, UK; J Walter Thompson, 1 Knightsbridge Green, London, SW1X 7NW, UK
      Matched to General Electric Company (USA; 5929; 5578): GE Healthcare, Maynard Centre, Forest Farm, CF14 7YT, UK; Whatman plc, Springfield Mill, J Whatman Way, ME14 2LE, UK; Amphenol Limited, Crown Industrial Estate, Priorswood Road, TA2 8QY, UK (City values shown: Wales, S West)
      Matched to Wal-Mart Stores, Inc. (USA; 180339; 8079): ASDA Stores Limited, Asda House, Southbank, Great Wilson St, N East, LS11 5AD, UK; International Procurement & Logistics, Unit 1, Foxbridge Way, N East, WF6 1TN, UK
      Matched to Deutsche Post AG (Germany; 6313; 4669): Stationery Office (UK ltd), St Crispins, Duke Street, East, NR3 1PD, UK; DHL Supply Chain, Witwood Common Lane, Witwood, N East, WF10 5QL, UK
  • 37. The source and matched records from slide 36, with match columns Global Ultimate Company, HQ Country, WW Emp and Industry. CRM; Data warehouse; Data visualization
  • 39. Accurately match, merge and de-dupe Millions of records in minutes
  • 40. To create a custom solution for high-volume matching: Match2Lists servers match your data. Your servers (workflows, solutions): match requests. Match2Lists servers: processed matches (matched CSV).
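
Several of the techniques named on slides 9 and 31 (special character transformations, synonym analysis, noise word elimination, word order permutations) can be illustrated with a minimal Python sketch. This is not Match2Lists' implementation: the function names `match_key` and `group_duplicates`, and the `NOISE_WORDS` and `SYNONYMS` tables, are hypothetical and chosen only for illustration. The sketch compares normalised keys exactly; it does not attempt the fuzzy, probabilistic or corroborative scoring the slides mention.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical lookup tables, for illustration only.
NOISE_WORDS = {"ltd", "limited", "plc", "inc", "gmbh", "company", "co", "the"}
SYNONYMS = {"ge": "general electric", "intl": "international"}


def match_key(name: str) -> str:
    """Reduce a company name to an order-independent comparison key."""
    # Special character transformation: strip accents, lowercase,
    # and turn every run of non-alphanumeric characters into a space.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch)).lower()
    text = re.sub(r"[^a-z0-9]+", " ", text)

    tokens = []
    for token in text.split():
        # Synonym analysis: expand known abbreviations.
        tokens.extend(SYNONYMS.get(token, token).split())

    # Noise word elimination: drop legal suffixes and filler words.
    tokens = [t for t in tokens if t not in NOISE_WORDS]

    # Sorting the tokens maps every word-order permutation to one key.
    return " ".join(sorted(tokens))


def group_duplicates(names):
    """Return groups of names that share the same normalised key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

With these assumed tables, `group_duplicates(["Unilever Plc", "Unilever PLC.", "General Electric Company", "GE Company"])` groups the two Unilever spellings together and the two General Electric spellings together. Names that differ by more than noise words, synonyms or word order (for example Phoenix Ltd and Fenix from slide 4) would still need a similarity score on top of this key.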