9/19/2019 Heiko Paulheim 1
Big Data, Smart Algorithms, and Market Power
A Computer Scientist’s Perspective
Heiko Paulheim
Chair for Data Science
University of Mannheim
Introductory Example: GPS vs. Smart Phones
• Tests show: smart phones do the job better
– with smart phones on the rise, GPS sales decline
[Chart: GPS sales vs. smart phone sales; vertical axis from 0 to 30,000]
Source: Statista
Data for Germany;
US looks similar
Computer Science Interlude: Navigation
• Problem: find the shortest path through a network
• Solution: known since the 1950s
– can be written down in less than 20 lines
[Figure: road network from Start to End with edge lengths of 1 to 3 km]
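A minimal sketch of such a shortest-path search, assuming the textbook solution meant here is Dijkstra's algorithm (the slide does not name it). The road network below is hypothetical: the node names and layout are made up for illustration and are not taken from the figure.

```python
import heapq

def shortest_path(graph, start, end):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == end:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical road network (edge lengths in km); illustrative only.
roads = {
    "Start": {"A": 2, "B": 1},
    "A": {"Start": 2, "C": 2, "End": 3},
    "B": {"Start": 1, "C": 1},
    "C": {"A": 2, "B": 1, "End": 1},
    "End": {"A": 3, "C": 1},
}

distance, route = shortest_path(roads, "Start", "End")
print(distance, route)
```

The function body stays under 20 lines. Swapping the kilometre weights for travel times turns the same search into a fastest-route search.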
Computer Science Interlude: Navigation
• Usually, we do not want the shortest way
– but the fastest
• We need to estimate times
[Figure: network from Start to End with edge travel times of 0:05 to 0:15]
Estimating Times for Edges
• Static: path length and speed limit
• Dynamic: live car movements
• Google Maps: owned by Google
– So is Android (market share US: 48%, Germany: 73%, China: 79%)
– i.e., roughly one Android phone in every other car
Source: https://gs.statcounter.com/os-market-share/mobile/
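The two estimation strategies can be sketched as follows. This is an illustration under stated assumptions, not a description of how Google Maps works: the function names are hypothetical, and averaging the reported speeds is just one simple way to use live car movements.

```python
from statistics import mean

def static_minutes(length_km, speed_limit_kmh):
    """Static estimate: travel time if every car drives at the speed limit."""
    return 60.0 * length_km / speed_limit_kmh

def dynamic_minutes(length_km, speed_limit_kmh, observed_speeds_kmh):
    """Dynamic estimate: use the mean of live speed reports, if there are any."""
    # Zero readings are ignored to avoid dividing by zero (a simplification).
    moving = [s for s in observed_speeds_kmh if s > 0]
    if not moving:
        # No usable live data: fall back to the static estimate.
        return static_minutes(length_km, speed_limit_kmh)
    return 60.0 * length_km / mean(moving)

# A 2 km edge with a 60 km/h speed limit (made-up numbers).
free_flow = static_minutes(2.0, 60.0)
congested = dynamic_minutes(2.0, 60.0, [10.0, 12.0, 8.0])  # three phones report slow traffic
no_reports = dynamic_minutes(2.0, 60.0, [])
print(free_flow, congested, no_reports)
```

The more phones report speeds on an edge, the more often the dynamic estimate can replace the static fallback, which is where the market share figures above matter.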
Visual Depiction
• One Android phone in every other car
Image: Bing Maps
Improving Navigation
• Ingredients:
– A simple standard textbook algorithm from the 1950s
– A lot of data
• Better navigation
– Usually: not by smarter algorithms
– But by better (=bigger) data!
[Figure: network from Start to End with edge travel times of 0:05 to 0:25]
Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/
A.I. Winters and A Paradigm Shift
• AI has seen a massive uptake since the 2010s
– But using very different paradigms
[Chart annotations: 1st AI Winter, 2nd AI Winter]
Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence
An Example for AI: Go
• 1990s
– Using handcrafted rules
• i.e., smart algorithms
– Often defeated by children
• 2010s
– Using data from millions of games
• i.e., big data
– AlphaGo: beat some of the world’s best players in 2016
AI in the Big Data Age (1)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotations: smarter algorithm; more data]
AI in the Big Data Age (2)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotation: more data: trivial baseline beats smart algorithms]
Big Data: Long vs. Wide Data
• Long data = more records of the same kind
– e.g., GPS data from more users
• Wide data = more information about the same records
– e.g., additional information about users
Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
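The distinction can be illustrated with a small sketch. All records, field names, and values below are hypothetical and only serve to show the shape of the two operations: longer data adds rows, wider data adds columns.

```python
# Hypothetical GPS records; field names and values are made up for illustration.
gps_records = [
    {"user": "u1", "avg_speed_kmh": 42},
    {"user": "u2", "avg_speed_kmh": 37},
]

# Longer data: more records of the same kind (same columns, more rows).
more_gps_records = [{"user": "u3", "avg_speed_kmh": 55}]
long_data = gps_records + more_gps_records

# Wider data: more information about the same records (same rows, more columns).
user_info = {
    "u1": {"home_city": "Mannheim"},
    "u2": {"home_city": "Hamburg"},
}
wide_data = [{**rec, **user_info.get(rec["user"], {})} for rec in gps_records]

print(len(long_data), sorted(long_data[0]))
print(len(wide_data), sorted(wide_data[0]))
```

Making data longer only needs more of the same source; making it wider needs a second source that can be linked through a shared key, here the hypothetical `user` field.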
It’s All about Patterns in Data
• Examples
– Traffic movements
– Online user behavior
– Cliques in social networks
– …
• Methods:
– Data Mining
– Machine Learning
– …
→ Intensively researched since the 1980s
Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
Patterns in Long Data
Patterns in Wide Data
Big Data: Long vs. Wide Data
• Example: YouTube (owned by Google)
– Display videos to the user that are as interesting as possible
• Long data: users’ interaction histories
• Wide data: users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
Big Data: Long vs. Wide Data
• Example: Facebook
– Display as much content of interest as possible
• Long data: user profile and interactions
• Wide data: user profile and interactions + WhatsApp chats
In Germany, OVG Hamburg prohibits this combination!
Image: https://www.instagram.com/p/Bt3OG4DFOsK/
Big Data: Long vs. Wide Data
• Example: WeChat
• Started as chat application
– showing advertisements based on chats
– later added: apps-in-app (shopping, payment, …)
– CS perspective: more an OS than an app
• Long data
– Many people’s chats
• Wide data
– Chats
– Shopping history (also includes: products viewed)
– Payment history
Image: Wikipedia
Takeaways
• Modern AI Systems
– Rely on massive amounts of data
– Processed with fairly simple algorithms
• Algorithms are often well known
– e.g., textbooks, research papers
– It is hard to own an algorithm
• Data is crucial
– Longer data (e.g., acquiring more customers)
– Wider data (e.g., merging businesses)
– It is easy to own data
Malana- Gimlet Market Analysis (Portfolio 2)
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 

Big Data, Smart Algorithms, and Market Power - A Computer Scientist’s Perspective

  • 1. 9/19/2019 Heiko Paulheim 1 Big Data, Smart Algorithms, and Market Power A Computer Scientist’s Perspective Heiko Paulheim Chair for Data Science University of Mannheim Heiko Paulheim
  • 2. Introductory Example: GPS vs. Smart Phones • Tests show: smart phones do the job better – with smart phones on the rise, GPS sales decline [Chart: GPS sales vs. smart phone sales, axis from 0 to 30,000. Source: Statista. Data for Germany; US looks similar]
  • 3. Computer Science Interlude: Navigation • Problem: find the shortest path through a network • Solution: known since the 1950s – can be written down in fewer than 20 lines [Figure: example network from Start to End, edges labeled with distances of 1 km to 3 km]
  • 4. Computer Science Interlude: Navigation • Usually, we do not want the shortest way – but the fastest • We need to estimate times [Figure: the same network from Start to End, edges labeled with travel times of 0:05 to 0:15]
  • 5. Estimating Times for Edges • Static: path length and speed limit • Dynamic: live car movements • Google Maps: owned by Google – So is Android (market share US: 48%, Germany: 73%, China: 79%) – i.e., about one Android phone in every other car Source: https://gs.statcounter.com/os-market-share/mobile/
  • 6. Visual Depiction • One Android phone in every other car Image: Bing Maps
  • 7. Improving Navigation • Ingredients: – A simple standard textbook algorithm from the 1950s – A lot of data • Better navigation – Usually: not by smarter algorithms – But by better (=bigger) data! [Figure: network from Start to End, edges labeled with travel times of 0:05 to 0:25. Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/]
  • 8. A.I. Winters and A Paradigm Shift • AI has seen a massive uptake since the 2010s – But using very different paradigms [Chart annotations: 1st AI Winter, 2nd AI Winter. Source: Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence]
  • 9. An Example for AI: Go • 1990s – Using handcrafted rules • i.e., smart algorithms – Often defeated by children • 2010s – Using data from millions of games • i.e., big data – AlphaGo: beat some of the world’s best players in 2016
  • 10. AI in the Big Data Age (1) • Algorithms are fairly simple and well known • Data matters [Figure from Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation; annotations: smarter algorithm, more data]
  • 11. AI in the Big Data Age (2) • Algorithms are fairly simple and well known • Data matters [Figure from Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation; annotation: with more data, a trivial baseline beats smart algorithms]
  • 12. Big Data: Long vs. Wide Data • Long data = more records of the same kind – e.g., GPS data from more users • Wide data = more information about the same records – e.g., additional information about users Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
  • 13. It’s All about Patterns in Data • Examples – Traffic movements – Online user behavior – Cliques in social networks – … • Methods: – Data Mining – Machine Learning – … → Intensively researched since the 1980s Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
  • 14. Patterns in Long Data
  • 15. Patterns in Long Data
  • 16. Patterns in Wide Data
  • 17. Big Data: Long vs. Wide Data • Example: YouTube (owned by Google) – Display videos to the user that are as interesting as possible • Long data: users’ interaction histories • Wide data: users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
  • 18. Big Data: Long vs. Wide Data • Example: Facebook – Display as much content of interest as possible • Long data: user profile and interactions • Wide data: user profile and interactions + WhatsApp chats • In Germany, OVG Hamburg prohibits this combination! Image: https://www.instagram.com/p/Bt3OG4DFOsK/
  • 19. Big Data: Long vs. Wide Data • Example: WeChat • Started as a chat application – showing advertisements based on chats – later added: apps-in-app (shopping, payment, …) – CS perspective: an OS rather than an app • Long data – Many people’s chats • Wide data – Chats – Shopping history (also includes: products viewed) – Payment history Image: Wikipedia
  • 20. Takeaways • Modern AI Systems – Rely on massive amounts of data – Processed with fairly simple algorithms • Algorithms are often well known – e.g., textbooks, research papers – It is hard to own an algorithm • Data is crucial – Longer data (e.g., acquiring more customers) – Wider data (e.g., merging businesses) – It is easy to own data
  • 21. Big Data, Smart Algorithms, and Market Power: A Computer Scientist’s Perspective. Heiko Paulheim, Chair for Data Science, University of Mannheim
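
The navigation example on slides 3 and 4 can be sketched in a few lines of code. What follows is a minimal sketch of Dijkstra's algorithm, a standard shortest-path method from the 1950s. The slides do not name the algorithm, so treating it as the one meant is an assumption. The four-node network, the node names A and B, and all edge weights are hypothetical, because the network drawn on the slides cannot be recovered from the transcript.

```python
import heapq


def shortest_path(graph, start, end):
    """Dijkstra's algorithm.

    graph maps each node to a dict {neighbor: non-negative edge cost}.
    Returns (total cost, list of nodes), or (None, []) if end is unreachable.
    """
    dist = {start: 0}          # best known cost to each node
    prev = {}                  # predecessor on the best known path
    queue = [(0, start)]       # priority queue ordered by cost
    while queue:
        d, node = heapq.heappop(queue)
        if node == end:
            break
        if d > dist.get(node, float("inf")):
            continue           # stale entry, a cheaper route was found already
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(queue, (nd, neighbor))
    if end not in dist:
        return None, []
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]


# Hypothetical network: the same roads, weighted once by distance (km)
# and once by estimated travel time (minutes).
ROADS_KM = {"Start": {"A": 2, "B": 1}, "A": {"End": 1}, "B": {"End": 3}}
ROADS_MIN = {"Start": {"A": 15, "B": 5}, "A": {"End": 10}, "B": {"End": 10}}

print(shortest_path(ROADS_KM, "Start", "End"))   # shortest route
print(shortest_path(ROADS_MIN, "Start", "End"))  # fastest route
```

With these made-up weights, the function returns the route via A when fed distances and the route via B when fed travel times. The algorithm is the same in both calls; only the edge data differs, which is the point slide 7 makes about better navigation coming from better data.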