Big Data, Smart Algorithms, and Market Power - A Computer Scientist's Perspective
1. 3/28/19 Heiko Paulheim 1
Big Data, Smart Algorithms, and Market Power
A Computer Scientist’s Perspective
Heiko Paulheim
Chair for Data Science
University of Mannheim
Introductory Example: GPS vs. Smart Phones
• Tests show: smartphones do the navigation job better than dedicated GPS devices
– With smartphones on the rise, GPS device sales decline
[Chart: GPS sales vs. smartphone sales; vertical axis from 0 to 30,000]
Source: Statista
Computer Science Interlude: Navigation
• Problem: find the shortest path through a network
• Solution: known since the 1950s
– can be written down in less than 20 lines
[Figure: road network from Start to End; edges labeled with lengths between 1 km and 3 km]
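As a rough illustration of how compact such a solution can be, here is a minimal sketch using Dijkstra's algorithm, one well-known shortest-path method from that era. The function name `shortest_path` and the small road network are hypothetical; the network is not the graph shown on the slide.

```python
import heapq

def shortest_path(graph, start, end):
    """Dijkstra's algorithm: return (total_cost, path), or (inf, []) if unreachable."""
    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)  # cheapest candidate first
        if node == end:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical directed road network with edge lengths in km (an assumption).
roads = {
    "Start": {"A": 2, "B": 1},
    "A": {"End": 3},
    "B": {"A": 1, "C": 2},
    "C": {"End": 1},
}

distance, route = shortest_path(roads, "Start", "End")
print(distance, route)
```

The function body stays well under 20 lines, which is consistent with the claim on the slide.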
Computer Science Interlude: Navigation
• Usually, we do not want the shortest route
– but the fastest one
• So we need to estimate travel times for each edge
[Figure: road network from Start to End; edges labeled with travel times between 0:05 and 0:15]
Estimating Times for Edges
• Static: path length and speed limit
• Dynamic: live car movements
• Google Maps: owned by Google
– So is Android
– 57M smart phones in Germany, market share of Android: 80%
● i.e., one Android phone in every other car
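The two kinds of estimate can be sketched as follows. This is a simplified illustration, not how any real navigation service computes its estimates: the function names, the plain averaging of observations, and all numbers are assumptions.

```python
from statistics import mean

def static_minutes(length_km, speed_limit_kmh):
    """Static estimate: time needed when driving exactly at the speed limit."""
    return length_km * 60 / speed_limit_kmh

def dynamic_minutes(observed_minutes, fallback):
    """Dynamic estimate: average of recently observed traversal times.

    Falls back to the given value when no live observations are available.
    """
    return mean(observed_minutes) if observed_minutes else fallback

# Hypothetical edge: 5 km long with a 60 km/h speed limit.
static = static_minutes(5, 60)

# Hypothetical live observations (minutes) from phones in cars on that edge.
live = dynamic_minutes([9, 11, 10], fallback=static)

# Without any phones on the edge, only the static estimate remains.
no_data = dynamic_minutes([], fallback=static)

print(static, live, no_data)
```

The more cars report their movements, the fewer edges have to fall back to the static estimate.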
Visual Depiction
• One Android phone in every other car
Image: Bing Maps
Improving Navigation
• Ingredients:
– A simple standard textbook algorithm from the 1950s
– A lot of data
• Better navigation
– Usually: not by smarter algorithms
– But by better (=bigger) data!
[Figure: road network from Start to End; edges labeled with travel times between 0:05 and 0:25]
Image: https://neo4j.com/blog/top-13-resources-graph-theory-algorithms/
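To make the point concrete, the sketch below runs one and the same textbook algorithm (Dijkstra's algorithm, chosen here as an assumption) on two sets of edge times. The networks, the node names, and the congestion figure are hypothetical. Only the data differs between the two calls, yet the recommended route changes.

```python
import heapq

def fastest_route(times, start, end):
    """Dijkstra's algorithm over edge travel times given in minutes."""
    queue, seen = [(0, start, [start])], set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == end:
            return total, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, minutes in times.get(node, {}).items():
            heapq.heappush(queue, (total + minutes, nxt, path + [nxt]))
    return float("inf"), []

# Hypothetical static estimates (from path length and speed limit).
static_times = {"Start": {"A": 5, "B": 10}, "A": {"End": 10}, "B": {"End": 10}}

# Hypothetical live estimates: congestion observed on the edge A -> End.
live_times = {"Start": {"A": 5, "B": 10}, "A": {"End": 25}, "B": {"End": 10}}

static_result = fastest_route(static_times, "Start", "End")
live_result = fastest_route(live_times, "Start", "End")
print(static_result, live_result)
```

With the static estimates the route via A looks fastest; with the live estimates the algorithm avoids the congested edge and goes via B.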
A.I. Winters and A Paradigm Shift
• AI has seen a massive uptake since the 2010s
– But using very different paradigms
[Chart annotations: 1st AI Winter; 2nd AI Winter]
Fast & Horvitz (2016): Long-Term Trends in the Public Perception of Artificial Intelligence
An Example for AI: Go
• 1990s
– Using handcrafted rules
● i.e., smart algorithms
– Often defeated by children
• 2010s
– Using data from millions of games
● i.e., big data
– AlphaGo: beat some of the world’s best players in 2016
AI in the Big Data Age (1)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotations: smarter algorithm; more data]
AI in the Big Data Age (2)
• Algorithms are fairly simple and well known
• Data matters
Banko & Brill (2001): Scaling to Very Very Large Corpora for Natural Language Disambiguation
[Chart annotation: more data: trivial baseline beats smart algorithms]
Big Data: Long vs. Wide Data
• Long data = more records of the same kind
– e.g., GPS data from more users
• Wide data = more information about the same records
– e.g., additional information about users
Lehmberg & Hassanzadeh (2018): Ontology Augmentation Through Matching with Web Tables
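The distinction can be sketched with plain Python records. All names and values below are hypothetical and serve only as an illustration: making data longer appends records with the same attributes, while making it wider joins additional attributes onto records that already exist.

```python
# Hypothetical GPS-style records, one per user (illustrative values only).
users = [
    {"user": "u1", "avg_speed_kmh": 48},
    {"user": "u2", "avg_speed_kmh": 52},
]

def make_longer(records, new_records):
    """Long data: more records of the same kind (same attributes)."""
    return records + new_records

def make_wider(records, extra_by_user):
    """Wide data: more attributes for the same records, joined on the user key."""
    return [{**r, **extra_by_user.get(r["user"], {})} for r in records]

# Longer: data from one more user.
longer = make_longer(users, [{"user": "u3", "avg_speed_kmh": 45}])

# Wider: additional information about a user who is already known.
wider = make_wider(users, {"u1": {"home_city": "Mannheim"}})

print(len(longer), wider[0])
```

Both helpers return new lists and leave the original records untouched.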
Big Data: Long vs. Wide Data
• Example: YouTube (owned by Google)
– Display videos to the user that are as interesting as possible
• Long data: users’ interaction histories
• Wide data: users’ interaction histories + Google Web searches + visited places + Google Play music preferences + ...
Big Data: Long vs. Wide Data
• Example: Facebook
– Display as much content of interest as possible
• Long data: user profile and interactions
• Wide data: user profile and interactions + WhatsApp chats
– In Germany, the OVG Hamburg (Higher Administrative Court) prohibits this combination!
Image: https://www.instagram.com/p/Bt3OG4DFOsK/
It’s All about Patterns in Data
• Examples
– Traffic movements
– Online user behavior
– Cliques in social networks
– …
• Methods:
– Data Mining
– Machine Learning
– …
→ Intensively researched since the 1980s
Image: https://factordaily.com/balaraman-ravindran-reinforcement-learning/
Take Aways
• Modern AI Systems
– Rely on massive amounts of data
– Processed with fairly simple algorithms
• Algorithms are often well known
– e.g., textbooks, research papers
– It is hard to own an algorithm
• Data is crucial
– Longer data (e.g., acquiring more customers)
– Wider data (e.g., merging businesses)
– It is easy to own data