1. How Big Data revolutionizes decision support in tourism
Prof. Dr. Wolfram Höpken
Hochschule Ravensburg-Weingarten
wolfram.hoepken@hs-weingarten.de
24th January 2017
2. Prof. Dr. Wolfram Höpken | Business Intelligence & Data Mining in Tourism
Credit card companies can predict divorce with 95% accuracy, two years out, based on your purchasing decisions
www.bearron.com
3. Agenda
Business intelligence & data mining in Tourism
Latest BI trends
Benefit and potential of BI in Tourism
Conclusion
5. Applications of BI & data mining in Tourism
Revenue management
• Explanation of booking and cancellation behavior
• Prediction of tourism demand
• Prediction of flight prices
(DINAMO: yield management system developed by American Airlines, 1988)
Product optimization & sales
• Explanation of tourists’ consumption behavior
• Optimization of product bundles / market basket analysis
• Cross selling
Customer relationship management
• Customer segmentation
• Adaptive marketing
6. Current situation
The big data potential
Explosive growth of available data on nearly all relevant tourism processes and activities
Transactions (booking, stay, consumption, etc.)
Navigation behavior on websites / online platforms
Customer feedback and product reviews
Increase in computing power and storage capacity
The challenge
This valuable information typically remains unused
“We are drowning in information but starved for knowledge” (John Naisbitt)
7. Agenda
Business intelligence & data mining in Tourism
Latest BI trends
Benefit and potential of BI in Tourism
Conclusion
8. Traditional BI applications
Characteristics of traditional BI applications
Focus: typical business transactions
Clear separation of operative and dispositive systems
Data: internal, structured
[Diagram: operative systems (OLTP): CRS, ERP, CRM, online platforms | data warehouse | dispositive systems (OLAP): reporting, OLAP, data mining]
9. Trend 1: Operational BI
Direct feedback into operative systems
Automatic consideration of analysis results within operative systems
• Dynamic price setting, yield management
• Intelligent product recommendations
• Personalisation of offers and marketing (targeting)
Stronger focus on analytical BI
• Prediction models for demand prediction
• Cluster analysis for customer segmentation
• Association rules for product recommendations and cross selling
[Diagram: BI architecture as on slide 8: operative systems (OLTP), data warehouse, dispositive systems (OLAP)]
10. Trend 2: Integration of big data sources
External data sources: web content
• User generated content (customer feedback / opinions)
• Data on markets and competitors (e.g. changes in demand structure, price changes)
[Diagram: BI architecture as on slide 8, extended by external data sources (web content)]
11. Trend 2: Integration of big data sources
External data sources: environment data
• Economic data (e.g. GDP, employment data in sending countries)
• Weather data (historic weather data and weather forecasts)
[Diagram: BI architecture as on slide 8, extended by external data sources (web content, environment data)]
12. Trend 2: Integration of big data sources
Interactions with environment: local infrastructure
• Interactions with local infrastructure (light, air conditioning, minibar, stereo equipment, TV, telephone, etc., e.g. in a hotel room)
[Diagram: BI architecture as on slide 8, extended by external data sources (web content, environment data) and interactions with environment (local infrastructure)]
13. Trend 2: Integration of big data sources
Interactions with environment: movement profiles
• Location tracking (GPS-based)
• Reaching POIs (QR code/RFID/NFC-based)
[Diagram: BI architecture as on slide 8, extended by external data sources (web content, environment data) and interactions with environment (local infrastructure, movement profiles)]
14. Trend 2: Integration of big data sources
Typical characteristics of big data sources
• Often unstructured (web content)
• Very large data volumes
• External
[Diagram: operative systems (OLTP): CRS, ERP, CRM, online platforms | external data sources: web content, environment data | interactions with environment: local infrastructure, movement profiles | data warehouse | dispositive systems (OLAP): reporting, OLAP, data mining]
15. Agenda
Business intelligence & data mining in Tourism
Latest BI trends
Benefit and potential of big data in Tourism
Conclusion
16. Revenue Management
Prediction of demand based on Google search volume
Prediction based on the search terms "Hotel/Hostel/Pension in Berlin"
Uses the data mining techniques artificial neural networks and k-nearest neighbor, and the statistical approach linear regression
Achieves satisfactory results: relative error of 5.68% compared to 3.58% for the autoregressive approach
Enables predictions under changing conditions or singular events
Enables identification of the most important search terms driving tourist arrivals
[Chart: Tourist arrivals; axes: arrivals, Google search volume]
17. Revenue Management
Prediction of demand based on big data sources
Predicting tourist arrivals based on past arrivals and big data
Data sources used: Google online traffic, jet fuel price, GDP of sending countries, price level of destination & alternative destinations
[Charts: Tourist arrivals and Google online traffic; Tourist arrivals and jet fuel price]
18. Revenue Management
Prediction of demand based on big data sources
Data mining technique k-nearest neighbour (k-NN) as prediction method
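As a rough illustration of k-NN used as a prediction method, the sketch below predicts arrivals as the mean of the k most similar past periods. The feature choice (arrivals twelve months earlier, a search volume index), the min-max scaling and all numbers are assumptions made for the example, not the setup used in the study.

```python
import math

def knn_predict(history, query, k=3):
    """Predict a target as the mean target of the k nearest feature vectors.

    history: list of (features, target) pairs; features are equal-length tuples.
    query:   feature tuple for the period to predict.
    """
    dims = range(len(query))
    # Min-max scale each feature so that no single feature dominates the distance
    lows = [min(f[d] for f, _ in history) for d in dims]
    highs = [max(f[d] for f, _ in history) for d in dims]

    def scale(features):
        return [
            (features[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in dims
        ]

    q = scale(query)
    by_distance = sorted(history, key=lambda pair: math.dist(scale(pair[0]), q))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical records: (arrivals 12 months earlier, search volume index) -> arrivals
history = [
    ((1000, 40), 1050),
    ((1200, 55), 1300),
    ((1500, 70), 1600),
    ((1100, 45), 1150),
    ((1600, 80), 1750),
]
forecast = knn_predict(history, query=(1250, 58), k=2)
print(forecast)
```

Additional data sources such as jet fuel price or GDP would simply become further entries in each feature tuple.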
19. Revenue Management
Prediction of demand based on big data sources
Including big data sources significantly increases prediction performance
MAE (mean absolute error) over all sending countries for the prediction method k-NN is reduced from 620 to 432, i.e. by about 30%
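The 30% figure follows directly from the two MAE values. A minimal sketch of both calculations is given below; the function names are illustrative, and only the values 620 and 432 come from the slide.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_reduction(baseline, improved):
    """Share by which an error measure drops relative to the baseline."""
    return (baseline - improved) / baseline

# MAE without and with big data sources, as reported on the slide
reduction = relative_reduction(620, 432)
print(f"{reduction:.1%}")
```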
20. Revenue Management
Estimation of demand based on customer reviews
Customer reviews enable short-term estimation of tourist arrivals, esp. in extraordinary situations
[Charts: Customer feedback region Halland (2003-2016); Customer feedback region Halland (2014-2016); Tourist arrivals region Halland (2014-2016); Cross-correlogram arrivals - feedback, y-axis from -1 to 1, x-axis: time lag in months (0 to 5)]
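A cross-correlogram of arrivals and feedback can be sketched as the Pearson correlation between the two monthly series at increasing time lags. The lag direction (feedback following arrivals) and the data are assumptions for the example.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def cross_correlogram(arrivals, feedback, max_lag=5):
    """Correlate arrivals[t] with feedback[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        n = len(arrivals) - lag
        result[lag] = pearson(arrivals[:n], feedback[lag:lag + n])
    return result

# Hypothetical monthly series: feedback counts trail arrivals by one month
arrivals = [10, 30, 60, 40, 20, 10, 30, 60, 40, 20, 10, 30]
feedback = [1, 1, 3, 6, 4, 2, 1, 3, 6, 4, 2, 1]
correlogram = cross_correlogram(arrivals, feedback)
best_lag = max(correlogram, key=correlogram.get)
print(best_lag)
```

The lag with the highest correlation indicates how many months the review volume trails (or, with the series swapped, leads) the arrivals.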
21. Optimization of real product
Analysis of customer feedback (sentiment analysis)
Extraction of customer feedback from review platforms
Preprocessing
• Tokenizing, stop word removal, stemming
• TF-IDF word vector creation
• POS tagging (part-of-speech)
• N-gram creation
Classification into topic, subjectivity and sentiment
• Support vector machines (SVM)
• Naïve Bayes
• K-nearest neighbour (k-NN)

Task                     Method                         Accuracy
Topic detection          SVM (with POS tagging)         72.36%¹
Topic detection          Naïve Bayes (with POS tagging) 49.72%¹
Topic detection          k-NN (with k = 8)              57.08%¹
Topic detection          Dictionary-based               71.28%²
Subjectivity detection   SVM                            65.50%¹
Subjectivity detection   Naïve Bayes                    60.67%¹
Subjectivity detection   k-NN (with k = 5)              55.50%¹
Subjectivity detection   Dictionary-based               82.63%²
Sentiment detection      SVM (with bigrams)             76.80%¹
Sentiment detection      Naïve Bayes (with trigrams)    69.80%¹
Sentiment detection      k-NN (with k = 8)              69.60%¹
Sentiment detection      Dictionary-based               71.28%²
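Part of the preprocessing chain can be sketched in plain Python: tokenizing, stop word removal, bigram creation and TF-IDF weighting. Stemming, POS tagging and the classifiers themselves are left out, and the stop word list, the weighting variant tf * log(N / df) and the sample reviews are assumptions made for the example.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "was", "and", "is", "very", "but"}  # tiny illustrative list

def tokenize(text):
    """Lowercase, split on non-letters, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def with_bigrams(tokens):
    """Append adjacent-token bigrams to the unigram list."""
    return tokens + [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

def tfidf(documents):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    term_lists = [with_bigrams(tokenize(d)) for d in documents]
    # Document frequency: in how many documents each term occurs
    df = Counter(term for terms in term_lists for term in set(terms))
    n = len(documents)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        vectors.append({t: (c / len(terms)) * math.log(n / df[t]) for t, c in counts.items()})
    return vectors

# Hypothetical review sentences
reviews = [
    "The breakfast was very good",
    "The staff was friendly and the breakfast was good",
    "The room was dirty",
]
vectors = tfidf(reviews)
print(sorted(vectors[2]))
```

Terms that occur in only one review ("dirty") receive a higher weight than terms shared across reviews ("good"); vectors of this kind would then be passed to a classifier such as SVM, Naïve Bayes or k-NN.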
22. Optimization of real product
Analysis of customer feedback (sentiment analysis)
Detailed analysis of customer feedback (positive/negative statements)
23. Optimization of real product
Analysis of customer feedback (sentiment analysis)
Benchmarking along product topics
24. Optimization of real product
Dynamic topic detection
Identification of (fine-grained) topics mentioned in customer feedback (based on unsupervised learning techniques)

Approach                                                              Accuracy
Identification of frequent words (nouns only)                         82.86%
Keyword clustering (nouns only, sentence-based, k=80)                 88.45%
LSI - Latent Semantic Indexing (nouns only, sentence-based, k=80)     85.46%
NER - Named Entity Recognition (Naïve Bayes, 2 words +/- as context)  75.17%

[Diagram: fine-grained topics with keywords mapped to predefined high-level topics. Labels in the diagram: restaurant, service, staff, center, city, halmstad, station, train, walk, hotel, parking, dinner, food, breakfast, place, beach; food & beverage, staff, location, hotel]
25. Marketing & sales
Adaptive marketing and product recommendations based on consumption and movement patterns
Movement patterns extracted from Foursquare
Typical analyses
• Association rule analysis and sequential pattern mining to identify spatial behaviour and movement patterns
26. Marketing & sales
Movement patterns extracted from Flickr
Clustering of Flickr photo uploads (by DBSCAN)
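To make the clustering step concrete, here is a compact, unoptimized DBSCAN sketch. It groups points that have at least min_pts neighbours within distance eps and labels isolated points as noise. The coordinates (treated as planar, e.g. kilometres in a local grid) and both parameter values are assumptions for the example.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:  # j is a core point, so expand through it
                queue.extend(nearby)
    return labels

# Hypothetical photo upload positions
photos = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (5, 5), (5.1, 5), (5, 5.1), (9, 0)]
labels = dbscan(photos, eps=0.2, min_pts=3)
print(labels)
```

Unlike k-means, DBSCAN does not need the number of clusters in advance and leaves isolated uploads unassigned, which suits photo locations scattered around a few dense spots.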
27. Marketing & sales
Movement patterns extracted from Flickr
Association rules
Rule       Sup %   Conf %   Lift
1, 3 → 8   1       53.3     2.97
[Map: clusters 1, 3 and 8]
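The three rule measures can be computed from visit data as sketched below: support is the share of users matching the whole rule, confidence the share of antecedent matches that also contain the consequent, and lift the confidence divided by the consequent's overall share. The visit sets are hypothetical and do not reproduce the slide's figures.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_cons = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_ante
    lift = confidence / (n_cons / n)
    return support, confidence, lift

# Hypothetical data: each set holds the cluster ids one user uploaded photos in
visits = [
    {1, 3, 8}, {1, 3}, {1, 3, 8}, {2, 8}, {1, 2},
    {3, 5}, {4, 5}, {2, 4}, {5, 6}, {6, 7},
]
support, confidence, lift = rule_metrics(visits, {1, 3}, {8})
print(round(support, 2), round(confidence, 3), round(lift, 2))
```

Read the other way round, the slide's values (Conf 53.3%, Lift 2.97) imply that roughly 18% of all users visited cluster 8, since lift = confidence / share of the consequent.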
28. Marketing & sales
Movement patterns extracted from Flickr
More fine-grained clustering for city center (by k-means)
29. Marketing & sales
Movement patterns extracted from Flickr
Association rules
Rule       Sup %   Conf %   Lift
1, 2 → 3   1.6     100      7.86
[Map: clusters 1, 2 and 3]
30. Marketing & sales
Movement patterns extracted from Flickr
Sequential patterns
Frequent sequence                       Sup %
<Max-Joseph-Platz> <Odeonsplatz>        1.6
<Frauenkirche> <Hofbräuhaus>            1.3
<Frauenkirche> <Heilig-Geist-Kirche>    1.3
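Sequential patterns differ from association rules in that order matters. A minimal sketch, restricted to sequences of length two and using hypothetical visit trails, counts how many users visited one place before another; full sequential pattern mining algorithms handle longer sequences and are not reproduced here.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(sequences, min_support):
    """Return ordered pairs (a, b), a visited before b, with support >= min_support."""
    counts = Counter()
    for seq in sequences:
        # combinations() keeps list order; each pair counts once per sequence
        pairs = {(a, b) for a, b in combinations(seq, 2) if a != b}
        counts.update(pairs)
    n = len(sequences)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical trails: places in the order one user photographed them
trails = [
    ["Max-Joseph-Platz", "Odeonsplatz"],
    ["Frauenkirche", "Hofbräuhaus"],
    ["Max-Joseph-Platz", "Frauenkirche", "Odeonsplatz"],
    ["Odeonsplatz", "Max-Joseph-Platz"],
]
patterns = frequent_pairs(trails, min_support=0.5)
print(patterns)
```

The reversed trail in the last entry does not count towards the pattern, which is what separates a sequence from a plain co-occurrence.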
31. Agenda
Business intelligence & data mining in Tourism
Latest BI trends
Benefit and potential of BI in Tourism
Conclusion
32. Conclusion
Current trends
Tourists leave traces during nearly all touristic activities
Booking/consumption behavior, information needs, preferences, movement patterns, feedback, etc.
Big Data
Today all this information can technically be gathered and analysed
Improvement of decision support
Adaptation/optimization of operative processes and personalization of customer interactions (Operational BI)
Challenge: Evaluation of feasibility
Do the available data sources deliver the required knowledge and can the intended decision support or customer benefit be realized?