Automated hyperlink text analysis of city websites projected image representation on the web
1. ENTER 2016 Research Track Slide Number 1
Automated Hyperlink Text Analysis of City Websites – Projected Image Representation on the Web
Weismayer, Christian¹, Pezenka, Ilona², Loibl, Wilhelm³
¹ MODUL University Vienna, Austria
christian.weismayer@modul.ac.at, http://www.modul.ac.at
² FHWien der WKW, Austria
ilona.pezenka@fh-wien.ac.at, http://www.fh-wien.ac.at
³ University of Chester, UK
wilhelm.loibl@chester.ac.uk, http://www.chester.ac.uk
Outline
1. Introduction
2. Literature review
2.1. Destination image
2.2. Projected destination image on the web
3. Methodology and Results
4. Conclusion and Recommendations
5. Limitations and Further Research
6. References
Introduction
• Travel information is important for image formation (Baloglu, 1999; Baloglu & McCleary, 1999; Beerli & Martín, 2004; Fakeye & Crompton, 1991; Gartner, 1993)
• Image allows destinations to be differentiated (Baloglu & Brinberg, 1997)
• Image is a crucial factor in tourists’ destination choices (Crompton & Ankomah, 1993; Gartner, 1989; Goodall, 1988; Moutinho, 1987)
• Destination image influences travellers’ behaviour (Baloglu & McCleary, 1999; Bigné, Sanchez, & Sanchez, 2001; Gartner, 1989; Gunn, 1972; Hunt, 1975)
• City marketers need to understand their own image as well as competing images (Baloglu & Brinberg, 1997)
• Information agents differ in terms of cost/credibility... and their effect on the image formation process (Gartner, 1993)
• Website information influences destination image (Gretzel, Yuan, & Fesenmaier, 2000; Jeong, Holland, Jun, & Gibson, 2012; Kaplanidou & Vogt, 2006)
Destination image
• Difference between perceived image (tourist) and projected image (marketer) (Bramwell & Rawding, 1996)
• Projected image (Gunn, 1972): organic image (general, non-tourism) vs. induced image (influenced by tourism agencies’ marketing activities)
• Primary image (visiting experience) vs. secondary image (guide books, recommendations) (Phelps, 1986)
• Image is influenced by demand factors (tourist) and supply factors (destination) (Stabler, 1990)
• Push factors (intrinsic desire) and pull factors (destination attributes) (Dann, 1977)
Projected destination image on the web
• Manual content analysis of websites, including photos and text (Govers and Go, 2004)
• Content analysis (CATPAC) of online tourist information sources, including user-generated content such as travel blogs (Choi et al., 2007)
• Co-occurrence of connotative nouns and country names (automated web queries) (Mazanec, 2010)
• URL content in comparison with user-generated search engine log files (Xiang et al., 2009)
• Content analysis (CATPAC, WORDER) of tour operators’ websites (Stepchenkova and Morrison, 2006)
• Websites are important information sources prior to a visit (Beirne & Curry, 1999; Wang & Fesenmaier, 2004; Xiang, Gretzel, & Fesenmaier, 2008)
• DMOs need to consider the design and content of travel websites (Jeong et al., 2012, p. 25)
• Content is the most important website characteristic (Kaplanidou and Vogt, 2006)
• The majority of travel website studies focus on perceived rather than projected image, but more should explore the supply side (Tasci and Gartner, 2007)
=> The approach at hand uses hyperlink text, which is seen as an accurate website description (Brin and Page, 2012)
Methodology and Results
Pre-selection of 75 European city websites
Web crawler used URLs as seeds
Hyperlink text collected in July 2015
Software used: R (R Development Core Team, 2005)
Part-of-speech (POS) tagging not necessary: no information lost by splitting up 498,633 phrases into single terms
Numbers and XML tags (elements, attributes, special characters) removed
Minimum appearance constraint on the global frequency: terms showing up on fewer than 1/3 of the websites deleted
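The cleaning steps above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names, the regular expressions, and the toy link texts are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(phrase):
    """Strip tags, numbers and special characters, then split into single terms."""
    text = re.sub(r"<[^>]+>", " ", phrase)    # drop XML/HTML tags
    text = re.sub(r"[\W\d_]+", " ", text)     # drop digits and special characters
    return text.lower().split()

def filter_by_document_frequency(site_phrases, min_share=1/3):
    """Keep only terms that appear on at least `min_share` of the websites."""
    # Term counts per website (each website is one "document").
    site_terms = {
        site: Counter(term for phrase in phrases for term in tokenize(phrase))
        for site, phrases in site_phrases.items()
    }
    # Number of websites on which each term occurs at least once.
    doc_freq = Counter(term for counts in site_terms.values() for term in counts)
    n_sites = len(site_phrases)
    kept = {term for term, df in doc_freq.items() if df / n_sites >= min_share}
    return {
        site: {term: count for term, count in counts.items() if term in kept}
        for site, counts in site_terms.items()
    }

# Invented hyperlink texts for three hypothetical city websites.
site_phrases = {
    "city_a": ["<b>Book online</b>", "Airport bus 24"],
    "city_b": ["Airport & parking", "Nightlife"],
    "city_c": ["Airport", "Book a tour"],
}
filtered = filter_by_document_frequency(site_phrases, min_share=0.5)
print(filtered)
```

With the toy data, only "airport" (three of three sites) and "book" (two of three sites) pass a 0.5 threshold; the study itself used a threshold of 1/3.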
Synonyms (e.g. American “center” and British “centre”): collapsing synonyms into one term by ExactMatchFilter or ContainsFilter not useful (e.g. ‘car’ is contained in ‘a la carte’)
[package wordnet (Feinerer & Hornik, 2015b) used to search via the Jawbone Java WordNet API library (Wallace, 2007; Princeton University, 2010)]
Synsets (e.g. adjective opposites): useless as most terms are nouns
Stemming (e.g. “parking” to “park”): reducing words to their radicals by erasing suffixes not useful
[package SnowballC (Bouchet-Valat, 2013) using Porter’s word-stemming algorithm]
Stopwords: non-touristic terms deleted
Singular/plural terms and content-wise identical terms collapsed
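The ‘car’ in ‘a la carte’ problem is a general weakness of substring matching, which a short Python sketch can make concrete. The helper functions below are hypothetical stand-ins written for this example; they are not the filters of the wordnet package, and the hand-made mapping used for collapsing terms is likewise an assumption about how such a step could look.

```python
def contains_match(term, phrases):
    """Substring test: matches wherever `term` occurs anywhere inside a phrase."""
    return [p for p in phrases if term in p]

def whole_word_match(term, phrases):
    """Whole-word test: matches only if `term` is one of the phrase's words."""
    return [p for p in phrases if term in p.split()]

# Hypothetical hand-maintained mapping for spelling variants and singular/plural pairs.
CANONICAL = {"center": "centre", "bar": "bars", "sight": "sights"}

def collapse(term):
    """Map a term to its canonical form; unknown terms are returned unchanged."""
    return CANONICAL.get(term, term)

link_texts = ["car rental", "a la carte", "rent a car", "postcards"]
print(contains_match("car", link_texts))    # also hits "a la carte" and "postcards"
print(whole_word_match("car", link_texts))  # hits only the two car-related links
print([collapse(t) for t in ["center", "bar", "airport"]])
```

An explicit mapping keeps control over which terms are merged, which fits the slide's observation that automated synonym matching and stemming were not useful for this vocabulary.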
Unique hyperlink term frequencies on city websites (87 terms on a minimum of 24 websites out of 73 websites)
Overall absolute term frequencies (row sums of the term-document matrix)
Weighting
Packages lsa (Wild, 2015a) and tm (Feinerer & Hornik, 2015a)
Local (3): alters term frequencies of single cities
• Unmodified: strengthens varying frequencies
• Logarithm: reduces dominance of outliers
• Binary: terms show up or not (eliminates over- and underrepresentation)
Global (2): alters the way in which single cities come into play
• Length normalization (cells equal 1 divided by the document vector length): reduces dominance of city websites with many hyperlinks
• Inverse document frequency (cells equal 1 + log of the number of documents divided by the number of documents where the term appears): reduces impact of irrelevant terms and highlights discriminative ones
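The weighting schemes listed above can be sketched on a toy term-document matrix. This is a plain-Python illustration under stated assumptions, not the lsa package itself: the counts are invented, the logarithm is taken as the natural log of 1 + frequency, and "document vector length" is read as the Euclidean length of a website's column.

```python
import math

# Toy term-document matrix: rows are terms, columns are city websites (invented counts).
terms = ["airport", "nightlife", "sights"]
tdm = [
    [4, 1, 2],  # airport
    [0, 9, 0],  # nightlife
    [3, 0, 1],  # sights
]

# Local weightings: transform each cell on its own.
def lw_raw(m):
    return [row[:] for row in m]

def lw_log(m):
    # Assumption: natural log of 1 + frequency, which dampens large counts.
    return [[math.log(1 + x) for x in row] for row in m]

def lw_binary(m):
    return [[1 if x > 0 else 0 for x in row] for row in m]

# Global weightings: one factor per website (column) or per term (row).
def gw_length_normalisation(m):
    """1 / Euclidean length of each website's column (assumes no all-zero column)."""
    n_docs = len(m[0])
    return [1 / math.sqrt(sum(row[j] ** 2 for row in m)) for j in range(n_docs)]

def gw_idf(m):
    """1 + log(N / df) per term (assumes every term occurs on at least one website)."""
    n_docs = len(m[0])
    return [1 + math.log(n_docs / sum(1 for x in row if x > 0)) for row in m]

# Combine one local and one global scheme: log frequencies times inverse document frequency.
local = lw_log(tdm)
idf = gw_idf(tdm)
weighted = [[local[i][j] * idf[i] for j in range(len(tdm[0]))] for i in range(len(tdm))]
for term, row in zip(terms, weighted):
    print(term, [round(x, 3) for x in row])
```

In the toy data, "airport" appears on every website and therefore keeps the minimum inverse-document-frequency factor of 1, while "nightlife" appears on one website only and is weighted up as a discriminative term.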
Latent semantic hyperlink space (terms & city websites)
Hyperlink term dendrogram
(21 dimensions, city average: 14.34)
e.g.:
Online information: google, twitter, facebook
Transportation/infrastructure: bus, airport, parking, car
Going out: district, nightlife, bars, design
Sightseeing: excursions, sights
General information/trip conditions: booking, online, press, guided
Trip planning: day, free, calendar, history
Cultural resources: exhibitions, theatre, wine, website, palace, food, stay, trips, train, discover
(some dimensions match with Serna, Marchiori, Gerrikagoitia, Alzua-Sorzabal, & Cantoni, 2015)
Example based on the original observed term frequency

                    Sightseeing           Going-out
City                Excursions  Sights    Nightlife  District  Design  Bars
Salzburg            2           2         0          0         0       0
Palma de Mallorca   0           2         3          1         1       1

Representation strength of single dimensions on each city website
Salzburg promotes the image of a historic place, emphasis on sights and museums
Palma de Mallorca highlights its nightlife and going-out possibilities
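One simple way to turn the observed frequencies above into a per-dimension strength is to sum the term counts within each dimension. The slides do not state the exact aggregation rule, so the summation below is an assumption made for illustration; the counts are taken from the example table.

```python
# Terms grouped by dimension, as in the example table.
dimensions = {
    "Sightseeing": ["excursions", "sights"],
    "Going-out": ["nightlife", "district", "design", "bars"],
}

# Observed hyperlink term frequencies from the example table.
observed = {
    "Salzburg": {"excursions": 2, "sights": 2, "nightlife": 0,
                 "district": 0, "design": 0, "bars": 0},
    "Palma de Mallorca": {"excursions": 0, "sights": 2, "nightlife": 3,
                          "district": 1, "design": 1, "bars": 1},
}

def dimension_strength(term_counts, dimensions):
    """Sum the term frequencies belonging to each dimension (assumed aggregation rule)."""
    return {
        name: sum(term_counts.get(term, 0) for term in members)
        for name, members in dimensions.items()
    }

strengths = {city: dimension_strength(counts, dimensions)
             for city, counts in observed.items()}
for city, result in strengths.items():
    print(city, result)
```

Under this rule Salzburg scores 4 on Sightseeing and 0 on Going-out, while Palma de Mallorca scores 2 and 6, which matches the contrast described on the slide.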
Conclusion and Recommendations
• Hyperlink text analysis complements the prevalent content analyses of websites
• Hyperlink text is an adequate analytical tool for analysing a destination’s projected image
• Hyperlink text analysis provides a quick way of regularly checking the image communicated via websites
• Link denotations refer to website content and permit indications in terms of the image positioning strategy intended by DMOs
Limitations
• Presumption that hyperlink text represents the overall website and permits indications regarding the image positioning strategies
• Items of some dimensions do not have clear commonalities
Further Research
• Categorization optimization using adapted ontologies (Serna et al., 2015)
• Replication studies to test the applicability to other destinations
• Comparison study to verify the image aspects (e.g. exploration of the demand side)
• Comparison of the whole textual content of city sub-websites with the hyperlink text information to verify the dimensions
References
Baloglu, S. (1999). A path analytic model of visitation intention involving information sources, sociopsychological motivations, and destination image. Journal of Travel and Tourism Marketing, 8(3), 81–90.
Baloglu, S., & Brinberg, D. (1997). Affective Images of Tourism Destination. Journal of Travel Research, 35(4), 11-15.
Baloglu, S., & McCleary, K. W. (1999). A model of destination image formation. Annals of Tourism Research, 26(4), 868–897.
Beerli, A., & Martín, J. D. (2004). Factors influencing destination image. Annals of Tourism Research, 31(3), 657–681.
Beirne, E., & Curry, P. (1999). The Impact of the Internet on the Information Search Process and Tourism Decision-Making. In D. Buhalis & W. Schertler (Eds.), Information and Communication Technologies in Tourism. Wien, Austria: Springer.
Bigné, J., Sanchez, M., & Sanchez, J. (2001). Tourism Image, Evaluation Variables and After Purchase Behavior: Inter-relationships. Tourism Management, 22(6), 607-616.
Bouchet-Valat, M. (2013). SnowballC: Snowball stemmers based on the C libstemmer UTF-8 library. R package version 0.5.1.
Bramwell, B., & Rawding, L. (1996). Tourism marketing images of industrial cities. Annals of Tourism Research, 23(1), 201-221.
Brin, S., & Page, L. (2012). Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks, 56(18), 3825-3833.
Buhalis, D., & Law, R. (2008). Progress in information technology and tourism management: 20 years on and 10 years after the Internet—The state of eTourism research. Tourism Management, 29(4), 609–623.
Choi, S., Lehto, X. Y., & Morrison, A. M. (2007). Destination image representation on the web: content analysis of Macau travel related websites. Tourism Management, 28(1), 118–129.
Chon, K. (1990). The role of destination image in tourism: a review and discussion. Tourism Review, 45(2), 2-9.
Crompton, J. L. (1979). Motivations for pleasure vacation. Annals of Tourism Research, 6(4), 408-424.
Crompton, J. L., & Ankomah, P. (1993). Choice Set Propositions in Destination Decisions. Annals of Tourism Research, 20(3), 461-476.
Dann, G. M. (1977). Anomie, ego-enhancement and tourism. Annals of Tourism Research, 4(4), 184-194.
Dann, G. M. (1996). Tourists' images of a destination-an alternative analysis. Journal of Travel & Tourism Marketing, 5(1-2), 41-55.
Echtner, C. M., & Ritchie, J. B. (1991). The meaning and measurement of destination image. The Journal of Tourism Studies, 2(2), 2-12.
Fakeye, P. C., & Crompton, J. L. (1991). Image difference between prospective, first-time and repeat visitors to the Lower Rio Grande Valley. Journal of Travel Research, 30(2), 10–16.
Feinerer, I., Hornik, K. & Meyer, D. (2008). Text Mining Infrastructure in R. Journal of Statistical Software, 25(5): 1-54.
Feinerer, I. & Hornik, K. (2015a). tm: Text Mining Package. R package version 0.6-2.
Feinerer, I. & Hornik, K. (2015b). wordnet: WordNet Interface. R package version 0.1-10.
Frías, D. M., Rodríguez, M. A., & Castaneda, J. A. (2008). Internet vs. travel agencies on pre-visit destination image formation: An information processing view. Tourism Management, 29(1), 163–179.
Gartner, W. C. (1993). Image formation process. Journal of Travel and Tourism Marketing, 2(2/3), 191–215.
Gartner, W. (1989). Tourism image: Attribute Measurement of State Tourism Products using Multidimensional Scaling Techniques. Journal of Travel Research, 28(2), 16-20.
Goodall, B. (1988). How Tourists choose Holidays. In B. Goodall, & G. Ashworth (Eds.), Marketing in the Tourism Industry. London: Routledge.
Govers, R., & Go, F. M. (2004). Projected destination image online: Website content analysis of pictures and text. Information Technology & Tourism, 7(2), 73-89.
Gretzel, U., Yuan, Y., & Fesenmaier, D. (2000). Preparing for the New Economy: Advertising Strategies and Change in Destination Marketing Organizations. Journal of Travel Research, 39(2), 146-156.
Gunn, C. (1972). Vacationscape: Designing Tourist Regions. New York: Van Nostrand.
Hunt, J. (1975). Images as Factor in Tourism Development. Journal of Travel Research, 13(3), 1-7.
ITB World Travel Trends Report 2014/2015 (n.d.). Retrieved July 25, 2015, from http://www.it-berlin.de/media/itb/itb_dl_de/itb_itb_berlin/itb_itb_academy/ITB_2015_WTTR_Report_A4_4.pdf
Jeong, C., Holland, S., Jun, S. H., & Gibson, H. (2012). Enhancing Destination Image through Travel Website Information. International Journal of Tourism Research, 14(1), 16–27.
Kaplanidou, K., & Vogt, C. (2006). A Structural Analysis of Destination Travel Intentions as a Function of Web Site Features. Journal of Travel Research, 45(2), 204-216.
Law, R., Qi, S., & Buhalis, D. (2010). Progress in tourism management: A review of website evaluation in tourism research. Tourism Management, 31(3), 297–313.
Mazanec, J. A. (2010). Tourism-Receiving Countries in Connotative Google Space. Journal of Travel Research, 49(4), 501-512.
Moutinho, L. (1987). Consumer Behavior in Tourism. European Journal of Marketing, 21(10), 3-44.
Phelps, A. (1986). Holiday destination image - the problem of assessment: an example developed in Menorca. Tourism Management, 7(3), 168-180.
Pike, S., & Ryan, C. (2004). Destination positioning analysis through a comparison of cognitive, affective, and conative perceptions. Journal of Travel Research, 42(4), 333-342.
Princeton University (2010). About WordNet. Princeton University. Retrieved from http://wordnet.princeton.edu.
R Development Core Team (2005). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, available at: www.R-project.org (accessed 16 July 2015).
Serna, A., Marchiori, E., Gerrikagoitia, J. K., Alzua-Sorzabal, A., & Cantoni, L. (2015). An Auto-Coding Process for Testing the Cognitive-Affective and Conative Model of Destination Image. In Information and Communication Technologies in Tourism 2015. Springer International Publishing.
Stabler, M. J. (1990). The image of destination regions: theoretical and empirical aspects. In B. Goodall, & G. Ashworth (Eds.), Marketing in the tourism Industry: the promotion of destination regions. London: Routledge.
Stepchenkova, S., & Morrison, A. M. (2006). The destination image of Russia: From the online induced perspective. Tourism Management, 27(5), 943–956.
Tasci, A. D. A., & Gartner, W. C. (2007). Destination Image and its Functional Relationships. Journal of Travel Research, 45(4), 413-425.
Wallace, M. (2007). Jawbone Java WordNet API. Retrieved from http://mfwallace.googlepages.com/jawbone.
Wang, Y., & Fesenmaier, D. (2004). Towards understanding members’ general participation in and active contribution to an online travel community. Tourism Management, 25(6), 709–722.
Wild, F. (2015a). An Open Source LSA Package for R. R package version 0.73.1.
Wild, F. (2015b). An Open Source LSA Package for R. Reference manual.
Xiang, Z., Gretzel, U., & Fesenmaier, D. (2009). Semantic Representation of Tourism on the Internet. Journal of Travel Research, 47(4), 440-453.