Towards a better understanding of the cognitive destination image of the Basque Country based on the analysis of UGC
1. Towards a better understanding of the cognitive
destination image of the Basque Country based on
the analysis of UGC
Ainhoa Serna Ph.D.
Aurkene Alzua, Ph.D
Jon Kepa Gerrikagoitia, Ph.D
ENTER 2014 Research Track
Slide Number 1
3. Introduction
Opinions and UGC
• Opinions and experiences are central to almost all human
activities and are key factors in influencing our behavior. Our beliefs
and perceptions of reality, and the decisions we make, are largely
conditioned by how others see and evaluate the world.
• When we have to make a decision, we often seek the opinions of others
(Liu, 2012).
• Consumers no longer rely exclusively on official sources to obtain
information about destinations and services. Visitors give greater
consideration to UGC because those comments are based on
experience (Pan et al., 2008), which gives them a more complete image of the
destination and its products (Stepchenkova, 2013; Pang et al., 2011).
4. Introduction
Social Networks
• With the explosion of Web 2.0, consumers have a place
to share their experiences (Çakmak, 2012). Thus,
user opinion gains weight against traditional marketing
techniques (Ditoiu et al., 2012).
• Social networks directly affect brands and
e-commerce. 65% of active Facebook accounts follow a
brand (Llantada, 2013). Facebook reaches 85% of Internet
users and Twitter 32%.
• Social networks support the complete cycle of tourism:
inspiration, planning, booking, stay and content
generation based on experiences at the destination.
5. Introduction
Social Networks
• Much of the UGC is generated on social networks
such as Twitter and Facebook. Millions of users use
them every day, and this information is valuable
for modeling the vision of the tourist
destination.
• The volume of UGC (reviews, forums, blogs, tweets,
comments, and posts on social networks) has
reached astronomical proportions (Liu, 2012),
making it impossible to monitor opinions
manually (Subrahmanian, 2009).
6. Introduction
Destination Image
• UGC and social networks have become a valuable source of
insights about the image of the destination.
• The analysis of discourses on the web provides an approach to
extract information and knowledge about how the destination
is perceived, motivations, and underlying discourses.
• Proper management of the perceptions or images that a potential
visitor can have of a destination contributes to creating a distinct
positioning (Stepchenkova and Morrison, 2008); accordingly, the
destination image is presented as a key factor (Echtner and Ritchie,
1991, 1993; Chon, 1991).
• Despite the great importance of the destination image (Gartner,
1986, 1987), there is no universally accepted and validated model
(Crompton, 1977; Baloglu & McCleary, 1999a).
» Chosen reference model: Gómez, M. et al. (2013).
7. Introduction
Aim
• Our aim is (what):
• to model the cognitive destination image of the
Basque Country held by visitors,
• taking as reference the conceptual model
proposed by Gomez, M. et al. (2013),
• based on data collected from relevant digital
media: destination sites, general social networks,
traveler blogs, reviews, etc.
8. Introduction
Aim
• This work involves (how):
• automatic semantic identification of
opinions from social networks
• using a text mining tool adapted to the
domain of the destination,
• developed by CICtourGUNE and
Mondragon Unibertsitatea.
9. Introduction
Previous Research
• Destination image has been defined as an individual’s overall perception or total
set of impressions of a place (Hunt, 1975; Phelps, 1986; Fakeye and Crompton,
1991)
• Research has been conducted since the early 1970s (Hunt, 1975; Mayo, 1973) and
comprises hundreds of studies to date, using several assessment methods
– Most tourism scholars agree that destination image holds at least two
distinctive components – cognitive (knowledge and beliefs) and affective
(feelings towards a destination).
– Other authors proposed that destination images are formed by four
interrelated components: cognitive, affective, evaluative and behavioural
(Boulding, 1956).
– Echtner and Ritchie (2003) argue that a destination image is processed in a
three dimensional space: namely functional, psychological and holistic.
10. Introduction
Previous Research
• From the cognitive perspective, Gomez, M. et al. (2013) propose a destination
image model based on 4 dimensions: Natural and Cultural Resources,
Infrastructures and Socio-economic Environment, Social Conditions, and
Atmosphere.
• The literature considers the amount and variety of secondary data sources as
an external variable that contributes significantly to the formation of the image
(Gartner & Hunt, 1987; Um & Crompton, 1990; Bojanic, 1991; Gartner, 1993;
Font, 1997; Baloglu, 1999):
• Stepchenkova (2012) used UGC and social networks (Flickr) to create maps
representing the projected and perceived image of a destination.
• Andrade (2010) uses destination websites to build the tourist's image of the
destination.
• Xiang & Gretzel (2010) used eWOM, specifically travel blogs, in the
evaluation of a destination image.
11. Methodology
• The model is generated in the
following steps:
1. Defining and selecting relevant data
2. Extracting categories from the selected data
sources
3. Transforming the categories into a model
12. Methodology
Sources
3.1 Defining and selecting task-relevant data
– A prior analysis was carried out, taking into account the
quality and quantity of the data from a double
perspective: the image projected by the DMO
(Destination Management Organization) and the
image perceived by the visitor.
14. Methodology
Data Gathering
3.2 Extracting categories from the selected
data sources
•3.2.1 Data gathering
– The data extraction is an automatic process
achieved by a software program
15. Methodology
Data Gathering
Different data gathering techniques are used depending on
the source:
• Minube provides an API (Application Programming
Interface) allowing searches at the territory (Araba,
Bizkaia, Gipuzkoa) and city (Bilbao, San Sebastián,
Vitoria-Gasteiz) levels of disaggregation.
• Data from Twitter is extracted using the Twitter4J API
(Java library).
• Facebook is queried through the Facebook Graph
API.
• The remaining data sources are dumped using ad hoc
web crawling and scraping techniques.
16. Methodology
Entity recognition and categorization
3.2.2 Entity Categorization
•The process of grouping entities into entity categories is called entity
categorization.
•First, entities are extracted through the named entity recognition (NER)
process (Hobbs and Riloff, 2010; Mooney and Bunescu, 2005; Sarawagi, 2008).
•Once the recognized entities are classified, a list of categories, entities and
number of occurrences is obtained.
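The output of this step (a list of categories, entities and occurrence counts) can be illustrated with a minimal Python sketch. It is not the authors' tool: a plain dictionary lookup stands in for the NER process, and the `ENTITY_CATEGORIES` lexicon and the `categorize` function are hypothetical names, populated only with sample entities mentioned in the notes.

```python
from collections import Counter

# Hypothetical lexicon (entity -> category); a real system would rely on
# NER plus an ontology rather than a hand-written dictionary.
ENTITY_CATEGORIES = {
    "sidra": "gastronomy",
    "café": "gastronomy",
    "patxaran": "gastronomy",
    "vistas": "natural environment",
    "paisaje": "natural environment",
}

def categorize(tokens):
    """Return sorted (category, entity, occurrences) rows for known entities."""
    counts = Counter(
        t.lower() for t in tokens if t.lower() in ENTITY_CATEGORIES
    )
    return sorted((ENTITY_CATEGORIES[e], e, n) for e, n in counts.items())

rows = categorize("Sidra y café con vistas , más sidra".split())
for category, entity, occurrences in rows:
    print(category, entity, occurrences)
```

Each row corresponds to one line of the per-source table of entities, occurrences and categories.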
17. Methodology
Entity recognition and categorization
• Instruments enhanced and adapted by the research group:
• An ontology to obtain the categories.
• A text mining tool based on the WordNet lexical database
that implements an algorithm for learning generalized
association rules (hierarchy for categorization and pairs for
association)
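As a rough illustration of the idea behind generalized association rules (a hierarchy plus co-occurring pairs), the sketch below extends each set of entities with its ancestors in a taxonomy and then measures pair support. This is one common formulation, assumed here for illustration and not taken from the authors' tool; the `PARENT` taxonomy, the sample transactions and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical one-level taxonomy: item -> parent category.
PARENT = {"sidra": "gastronomy", "café": "gastronomy", "vistas": "nature"}

def extend(transaction):
    """Add every ancestor of each item, so pairs can form at category level."""
    items = set(transaction)
    for item in transaction:
        while item in PARENT:
            item = PARENT[item]
            items.add(item)
    return items

def pair_support(transactions):
    """Fraction of transactions that contain each unordered pair of items."""
    counts = Counter()
    for transaction in transactions:
        for a, b in combinations(sorted(extend(transaction)), 2):
            # Skip trivial pairs made of an item and its direct parent.
            if PARENT.get(a) != b and PARENT.get(b) != a:
                counts[(a, b)] += 1
    return {pair: n / len(transactions) for pair, n in counts.items()}

# Each transaction is the set of entities found in one comment.
support = pair_support([["sidra", "vistas"], ["café", "vistas"], ["sidra"]])
print(support[("gastronomy", "nature")])
```

In this toy data the category-level pair (gastronomy, nature) is supported by two of the three comments, although no single entity pair occurs more than once, which is the benefit of generalizing through the hierarchy.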
18. Methodology
3.3 Transforming the categories into a model
The categories and subcategories are automatically matched with the four
dimensions of the Gomez, M. et al. (2013) model.
1. The first dimension (REC) represents the natural and cultural
resources (attractions), including nature, architecture, cultural
attractions, customs and cultural activities.
2. The second dimension (INF) represents infrastructures and the
socioeconomic environment: accommodation, entertainment,
shopping, development and value for money.
3. The third dimension (SOC) represents social conditions:
sustainability, cleanliness, security, pollution ...
4. The fourth dimension (ATM) represents the atmosphere: relaxation,
absence of crowds, “peace and rest”.
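The matching of categories to the four dimensions can be pictured as a lookup followed by an aggregation. The following Python sketch assumes a hand-written `DIMENSION_OF` table built from the examples listed above; the table, the sample counts and the `UNMAPPED` bucket are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# Hypothetical category -> dimension table using the codes from the slide.
DIMENSION_OF = {
    "nature": "REC",
    "architecture": "REC",
    "cultural attractions": "REC",
    "accommodation": "INF",
    "shopping": "INF",
    "security": "SOC",
    "pollution": "SOC",
    "peace and rest": "ATM",
}

def dimension_totals(category_counts):
    """Sum category occurrences per model dimension."""
    totals = Counter()
    for category, occurrences in category_counts.items():
        # Categories missing from the table are kept visible, not dropped.
        totals[DIMENSION_OF.get(category, "UNMAPPED")] += occurrences
    return dict(totals)

totals = dimension_totals(
    {"nature": 5, "shopping": 2, "security": 1, "nightlife": 3}
)
print(totals)
```

Keeping an explicit `UNMAPPED` bucket makes it easy to spot categories that still need to be assigned to a dimension.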
20. Evaluation Results
• The resulting model allows us to approximate tourists' cognitive
destination image of the Basque Country,
based on user-generated content about experiences
related to the Basque territory.
21. Evaluation Results
[Figure labels: Museo Guggenheim, San Juan de Gaztelugache, Torres Isozaki, Bosque de Oma, Playa de Sopelana, Santuario de Urkiola]
23. Evaluation Results
• Language: since about 70% of the visitors to Euskadi
are Spanish speakers, the sample of
elaborated discourse taken into account is
almost completely in Spanish.
• The projected image on Twitter and Facebook
uses English to translate original messages in
Spanish. The projected image is fostered by
ambassadors who are residents, and the Basque
language (eu) appears in the discourses.
24. Evaluation Results
• TripAdvisor is the network with the largest number of
comments, with little influence or presence of the DMO
(Destination Management Organization) and with spontaneous
user-generated content; it is thus an important data
source for understanding the perceived image of the destination.
• The DMO makes an important communication and
promotion effort on Facebook and Twitter.
• The type of discourse is mostly related to the source, through
the type of messages or the constraints of the channel.
More elaborated discourses show more and better
described entities.
25. Evaluation Results
The categorization process yields, for each data source, a table
with the identified entities, their number of occurrences, and the
corresponding category
27. Conclusions and Future lines
• TripAdvisor is the social network with the largest
number of purely user-generated comments.
• The DMO has little influence and presence on
TripAdvisor.
• The DMO makes an important effort to promote the
destination using Facebook and Twitter, and
this activity has a direct influence on the projected
image.
28. Conclusions and Future lines
• The type of discourse is related to the
source, by the type of messages or the
constraints of the channel. More
elaborated discourses -> better described
entities.
• All the analyzed sources have a major
impact on the first dimension concerning
natural and cultural resources.
29. Conclusions and Future lines
• The DMO's efforts are focused on the first
dimension, everything related to resources
such as Nature, Architecture, Cultural
Attractions, Traditions, Cultural Activities.
• The broadcasting of the event agenda by
the DMO has a big presence on Facebook
and Twitter.
30. Conclusions and Future lines
• The cultural attributes of the destination
are relevant in the discourses.
• There is a significant interest in the
subcategory of cultural activities: theater,
exhibition, jazz, ... etc.
31. Conclusions and Future lines
• The third and fourth dimensions may be
considered in the communication strategy of the
DMO.
• Within the natural and cultural resources,
gastronomy becomes the most relevant category
by the quantity of related terms and concepts
that appear, especially in the content
generated by visitors in all sources.
32. Conclusions and Future lines
• The named entity recognition and
categorization process provides many
variables that are out of the scope of this
paper and will be studied in future work,
such as the sentiment of the discourse, the most
commented points of interest, and the
attractiveness of a place.
33. Conclusions and Future lines
• To sum up, the analysis of user-generated
content using domain-oriented
text mining tools provides an approach to
understanding the destination through what
people say and perceive and what the DMO
communicates.
Editor's Notes
User-generated content (UGC), and online comments in particular, have brought substantial changes to the dynamics of entire sectors. Such is the case of the travel and tourism sector.
Consumers no longer rely exclusively on official sources to obtain information about destinations and services. Thanks to new technologies and the growing number of platforms that support communication between different stakeholders, consumers can now access vast amounts of information about products from different sources. According to some studies, visitors give greater consideration to UGC because those comments are based on experience (Pan et al., 2008). This allows consumers to obtain a more complete image of the destination and its products (Stepchenkova, 2013; Pang et al., 2011).
With the explosion of Web 2.0, blogs and social networks, consumers have a place to share their experiences with different brands, giving their opinions, positive or negative, about any product or service (Çakmak, 2012). Thus, user opinion gains weight against brands and traditional marketing techniques (Ditoiu et al., 2012). According to the National Statistics Institute (INE) of Spain, 80% of Internet users access the net to learn about products, brands and services. In another study, conducted in 2009 by the Association for Research in Media (AIMC), 75.5% of Spanish Internet users had researched on the internet before purchasing products or services.
and as the mental portrayal of a destination (Crompton, 1979; Woodside and Ronkainen, 1993; Kotler et al, 1993; Middleton, 1994; Milman and Pizam, 1995; Alhemoud and Armstrong, 1996; Seaton and Bennett, 1996).
(structured surveys of human respondents, mixed-methods approach textual data from blogs and websites)
Our aim is to model the cognitive destination image of the region of Euskadi in Spain held by visitors, taking as reference the conceptual model proposed by Gomez, M. et al. (2013), based on data collected from different digital media: destination sites, general social networks, traveler blogs, reviews, etc. To achieve this, it is necessary to investigate the meaning of the data in order to generate a model that helps in understanding the cognitive image of the destination.
In this analysis different channels were taken into account: channels focused on tourism, channels run by DMOs, DMO-specific channels on general social networks, ...
In this step the information is extracted either through APIs ("Application Programming Interfaces") or web scraping techniques, depending on the data retrieval interfaces offered by the data providers.
just providing the user account, in this case “TvEuskadi”. The Twitter API limits access to the latest 3,200 statuses of a user; therefore this is the sample recovered from the 7,266 tweets published on this channel until 26/04/2013.
This API exposes the Facebook Social Graph in a simple and consistent way. The social graph represents graph objects (e.g., people, photos, events, pages, etc.) and the connections among them (e.g., friend relationships, shared content, labels, forums, etc.).
The sample periodicity is defined according to the activity of the channel in terms of creation of new contents.
Furthermore, in WordNet, nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms named synsets. This grouping is necessary because, in natural language text, people often write the same entity in different ways.
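The role of synsets can be sketched as a normalization step that maps every surface form to one representative before counting. The example below uses a tiny hand-made `SYNSETS` list rather than the real WordNet database; both the list and the choice of the alphabetically first word as representative are assumptions made for illustration.

```python
# Hypothetical synsets; a real system would load them from WordNet.
SYNSETS = [
    {"bocadillo", "sandwich"},
    {"café", "cafe"},
]

def build_index(synsets):
    """Map each word to the alphabetically first member of its synset."""
    index = {}
    for synset in synsets:
        canonical = min(synset)
        for word in synset:
            index[word] = canonical
    return index

INDEX = build_index(SYNSETS)

def normalize(tokens):
    """Lowercase tokens and replace known variants with their representative."""
    return [INDEX.get(t.lower(), t.lower()) for t in tokens]

normalized = normalize(["Sandwich", "bocadillo", "sidra"])
print(normalized)
```

After normalization, "Sandwich" and "bocadillo" count as two occurrences of the same entity instead of one occurrence each of two different entities.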
As can be observed in the sample text (written in Spanish), categorization is based on the listed entities. There are some location names such as "Borda de Urbía", “Arantzatzu”, “Aitzkorri”. There are also common nouns such as “hora”, “caminata”, “casona”, “café”, “sidra”, “patxaran”, “bocadillo”, “plato combinado”, “vistas”, “foto”, “tranquilidad”, “precio”, which are entities that belong to different categories.
Among these common names there are concepts we group in the gastronomy category: “café”, “sidra”, “patxaran”, “sandwich”, “plato combinado”. In addition, “paisaje”, “vistas” are part of the natural environment and natural resources.
In our current experiment the text mining tool discovered a large number of interesting and important non-taxonomic conceptual relations. The atmosphere concept appears in many different and rich ways, qualified as pleasant, charming, good, great, cozy, intimate, caring, lively, relaxed, disco, comfortable, enjoy the atmosphere, lively atmosphere, friendly surfer, footballing environment, unique atmosphere, cosmopolitan, sailor, summery and festive. In the same way, the term magic appears in magical day, magical forest, magical climate, magical labyrinth, magical atmosphere, magical moment, magical sunset and magical mountains. All these user-generated terms match a dimension of the model.
In the same way, the “people” term appears qualified as great, good, friendly, simple, nice, unique, endearing, enjoyable.
Referring to the previous example from minube, we can see that the experience has to do with more than one dimension of the model. The location names are concepts aligned with the first dimension of the model, natural and cultural resources. The concepts referring to mobility, duration and price bear on the second dimension. Finally, concepts like “buen lugar para reposar” are matched with the atmosphere dimension.
Entities: common and proper nouns
Adjectives and adverbs for polarity
The resulting model allows us to approximate tourists' cognitive destination image of the Basque Country, based on user-generated content about experiences related to the Basque territory. The model implements the four dimensions that capture the conceptual structures of the cognitive image of the destination from the perspective of the DMO and the visitor.
As about 70% of the visitors in Euskadi are Spanish speakers, the sample of the elaborated discourse taken into account is almost completely in this language.
In addition, this source may be taken into account by the DMO as promotional channel.
In order to validate the Gomez, M. et al (2013) model it is important to underline that the perceived cognitive image based on user-generated content has a great impact on the first dimension
The presented conclusions are partial observations and interpretations that can be stated from such a valuable source and results.