This is my article on culture-oriented user modeling, presented at the ASE International Conference on Social Informatics in Washington, D.C., USA, on 14 December 2012. This work received the Best Paper Award.
4. The Lewis Model of Cultures
Richard Lewis (2000) “When cultures collide: Managing successfully across cultures”
[Figure: triangle with the apexes MULTI-ACTIVE, LINEAR-ACTIVE and REACTIVE. Countries and regions shown: Hispanic America, Italy, Portugal, Spain, Argentina, Brazil, Chile, Mexico, Sub-Saharan Africa, USA, China, UK, Germany, Switzerland, Japan, Vietnam.]
5. Personality Traits
Linear-active
• Talks half the time
• Does one thing at a time
• Plans ahead step by step
• Polite but direct
• Partly conceals feelings
Multi-active
• Talks most of the time
• Does several things at once
• Plans grand outline only
• Emotional
• Displays feelings
Reactive
• Listens most of the time
• Reacts to partner’s action
• Looks at general principles
• Polite, indirect
• Conceals feelings
6. Personalizing E-commerce
• Customized product descriptions
• User preferences and previous purchase history
- may not be directly available or not up to date
• Targeted advertisements
Web Site | Advertisement Platform
http://google.com | Search Results
http://amazon.com | Web shop
http://groupon.com | Web site and e-mail
http://triggit.com | Facebook
7. Culture-oriented User Modeling
• Adapting Applications to Cultural Origins
• Using Social Web Data
• Finding Microblogging Patterns
• Creating Culture-oriented User Profiles describing specific user preferences
When cultural background is not known, can we find cultural cues from microblogs?
8. Inferring User Cultural Traits
[Diagram with the elements: Culture-specific User Traits, Microblogging Patterns, Culture-oriented User Modeling, Adaptation. Labels in the diagram: Differences in Behaviour, Differences in Microblogging, Creating User Profiles, Employing User Profiles.]
9. Twitter-specific Features
Content
• URLs and Hashtags
• Automatically-detected Languages
Activity
• Tweeting Mobility (geo-locations)
• Posting on Weekends
Social
• Friends and Followers
Conversation
• User Mentions
• Retweets and Replies
10. Example: German User
• User A from Berlin, German language specified in Twitter Profile
• URLs and Hashtags: 49 and 4
• Automatically-detected Languages: 2
• Tweeting Mobility (geo-locations): 7
• Posting on Weekends: 23 out of 100
• User Mentions: 75
• Friends and Followers: 50 and 96
• Retweets and Replies: 1 and 28
11. Example: German User
Twitter User Profile Information
Location: Germany (Berlin)
Language: German
Detected Languages: English: 43, German: 23, Other: 34
• Weekend’s Tweet: Gute Nacht! #TWoff
Language: de, tweet place: Berlin, URLs: 0, Tags: 1, Mentions: 0
• Workday’s Tweet: I'm at Laroy (Stockholm) w/ @username http://t.co/Ct0ObmPz
Language: de, tweet place: Sweden, URLs: 1, Tags: 0, Mentions: 1
• Workday’s Tweet: Tschüss Madrid :) (@ Terminal 1 w/ @username) http://t.co/8c
Language: es, tweet place: Spain, URLs: 1, Tags: 0, Mentions: 1
14. Crawling & Data Processing
[Diagram: Twitter API → Retrieve Streams (CURL) → Store JSON (Java) → MySQL (Tweets) → Select and Store Features (Java) → MySQL (User Profiles) → Tests (Matlab) → Performance Report]
15. Country | Total Number of Users | Users Posted 100 or More Tweets
Japan | 4885 | 2984
Spain | 4906 | 3119
Brazil | 4910 | 2935
USA | 1714 | 1316
Germany | 2823 | 1644
1 199 800 tweets
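The tweet total appears consistent with taking 100 tweets from every user who posted 100 or more tweets. The short check below recomputes it from the table; the per-user sample size of 100 is an assumption drawn from the example user in the speaker notes, not something the table itself states.

```python
# Users with 100 or more tweets per country, copied from the table above.
users_with_100_tweets = {
    "Japan": 2984,
    "Spain": 3119,
    "Brazil": 2935,
    "USA": 1316,
    "Germany": 1644,
}

# Assumption: exactly 100 tweets were analyzed for each of these users.
TWEETS_PER_USER = 100

total_users = sum(users_with_100_tweets.values())
total_tweets = total_users * TWEETS_PER_USER

print(total_users, total_tweets)
```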
19. Classification Models
1. LANG: Language Codes
2. DEF: Number of
• URLs
• Hashtags
• Automatically-detected Languages
• Geo-locations Detected
• Posts on Weekends
• Friends
• Followers
• User Mentions
• Retweets
• Replies
3. DEF+LANG
20. Decision Tree (LANG Feature)
[Figure: decision tree that splits on Language Code at the thresholds 4.5, 3.5, 2.5, 1.5 and 0.5. Leaf labels shown: JP, BR, DE, DE, BR, ES.]
Language | Code
Japanese | 5
Spanish | 4
Portuguese | 3
German | 2
English | 1
Other | 0
25. Key Findings (Cultural Groups)
• Linear-active users prefer sharing URLs and hashtags, and have larger social networks.
• Reactive users do not share as many hashtags; however, they tend to reply more than multi-active users. They employ the fewest foreign languages, have the lowest tweeting mobility and tweet mostly on weekends.
• Multi-active users generally employ more foreign languages in their content.
26. Key Findings (Country Groups)
• German users share the most hashtags and tend to reply;
• Users from the USA share the most URLs, and have larger social networks and greater tweeting mobility than others;
• Spanish users tend to retweet and mention other users;
• Brazilian users reply the least;
• Users from Japan tweet the most on weekends, share the fewest hashtags and user mentions, employ the fewest foreign languages and have the lowest tweeting mobility.
27. Adaptation Options
When appropriate, creating adaptive apps such as e-commerce or social network web sites to fit user preferences for:
• sharing content;
• employing foreign languages;
• changing locality;
• communicating with other users.
28. Further Work
• Employ a larger data set;
• Include more countries and add features;
• Extend our platform to other social networking web sites;
• Recommend products/content in accordance with user cultural origins.
29. Conclusions
Culture-oriented User Modeling
• Found microblogging patterns for cultural groups
• Employed them for identifying cultural origins
• Got insights on culture-oriented user modeling and
adaptation
@article{ilina2012user,
  title={A User Modeling Oriented Analysis of Cultural Backgrounds in Microblogging},
  author={Ilina, Elena},
  journal={HUMAN JOURNAL},
  volume={1},
  number={4},
  pages={166--181},
  year={2012}
}
Full-text is at: http://ojs.scienceengineering.org/index.php/human/article/view/43/18
On this slide you see the outline of my presentation. The main question I was concerned with was how different cultural groups could be identified on microblogs like Twitter. In this context, we can ask ourselves: why is it important to understand user cultural backgrounds? How could such knowledge be exploited for adaptive applications? To address these questions, I used a sociological study called the Lewis Model of Cultures. This model describes cultural communication differences, which I also tried to find in Twitter microblogs.
The Lewis model is represented as a triangle whose apexes show extreme cultural dimensions. The cultural dimensions are linked with countries of origin and explain people's communication attitudes. Multi-active people from Hispanic America and Brazil focus on dialogue with other people and generally display their feelings. In contrast, reactive people from Vietnam tend to conceal their feelings; they are generally very polite and good listeners. Linear-active people from Germany and Switzerland are generally great organizers and focus on planning activities. All countries in between these extremes have a mixture of cultural dimensions. Note that this is a general model: some users might be “outliers” and will not fit their country stereotypes.
http://www.google.nl/imgres?imgurl=http://www.crossculture.com/UserFiles/Image/LMR-table-new.gif&imgrefurl=http://senseinzanzibar.wordpress.com/page/3/&usg=__DT4TjH1NW71F5RrJycaRiY1260I=&h=379&w=500&sz=25&hl=en&start=0&sig2=pYvsvRCCvLYx4bv5iFL6Pw&zoom=1&tbnid=TMMQEAt9p-zxQM:&tbnh=96&tbnw=126&ei=WEI-UIXMD-bP0QWi-IHICQ&itbs=1&iact=hc&vpx=74&vpy=106&dur=44&hovh=196&hovw=258&tx=119&ty=150&sig=115419424352996972569&page=1&ndsp=2&ved=1t:429,r:0,s:0,i:52
In the Lewis model, cultural dimensions are associated with personality traits. I found this idea appealing. I have met a number of talkative multi-active persons and am fascinated by their ability to talk eloquently while using many gestures. However, for linear-active people, encounters with such multi-active persons can sometimes be overwhelming. Also, the impatience and concealed feelings of linear-active persons can be perceived as offensive and cold by multi-active persons. Lewis explains that most of us are multi-active, but businesses mostly address linear-active customers. This can lead to lost sales. We could employ different strategies for targeting persons from different cultures. When a customer is multi-active, we could focus on the more emotional side of the product rather than just listing factual characteristics, which appeal more to linear-active persons.
E-commerce web sites like Amazon can benefit from customised product descriptions and behaviour-targeted advertisements. User location, languages used and previous purchases can be collected from history logs. When a user is new to the system, such information is not available. As a solution, user preferences could also be collected from the social web and microblogs. Targeted ads are already created using data on user preferences collected from social networking sites. For instance, one company (http://triggit.com/) uses Facebook for placing their advertisements and collecting user data from a web statistics service.
Information on the cultural origin of a user is important for improving the user experience in adaptive applications that require, for instance, the location of a user, a functionality, or particular design preferences. Often, such information is not available, for example because a user is new to the system or has not given details. In this case, user characteristics and preferences can be mined from the social web. From the Twitter microblogs employed in my experiments, I could find out, among others, user locations, languages used, and social connections. Further, as we will see on the next slides, our experiments have shown that users from different cultural groups blog differently. Cultural microblogging patterns could then be employed to build culture-oriented user profiles.
Now the question arises: how could we learn about user traits based on the cultural background of a user? For this, I propose to derive behavioral patterns by mining the microblogging activities of users. With known microblogging patterns, I then identify a user as belonging to a particular cultural group. This allowed me to create user profiles with preference information on culture-specific user traits, to be used in the adaptation process.
I assumed that Twitter data can be used for finding cultural patterns in microblogging behavior. Indeed, with millions of active users and easy-to-access open profiles, Twitter is an excellent platform to study microblogging behavior. The features I analyzed include, among others, the usage of URLs, hashtags, user mentions, and languages employed. They were grouped into the following feature groups: content-based features, activity-based features, social network-based features, and conversation-based features. These features were defined as indicators of cultural patterns. It is important to mention that only the usage of Twitter features for the selected user groups was analyzed; no private information was stored or shared as a result of the experiments.
select userid where languages>2 and countries>2 and origin like "germany" limit 5;
select test,urls,tags,languages,countries,weekend,users,friends,followers,userRetweet,userReply from features where userid="452248921" and test=5;
select id,username,origin,language,location,timezone from User_culture where id="452248921";
Consider the following example for feature selection. Take a user from Berlin, Germany, who has German as the preferred language in his profile. In 100 randomly selected tweets of his, we found 49 URLs and 4 hashtags. In his content, we automatically detected two languages, English and German. Each tweet’s meta-data can also contain geo-coordinates or the place of tweeting. We identified 7 different countries in the meta-data of the 100 tweets. The user also tweeted most of the time on workdays. He had 50 friends and 96 followers. In his 100 tweets we found 75 mentions of other users. Out of 100 posts, he had 1 retweet and 28 replies to other users.
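The per-user counts described above can be sketched as a small aggregation over a user's tweets. This is a minimal illustration only: the `Tweet` class, the regular expressions and the field names are hypothetical and are not the Java code used in the experiments. The sample texts are abridged from the example slide.

```python
import re
from dataclasses import dataclass


@dataclass
class Tweet:
    """Hypothetical, simplified tweet record used only for this sketch."""
    text: str
    is_weekend: bool = False
    is_retweet: bool = False
    is_reply: bool = False


URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def extract_features(tweets):
    """Count URLs, hashtags, mentions, weekend posts, retweets and replies."""
    features = {"urls": 0, "tags": 0, "mentions": 0,
                "weekend": 0, "retweets": 0, "replies": 0}
    for tweet in tweets:
        features["urls"] += len(URL_RE.findall(tweet.text))
        features["tags"] += len(TAG_RE.findall(tweet.text))
        features["mentions"] += len(MENTION_RE.findall(tweet.text))
        features["weekend"] += int(tweet.is_weekend)
        features["retweets"] += int(tweet.is_retweet)
        features["replies"] += int(tweet.is_reply)
    return features


sample = [
    Tweet("Gute Nacht! #TWoff", is_weekend=True),
    Tweet("I'm at Laroy w/ @username http://t.co/Ct0ObmPz"),
    Tweet("Tschüss Madrid :) (@ Terminal 1 w/ @username) http://t.co/8c"),
]
profile = extract_features(sample)
print(profile)
```

In a real crawl, URLs, hashtags and mentions would more likely be read from the tweet's entity meta-data than from regular expressions; the regex approach is used here only to keep the sketch self-contained.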
select count(language),language from languages where userid="452248921" group by language;
select tweets_culture.id,tweets_culture.content,languages.language from tweets_culture,languages where tweets_culture.id=languages.id order by rand() limit 5;
Here we see three tweets posted by our user, in English and German. Since tweets are short, informal and include hashtags or URLs, automatic language detection is challenging. For detecting languages, a threshold was used: a language was disregarded when it was detected fewer than 5 times. Similarly, we deal with locations identified with the help of the geonames web service and Google Maps. A location is disregarded when it is detected fewer than 5 times out of the 100 tweets for a user.
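The threshold rule can be sketched as a simple frequency filter. The function name and the sample labels are hypothetical; the counts for English and German follow the example slide, while the rare labels are made up for illustration.

```python
from collections import Counter

# Threshold from the notes: labels seen fewer than 5 times are disregarded.
MIN_DETECTIONS = 5


def frequent_labels(labels, min_count=MIN_DETECTIONS):
    """Keep only labels (languages or locations) detected at least min_count times."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n >= min_count}


# Hypothetical detection results for the 100 tweets of one user.
detected = ["en"] * 43 + ["de"] * 23 + ["es"] * 4 + ["sv"] * 3
kept = frequent_labels(detected)
print(kept)
```

The same function applies to detected locations, since both thresholds work on counts per user.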
I will now come to the experimental setup and the data collection process, and present the results. In the experiments I try to find differences in microblogging behavior for people from different cultural origins. Next, the microblogging patterns found are used to classify users into their respective user groups. The experimental setup consists of five main steps: (1) users whose tweets originate from the respective geographic locations were selected; (2) their tweets were collected; (3) user profiles based on the meta-data and content of the tweets were created; (4) decision trees were used to classify users into their respective cultural user groups; (5) as a final step, I assessed the classification performance.
This slide shows the initially defined geo-coordinates for the five selected countries: Japan, United States, Brazil, Spain and Germany. We used these coordinates to identify users tweeting around the selected places, and also users tweeting from intermediate locations within the bounding box defined by the geo-coordinates. We employed the Twitter Streaming API to retrieve users for the 5 countries analyzed. After running CURL for several days, we had a list of users and their preferred languages as defined in the Twitter profile. The information was then stored in the MySQL database.
https://maps.google.com/maps/ms?ie=UTF8&hl=en&t=h&oe=UTF8&msa=0&msid=103387750622659154041.0004537732c8262c4ae94crawling.kml
select location, count(userid) as n from features_final3 where origin='usa' group by location;
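Selecting users by a bounding box of geo-coordinates can be sketched as follows. The `BoundingBox` type and the coordinates are illustrative assumptions (a rough box around Germany), not the boxes used in the experiments, and the check ignores boxes that cross the 180° meridian.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Hypothetical box given by its south-west and north-east corners (degrees)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        """Return True when the point lies inside the box (edges included)."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Illustrative values only; the actual crawling coordinates are not given in the notes.
germany_box = BoundingBox(south=47.0, west=5.5, north=55.5, east=15.5)

print(germany_box.contains(52.52, 13.40))   # a point in Berlin
print(germany_box.contains(40.42, -3.70))   # a point in Madrid
```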
On this slide we see the simplified Lewis Model of Cultures. Lewis explains that:
• linear-active persons are cool and factual planners;
• multi-active persons are generally warm, emotional and loquacious;
• reactive persons have different time perceptions and are generally dialogue- and people-centered.
Based on this, I established the following hypotheses:
• linear-active persons prefer using hashtags and URLs to organize their tweets and share factual information with other users (SUPPORTED);
• multi-active persons might have larger social networks (NOT SUPPORTED), retweet and reply the most (SUPPORTED), and might employ more foreign languages in their posts compared to other users (SUPPORTED);
• reactive persons might tweet mostly on weekends and reply a lot (SUPPORTED).
Further results showed that linear-active persons have larger contact networks on Twitter and greater tweeting mobility: they tweet from different locations the most. This finding could be explained by the use of Twitter for business purposes (advertisement) by Americans and Germans, which could be further investigated. Overall, different behavior between the analyzed user groups was found.
The next two slides present clusters of user groups by countries and cultural groups. They show the differences between user groups with respect to the feature set, which includes hashtags, user mentions, URLs and the other aforementioned elements. Based on multivariate analysis of variance, two variables are used to distinguish between user groups. These variables are calculated from the means of the features analyzed. Here we can see, for instance, that the German user group considerably overlaps with other user groups. This appears to reflect the behavioral similarities between the German user group and others. In contrast, Japan is depicted in the lower portion, showing that users from Japan appear to behave quite differently from other user groups.
This slide shows the clusters for the user groups according to the Lewis model. On the culture level, variable c1 helps to separate the reactive user group, depicted in the red cluster, from the other two clusters, multi-active users and linear-active users. Considering the analyzed feature set, this indicates that reactive users from Japan do indeed behave differently on Twitter. The distance between the group means for the reactive group and the two other user groups is greater than that between the linear-active and multi-active user groups. Having determined the distance between the means of the user groups, I was intrigued whether this feature set could be used to predict which user group a user belongs to.
For this, I created three main classification models. The first model used language codes to predict users' groups. The second one (Default) classified users according to their usage of the features listed on the right. The third model is a combination of the previous two.
The first decision tree is based on the languages defined in the Twitter user profile. A language code is assigned to each language (as shown on the right). If a language does not match one of these five, the code value is zero. Interestingly, we see that the lower two branches define the Brazilian and German user groups instead of users from the US. Why does this happen? And what does this imply?
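The language-code feature can be sketched in a few lines of Python. The numeric codes and the `LANGUAGE_CODES` table are hypothetical: the actual codes appear on the slide, not in these notes, and only four languages can be inferred from the countries mentioned in the talk, so the fifth is left out rather than guessed.

```python
# Hypothetical codes; the real values are on the slide, not in these notes.
# Only four profile languages are listed; the fifth is not named in the notes.
LANGUAGE_CODES = {"en": 1, "de": 2, "pt": 3, "ja": 4}

def language_code(profile_language):
    """Return the numeric code for a profile language, or 0 if it is unlisted."""
    if not profile_language:
        return 0
    return LANGUAGE_CODES.get(profile_language.strip().lower(), 0)

print(language_code("EN"))  # listed language
print(language_code("fr"))  # unlisted language falls back to 0
```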
As it turns out, many users from Brazil and Germany defined English as their preferred language in the Twitter user profile. The decision tree trained on the languages defined in the user profile does not give good prediction results. Not surprisingly, about a fifth of the users were not assigned to their correct country group. If we assumed that users who defined English in their user profiles were from the US, we would misclassify large fractions of Brazilian and German users.
I tried to solve this issue by introducing Twitter-specific features. As we know, users from a given country and cultural group employ Twitter features differently. For instance, Japanese users refrain from tagging and mentioning other users, while users from the US use URLs the most. The experiments showed that a decision tree for predicting user groups can be built from the ten features mentioned before.
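To make the idea of Twitter-specific features concrete, here is a minimal Python sketch that counts a few of the features named in these notes. The regular expressions, the `tweet_features` function, and the rule that a reply is a tweet starting with "@" are all simplifying assumptions for illustration, not the extraction method used in the study.

```python
import re
from datetime import datetime

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")

def tweet_features(tweets):
    """Count simple per-user features from (text, created_at) pairs."""
    counts = {"hashtags": 0, "mentions": 0, "urls": 0,
              "replies": 0, "weekend_posts": 0}
    for text, created_at in tweets:
        counts["urls"] += len(URL.findall(text))
        # Remove URLs first so "#fragment" parts are not counted as hashtags.
        stripped = URL.sub("", text)
        counts["hashtags"] += len(HASHTAG.findall(stripped))
        counts["mentions"] += len(MENTION.findall(stripped))
        # Assumption: a tweet that starts with "@" is treated as a reply.
        counts["replies"] += int(text.startswith("@"))
        # weekday() returns 5 for Saturday and 6 for Sunday.
        counts["weekend_posts"] += int(created_at.weekday() >= 5)
    return counts

# Hypothetical example tweets.
example = [
    ("@anna check this out http://example.com #news", datetime(2012, 12, 15, 10, 0)),
    ("Working on slides #talk #culture", datetime(2012, 12, 12, 9, 0)),
]
print(tweet_features(example))
```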
We see that the testing error improved slightly compared to the language-code feature of the previous test. However, the cross-validation error increased two-fold. The decision tree based on the Default features overfits the training set; the tree structure seems to be sensitive to the training set. Since I would like better performance on new data, I combined the aforementioned feature sets to get a new decision tree based on the Default (DEF) features and language codes. This helped to further decrease the testing and cross-validation errors.
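The gap between a low error on the training data and a higher cross-validation error is what signals overfitting. The following Python sketch shows k-fold cross-validation with a deliberately overfitting learner. Everything in it (the `k_fold_error` helper, the memorizing learner, and the toy data) is hypothetical and stands in for the decision trees used in the study.

```python
import random
from collections import Counter

def k_fold_error(rows, labels, train, k=5, seed=0):
    """Average misclassification rate over k held-out folds.

    `train(rows, labels)` must return a function mapping a row to a label.
    """
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    errors = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in order if i not in held]
        predict = train([rows[i] for i in train_idx],
                        [labels[i] for i in train_idx])
        wrong = sum(predict(rows[i]) != labels[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k

def train_memorizer(rows, labels):
    """Overfitting learner: memorizes rows, else guesses the majority label."""
    seen = dict(zip(rows, labels))
    fallback = Counter(labels).most_common(1)[0][0]
    return lambda row: seen.get(row, fallback)

# Hypothetical toy data: 20 unique rows with alternating labels.
rows = list(range(20))
labels = ["a" if i % 2 == 0 else "b" for i in rows]

predict = train_memorizer(rows, labels)
training_error = sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)
cv_error = k_fold_error(rows, labels, train_memorizer)
print(training_error, cv_error)
```

The memorizer is perfect on the rows it has seen but has nothing useful to say about held-out rows, so its cross-validation error is far above its training error.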
To achieve the smallest cross-validation error, the decision tree can be cut back (pruned) to the number of nodes that minimizes it. With 14 terminal nodes, it is possible to achieve more than 90% accuracy (96%). As can be seen from the decision tree shown on this slide, the number of hashtags, replies, user mentions, posts published on weekends, and languages identified in user content are useful features alongside the language code defined in the user profile. With these features I can assign users to their respective cultural group of the Lewis model.
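Choosing the pruning level can be sketched as picking the smallest tree that reaches the lowest cross-validation error. The error values below are invented for illustration and are not results from the study; `best_tree_size` is a hypothetical helper.

```python
# Hypothetical (terminal nodes -> cross-validation error); NOT study results.
cv_error_by_size = {2: 0.41, 6: 0.18, 10: 0.09, 14: 0.04, 20: 0.04, 30: 0.06}

def best_tree_size(cv_errors):
    """Smallest number of terminal nodes that reaches the minimum CV error."""
    lowest = min(cv_errors.values())
    return min(size for size, err in cv_errors.items() if err == lowest)

print(best_tree_size(cv_error_by_size))
```

Preferring the smaller tree when two sizes tie is a design choice: the simpler tree is easier to read and less likely to be tuned to the training set.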
This slide shows the key findings on the Lewis cultural dimensions for the analyzed users. Linear-active users from the US and Germany tend to share URLs and hashtags and have larger social networks on average. This finding is important to keep in mind when building social networking applications that enable sharing content and creating social networks. For instance, German users could be provided with easy access to hashtag-sharing functionality. Since Japanese users employ the fewest foreign languages and have the lowest tweeting mobility, locality features could be less prominent in a user interface. In contrast, multi-active users could be provided with more flexible locality functionality to assist in language selection.
Similarly, microblogging behavior patterns could also be exploited on a country-group level. Since US and German users tend to share URLs and hashtags, it could be further investigated whether sharing hashtags and URLs serves to organize content or to promote it. The finding that linear-active people also have the largest social networks already appears to support this assumption. It is well known that Twitter and other social networks are widely used for marketing and other business purposes. This is why the implications of different cultural behaviors are paramount for globally oriented businesses.
Social networking patterns could be employed to find out user preferences towards sharing content such as hashtags and URLs, or localization and conversation needs. This could allow suitable adaptation strategies for applications, such as e-commerce, that deal with customers from different cultural backgrounds.
In a nutshell, I described an approach to mine cultural patterns from microblogs. Ten features on Twitter were analyzed to understand user attitudes in sharing content, social networking preferences and activities. With this, interesting microblogging patterns for more than 10,000 users from five selected countries were found. This enabled me to predict user origins on the country level and the user's cultural dimension in the Lewis model. In future work, I would like to employ a larger user dataset and include more countries in the analysis. Adding more features and more social networks to the analysis might be very useful for gaining a better understanding of cultural differences in user behavior online. The next target would be to recommend products or content in accordance with users' cultural origins.
The main aim of the study was to employ user microblogging activities for creating culture-oriented user profiles. I found microblogging patterns for the selected user groups. I then employed these cultural microblogging patterns to identify the cultural origins of users. These cultural differences can be taken into account when implementing adaptive applications or social networking web sites. Finally, I provided insights on culture-oriented modeling and further adaptation.
Thank you very much. I will be happy to answer your questions.