http://arxiv.org/abs/1511.00722
Text actionability detection is the problem of classifying user-authored natural language text according to whether it can be acted upon by a responding agent. In this paper, we propose a supervised learning framework for domain-aware, large-scale actionability classification of social media messages. We derive lexicons, perform an in-depth analysis of over 25 text-based features, and explore strategies to handle domains that have limited training data. We apply these methods to over 46 million messages spanning 75 companies and 35 languages, from both Facebook and Twitter. The models achieve an aggregate population-weighted F measure of 0.78 and accuracy of 0.74, with values of over 0.9 in some cases.
Identifying actionable messages on social media
1. Identifying Actionable Messages on Social Media
Nemanja Spasojevic, Adithya Rao
October 29, 2015
@ 2015 IEEE International Big Data Conference
Workshop on Mining Big Data in Social Networks
4. Problem Setting
Identify actionable social media messages in the context of a company's agent.
Actionable message - a message with a clear call to action.
E.g. raising an issue to which an agent may provide a helpful response.
5. Problem Setting
● 75 companies / brands
● 35 languages
● Facebook & Twitter
7. Challenges - Company Behaviour
Depending on the company:
1. Categories
a. Media Distribution
b. Telecommunications
c. Retail (Apparel, Beauty, Electronics...)
d. Airlines
e. Manufacturers
2. Different Objectives
3. Various Data Coverage Across Companies
8. Challenges - Social Network
Spoken language depends on platform.
Twitter
Facebook
*public data
9. Challenges - Language
Sample messages, labelled with language codes (translated to English; most are truncated on the slide):
ar: Offer (October 2015), for new customers and for a limited time: subscribe and get a 50% discount for 10 months plus a free wireless router. Get LINKDSL packages now ... your full name and address, landline number, area and governorate, detailed address ...
bg: Hi everyone! Please subscribe to my channel on twitch. I stream games: forza horizon 2, forza motorsport 5, GTA 5 online, Battlefield Hardline, watch dogs, world of tanks. Here are the links: http://www.twitch.tv/lexus9949 ... (original text is Russian)
cs: Hello, after some time I am writing to you again. I filed a warranty claim for the device and it turned out that its cover was faulty. It was at least somewhat reassuring that the described behaviour was a defect and not intended. From your replies it seemed the fault would be anywhere but in the HW. Unfortunately, after the ...
da: Hi again. I have now been to your Nokia locations here in Paris and they refuse to have anything to do with it. And I do not live anywhere near those places! Sorry, but it is absurd to give out addresses that refuse to do it! So what kind of service is that, or really a lack of ...
de: Win Santa Claus's home on Airbnb now and celebrate New Year's Eve in the far north! 1. Register with Airbnb 2. Click "Get in touch" below the description 3. Select 27.12. at check-in 4. Explain in the contact field why you would make a good house sitter 5. ...
es: Good morning. I received an email informing me that from September my plan will cost 43.95€, although I signed up at a price of 41.95€. The same email says that, if I wish, I can cancel my line without CP. I have called the service ...
et: This digi likes to cheat people. Everyone please be careful with this digi. Last night he blocked my Facebook. I still have many Facebook accounts, I am not afraid of being blocked ... My issue has been open for a year and he still has not dealt with it. Every day going online and making calls has problems, and he keeps saying my phone is the problem. I also bought a phone from his company to try, and going online has the same problem ... digi is a big fraud. I asked to leave digi and get my money back to switch provider, but he will not let me switch ... (original text is Malay and Chinese)
fa: "Fraudulent company", repeated dozens of times.
fi: So how can I add to the calendar the birthday of a person who has no phone number and is not on Facebook or the like? If I save the birthday in the contact details, it does not show in the calendar either. Only those of people on Facebook do.
gu: Verse about friendship (text garbled in extraction).
he: Scam!!! I received this email with the PayPal logo. It looks to me like a scam attempt to obtain bank account details and the like. What is this? Quoted email: "Hello, we need your help resolving a problem with your account. Your account has been limited until we receive your response ..."
hi: Beedi has now become CIGARETTE, the mat has become CARPET, fist-fighting has become BOXING, our kushti has become WRESTLING, gilli-danda has become CRICKET, our India has become GREAT. Our cow has become COW, modesty has now become WOW, our kadha has become CHAI ...
id: Microsoft ... If any step has left sorrow, any word has woven a lie, any act has carved a wound, I ask forgiveness in body and soul. Happy Eid al-Fitr 1436 H.
it: Dear Mr. Boemio, thank you for contacting Microsoft mobile devices support. We are writing regarding your phone call and your request for a warranty repair, to confirm that on 06.08.2015 the Laboratorio Nazionale Prima Comunicazione S.r.l. carried out ...
ja: Decorative symbol art with no natural-language text.
kn: Religious message mentioning Jesus (text garbled in extraction).
10. Methodology
Domain attributes: Company (c), Language (l), Source (s)
E.g. fully specified domain: D = {c, l, s}
Consider all sub-domains, i.e. the power set of D: P(D) = {{}, {c}, {l}, {s}, {c, l}, {l, s}, {c, s}, {c, l, s}}
Example: P({yahoo, en, tw}) = {{}, {yahoo}, {en}, {tw}, {yahoo, en}, {en, tw}, {yahoo, tw}, {yahoo, en, tw}}
Model Building: Build a classifier C_D* for every D* ∈ P(D), and choose the classifier with the best F measure on the validation set.
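The domain enumeration and model selection described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `power_set` and `choose_best`, and the validation scores, are hypothetical, and each trained classifier is represented only by its validation F measure.

```python
from itertools import combinations


def power_set(domain):
    """Return every subset of the domain attributes, smallest first."""
    items = sorted(domain)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


def choose_best(domain, validation_f):
    """Pick the sub-domain whose classifier scored best on validation data.

    validation_f maps a sub-domain (frozenset) to an F measure. Sub-domains
    without a trained model (e.g. too little data) are simply skipped.
    """
    candidates = [d for d in power_set(domain) if d in validation_f]
    return max(candidates, key=lambda d: validation_f[d])


domain = {"yahoo", "en", "tw"}
subsets = power_set(domain)

# Hypothetical validation scores, for illustration only.
scores = {
    frozenset(): 0.70,
    frozenset({"en"}): 0.74,
    frozenset({"yahoo", "en", "tw"}): 0.72,
}
best = choose_best(domain, scores)
print(len(subsets), sorted(best))
```

With three attributes there are eight candidate sub-domains, so falling back from {c, l, s} to a broader sub-domain such as {l} or {} stays cheap even across many companies.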
17. Feature Generation Example
Thanks @<COMPANY> for crashing and keeping us waiting all day for a response! Really appreciate it!! #waste
"WORD-SENTI_WORD_NET_POSITIVE-SCALED" : 0.088
"WORD-SENTI_WORD_NET_NEGATIVE-SCALED" : 0.025
"KEYWORD_ACTIONABLE-GENERAL-SCALED" : 0.305
"KEYWORD_NON_ACTIONABLE-GENERAL-SCALED" : 0.2
"CHAR_EXCLAMATION_MARK" : 1.0
"CHAR_EXCLAMATION_MARK_0" : 0.642
"CHAR_EXCLAMATION_MARK_1" : 0.821
"CHAR_AT" : 1.0
"CHAR_AT_0" : 0.934
"CHAR_HASH" : 1.0
"CHAR_HASH_0" : 0.902
"EMOTICON-GENERAL-PRESENT" : 1.0
"EMOTICON-POSITIVE-PRESENT" : 1.0
"EMOTICON-POSITIVE-SCALED" : 0.25
"EMOTICON-NEGATIVE-PRESENT" : 1.0
"EMOTICON-NEGATIVE-SCALED" : 0.75
"WORD-GENERAL-GREATER_THAN-10" : 0.17
"CHARACTER-GENERAL-GREATER_THAN-100" : 0.121
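A few of the simpler features in this example can be approximated with the sketch below. It assumes that keys such as `CHAR_AT` are plain presence indicators, which matches their value of 1.0 here. The helper names `char_presence_features` and `length_counts` are hypothetical, and the scaled values on the slide (e.g. 0.642) come from a scaling the slides do not describe, so they are not reproduced.

```python
def char_presence_features(text):
    """Binary indicators for special characters (assumed interpretation)."""
    return {
        "CHAR_EXCLAMATION_MARK": 1.0 if "!" in text else 0.0,
        "CHAR_AT": 1.0 if "@" in text else 0.0,
        "CHAR_HASH": 1.0 if "#" in text else 0.0,
    }


def length_counts(text):
    """Raw word and character counts; the slide's scaling is not applied."""
    return {
        "word_count": len(text.split()),
        "char_count": len(text),
    }


message = (
    "Thanks @<COMPANY> for crashing and keeping us waiting all day "
    "for a response! Really appreciate it!! #waste"
)
features = char_presence_features(message)
counts = length_counts(message)
print(features)
print(counts)
```

The example message has more than 10 words and more than 100 characters, which is consistent with the `GREATER_THAN-10` and `GREATER_THAN-100` features being active for it.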
19. Results
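Strategy performance, by model:
A - Logistic model for the fully specified domain {c, l, s}
B - Logistic model for the empty domain {} (all data)
C - Logistic model for the best domain
D - Best model-domain combination
[Figure: performance plots, panels (A), (C), (D)]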
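Comparing these strategies relies on the F measure, and the abstract reports an aggregate population-weighted value. The sketch below shows one way such an aggregate could be computed. It assumes "population-weighted" means weighting each domain by its message count, which the slides do not spell out. The domain names and counts are hypothetical.

```python
def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def population_weighted(scores, sizes):
    """Average per-domain scores, weighted by domain size (assumed scheme)."""
    total = sum(sizes[d] for d in scores)
    return sum(scores[d] * sizes[d] for d in scores) / total


# Hypothetical confusion counts and message volumes for two domains.
per_domain_f = {
    "domain_a": f_measure(tp=80, fp=20, fn=20),
    "domain_b": f_measure(tp=30, fp=30, fn=10),
}
message_counts = {"domain_a": 300, "domain_b": 100}

aggregate_f = population_weighted(per_domain_f, message_counts)
print(round(aggregate_f, 3))
```

Weighting by volume means large domains dominate the aggregate, so a strong score on a small domain moves the headline number only slightly.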
23. Conclusion
● Scalable framework for actionability detection on social media
● Framework for model building that works across
○ 75 companies
○ 35 languages
○ Twitter and Facebook
● User-perceived performance
○ F measure - 0.78
○ Accuracy - 0.74
● Future work - deep learning, better sentiment analysis, ...