A formal model to the routing questions problem

Presentation at ICWI 2011

  1. A formal model to the routing questions problem in the context of Twitter
     Cleyton Caetano de Souza
  2. Outline
     1. Introduction
        1. Problem
     2. Related Works
     3. The model
        1. The problem
        2. Details
     4. A solution to the model
     5. Conclusion
     6. Future Works
     Cleyton-UFCG
  3. Introduction
     • The Web has become essential
       – The Web is a repository of information
     • Search engines
       – Looking for answers
     • Social networks
       – Waiting for answers
  4. Problem
     • Problems can occur when you publish your question
       – No one answers
       – No one sees it
       – Many answers
     • Direct the question to someone
       – You ensure an answer, but will it be a good one?
  5. Problem
     • Informally, the problem we propose to solve is: given a question posted by a user (the asker) on Twitter, find among the asker's followers the user who
       – (1) knows the answer
       – (2) has the trust of the asker
       – (3) provides the answer quickly
  6. Related Works
     • (Morris, Teevan and Panovich 2010a)
       – 93.5% of users received answers to their questions after posting them
       – in 90.1% of cases, these responses were provided within one day
     • Applications
       – Aardvark (Horowitz and Kamvar 2010)
       – Q-Sabe (Andrade et al. 2003)
     • What distinguishes our research
  7. The Model
     • Twitter is defined by the tuple $T = \{U, R\}$
     • where $U = \{u_1, \dots, u_{|U|}\}$ is the set of users
     • and $R$ is the set of all relationships $r_{i,j}$ between two users $i$ and $j$
       – The existence of $r_{i,j}$ means that $i$ follows $j$; the relationship is directed, so $r_{i,j} \neq r_{j,i}$
  8. The Model
     • Each user $u$ has the attributes
       – $\mathit{Followers}_u$, which contains all users who follow $u$
       – $\mathit{Following}_u$, which contains all users followed by $u$
       – $M_u = (m_1, \dots, m_{|M_u|})$, an ordered list of all messages posted by $u$
     • Each message $m$ has the attributes
       – $d_m$: the post date
       – $s_m$: the string posted
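As an illustration of the structures defined on the two model slides, here is a minimal Python sketch. The class names, field names and the `follow` helper are hypothetical and chosen only for this example; the slides define the sets and attributes, not an implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Message:
    posted_at: datetime  # d_m, the post date
    text: str            # s_m, the string posted


@dataclass
class User:
    name: str
    followers: set[str] = field(default_factory=set)       # Followers_u
    following: set[str] = field(default_factory=set)       # Following_u
    messages: list[Message] = field(default_factory=list)  # M_u, ordered by post date


def follow(users: dict[str, User], i: str, j: str) -> None:
    """Record the directed relationship r_{i,j}: user i follows user j."""
    users[i].following.add(j)
    users[j].followers.add(i)


# Hypothetical two-user network: ana follows bob, but not the reverse.
users = {name: User(name) for name in ("ana", "bob")}
follow(users, "ana", "bob")
```

Storing both `followers` and `following` on each user mirrors the slide's attributes and makes the direction of $r_{i,j}$ explicit: following someone does not make that person a follower in return.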
  9. The Problem
     Given a query $q$ posted by $u$, a follower $f \in \mathit{Followers}_u$, and a function $p_{f,q}$ that gives the chance that $f$ provides a good answer:
       – Find: $f$
       – To: $\max\, p_{f,q}$
       – Over: $\mathit{Followers}_u$
  10. The problem
     • We believe that $p_{f,q}$ correlates with three things
       – $k_{f,q}$: the knowledge that $f$ has in relation to $q$
       – $t_{u,f}$: the trust that $u$ has in $f$
       – $a_f$: the activity level of $f$
     • So what we actually want is to find the best combination of $k_{f,q}$, $t_{u,f}$ and $a_f$
  11. Knowledge
     • Each message $m_u$ corresponds to a fraction of the total expertise of $u$
       $k_u = \sum_{m_u \in M_u} k_{m_u}$
     • In IR we represent this fraction as a vector of the words/tokens contained in $m_u$
     • So $k_u$ is a vector where each coordinate represents a token and its value is the frequency of that token across all messages $m_u$
  12. Knowledge
     • If $t_q$ is the frequency of the token $t$ in $q$, the knowledge needed to answer the question satisfactorily is calculated as the inner product between the vector that represents the follower and the vector that represents the question
       $k_{f,q} = \sum_{t \in q} t_q \cdot t_{k_u}$
       where $t_{k_u}$ is the frequency of $t$ in the knowledge vector of the follower
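The two Knowledge slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the whitespace tokenizer and the function names are hypothetical, and the slides do not say how tokens are extracted or whether stop words are removed.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def knowledge_vector(messages: list[str]) -> Counter:
    """k_u: token frequencies summed over all messages of one user."""
    k_u = Counter()
    for text in messages:
        k_u.update(tokenize(text))
    return k_u


def knowledge_score(k_f: Counter, question: str) -> int:
    """k_{f,q}: inner product of the question vector and the follower vector."""
    q = Counter(tokenize(question))
    # A Counter returns 0 for tokens the follower never used.
    return sum(t_q * k_f[t] for t, t_q in q.items())


k_f = knowledge_vector(["python lists are mutable", "python tuples are not"])
score = knowledge_score(k_f, "are python tuples mutable")
```

Here the follower used "python" and "are" twice and "tuples" and "mutable" once, so the score is 2 + 2 + 1 + 1 = 6; a question sharing no token with the follower's messages scores 0.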
  13. Trust
     • Trust is related to
       – Friendship [Schenkel et al. 2008]
       – Similarity [Kuter and Golbeck 2010]
     • So we believe (and simplify)
       $t_{u,v} = f_{u,v} \cdot \mathit{sim}(u, v)$
  14. Friendship
     • Friendship measures the importance of one user to another
     • On Twitter, a good estimate of friendship should consider the mentions (connections) between $u$ and $v$, so
       $f_{u,v} = \frac{|\mathit{mentions}_{u}^{v}|}{|\mathit{mentions}_{u}|}$
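A minimal Python sketch of the friendship ratio follows. It assumes that the numerator counts the @-mentions of $v$ in the messages of $u$ and that the denominator counts all @-mentions made by $u$; the slide does not spell out the direction of the mentions, so this reading, the function name and the punctuation handling are all assumptions.

```python
def friendship(messages_of_u: list[str], v: str) -> float:
    """f_{u,v}: share of u's @-mentions that point at v (assumed reading)."""
    mentions = []
    for text in messages_of_u:
        for token in text.split():
            if token.startswith("@"):
                # Drop trailing punctuation such as "@bob," or "@bob?".
                mentions.append(token.rstrip(".,!?:;"))
    if not mentions:
        return 0.0  # assumption: no mentions at all means no measurable friendship
    return mentions.count("@" + v) / len(mentions)


tweets = ["@bob any idea?", "thanks @bob and @carol", "@dave hi"]
f_ana_bob = friendship(tweets, "bob")
```

In this example two of the four mentions point at `bob`, so the ratio is 0.5.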
  15. Similarity
     • Similarity measures how alike two users are under some criterion
     • Intuitively, similarity is related to equality among the attributes
       $\mathit{sim}_1(u, v) \propto \frac{|\mathit{Followers}_u \cap \mathit{Followers}_v|}{|\mathit{Followers}_u \cup \mathit{Followers}_v|}$
       $\mathit{sim}_2(u, v) \propto \frac{|\mathit{Following}_u \cap \mathit{Following}_v|}{|\mathit{Following}_u \cup \mathit{Following}_v|}$
       $\mathit{sim}_3(u, v) \propto \mathit{sim}(k_u, k_v)$
  16. Similarity
     • Any combination of these equations could be used
     • We chose to use
       $\mathit{sim}(u, v) = \frac{\mathit{sim}_1(u, v)}{1 - \mathit{sim}_1(u, v)} \cdot \frac{\mathit{sim}_2(u, v)}{1 - \mathit{sim}_2(u, v)} \cdot \frac{\mathit{sim}_3(u, v)}{1 - \mathit{sim}_3(u, v)}$
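The similarity measures and the trust product from the Trust slide can be sketched as follows. Several choices here are assumptions rather than part of the slides: the set ratios are read as Jaccard coefficients over set sizes, cosine similarity stands in for the unspecified $\mathit{sim}(k_u, k_v)$, and a small `eps` guards the division when a component equals 1.

```python
import math
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """|a ∩ b| / |a ∪ b|, used for both sim_1 and sim_2."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(k_u: Counter, k_v: Counter) -> float:
    """Assumed choice for sim_3: cosine of the two knowledge vectors."""
    dot = sum(k_u[t] * k_v[t] for t in k_u)
    norm_u = math.sqrt(sum(c * c for c in k_u.values()))
    norm_v = math.sqrt(sum(c * c for c in k_v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def odds(s: float, eps: float = 1e-9) -> float:
    # s / (1 - s); eps avoids dividing by zero when s == 1 (not in the slides).
    return s / max(1.0 - s, eps)


def similarity(sim1: float, sim2: float, sim3: float) -> float:
    """sim(u, v): product of the three odds-style terms."""
    return odds(sim1) * odds(sim2) * odds(sim3)


def trust(f_uv: float, sim_uv: float) -> float:
    """t_{u,v} = f_{u,v} * sim(u, v)."""
    return f_uv * sim_uv


sim1 = jaccard({"a", "b", "c"}, {"b", "c", "d"})  # 2 shared out of 4 distinct
```

Note that each factor $s / (1 - s)$ grows without bound as $s$ approaches 1 and is 0 when $s = 0$, so a single component with no overlap drives the whole product, and therefore the trust value, to 0.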
  17. Activity
     • Users do not interact with the same intensity
     • It seems intuitive that a user's activity level depends on how frequently he/she posts new tweets
  18. Activity
     • Activity is the mean time between the messages posted by $u$
       $a_u = \frac{(\mathit{today} - d_{m,|M_u|}) + \sum_{i=1}^{|M|} (d_{m,i+1} - d_{m,i})}{|M_u| + 1}$
     • The lower this value, the more active the user and the greater the chance that he/she answers quickly
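A small Python sketch of the activity measure is given below. It reads the sum as running over consecutive pairs of messages and keeps the denominator $|M_u| + 1$ as it appears on the slide; the choice of hours as the unit, the function name, and the treatment of a user with no messages are assumptions made for this example.

```python
from datetime import datetime


def activity(post_dates: list[datetime], today: datetime) -> float:
    """a_u in hours: lower values mean a more active user."""
    dates = sorted(post_dates)
    if not dates:
        return float("inf")  # assumption: a user with no messages counts as inactive
    # Gaps between consecutive messages, in seconds.
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(dates, dates[1:])]
    # Time since the last message plus the gaps between messages.
    total = (today - dates[-1]).total_seconds() + sum(gaps)
    return total / (len(dates) + 1) / 3600


posts = [datetime(2011, 11, 1), datetime(2011, 11, 2), datetime(2011, 11, 3)]
a_u = activity(posts, today=datetime(2011, 11, 4))
```

With three daily posts and one further day since the last one, the numerator is 72 hours and the denominator is 4, giving 18 hours.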
  19. Solving the Model
     • Calculating the tuple $(k_{f,q}, t_{u,f}, a_f)$ for each user is a simple task
     • But how do we decide who is the best?
  20. Solving the Model
     • We treat this as a multi-criteria decision-making problem
     • We decided to solve it with the Weighted Product Model, based on [Triantaphyllou and Mann 1989]
  21. Solving the Model-Step 1
     • Solving the model starts by calculating the tuple $(k_{f,q}, t_{u,f}, a_f)$ for each user $f_u \in \mathit{Followers}_u$
  22. Solving the Model-Step 2
     • Then we arrange these users in a matrix of size $|\mathit{Followers}_u| \times |\mathit{Followers}_u|$
  23. Solving the Model-Step 3
     • We create a function $\mathit{map}(x)$ that maps the values of $(k_{f,q}, t_{u,f}, a_f)$ onto a common scale
  24. Solving the Model-Step 4
     • For each pair $f_1, f_2$ with $f_1 \neq f_2$ we calculate
       $p_{f_1,f_2} = \left(\frac{k_{f_1,q}}{k_{f_2,q}}\right)^{x} \cdot \left(\frac{t_{u,f_1}}{t_{u,f_2}}\right)^{y} \cdot \left(\frac{a_{f_1}}{a_{f_2}}\right)^{z}$
     • The values $x$, $y$ and $z$ are importance factors and must be between 0 and 1; in addition, $x + y + z = 1$
  25. Solving the Model-Step 5
     • If $p_{f_1,f_2} > 0$ we put 1 in position $(f_1, f_2)$ and 0 in position $(f_2, f_1)$
     • If $p_{f_1,f_2} < 0$ we put 0 in position $(f_1, f_2)$ and 1 in position $(f_2, f_1)$
     • If $p_{f_1,f_2} = 0$ we put 1 in position $(f_1, f_2)$ and 1 in position $(f_2, f_1)$
  26. Solving the Model-Step 5
  27. Solving the Model-Step 6 (End)
     • We calculate the sum of each row of the matrix; this number represents the number of victories of each user
     • In the end, the question is routed to the user with the most victories
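Steps 1 to 6 can be sketched as a pairwise tournament in Python. This is an illustration under stated assumptions, not the authors' implementation: each candidate tuple is taken to be already mapped by Step 3 onto a strictly positive common scale where higher is better (so activity has been inverted), the weights are hypothetical, and the ratio product is compared with 1, as in the standard Weighted Product Model, whereas the slides state the comparison against 0.

```python
Scores = tuple[float, float, float]  # mapped (k_{f,q}, t_{u,f}, a_f), all > 0


def wpm_ratio(a: Scores, b: Scores, weights: Scores) -> float:
    """p_{f1,f2}: weighted product of the criterion ratios of two candidates."""
    p = 1.0
    for value_a, value_b, w in zip(a, b, weights):
        p *= (value_a / value_b) ** w
    return p


def victories(candidates: dict[str, Scores], weights: Scores) -> dict[str, int]:
    """Row sums of the 0/1 comparison matrix (ties give both users a win)."""
    names = list(candidates)
    wins = {name: 0 for name in names}
    for i, f1 in enumerate(names):
        for f2 in names[i + 1:]:
            p = wpm_ratio(candidates[f1], candidates[f2], weights)
            if p >= 1:  # assumed threshold: 1, not 0
                wins[f1] += 1
            if p <= 1:
                wins[f2] += 1
    return wins


def route(candidates: dict[str, Scores], weights: Scores) -> str:
    """Return the follower with the most victories."""
    wins = victories(candidates, weights)
    return max(wins, key=wins.get)


# Hypothetical mapped scores and importance factors x, y, z (summing to 1).
followers = {"ana": (0.9, 0.6, 0.6), "bob": (0.3, 0.6, 0.5), "carol": (0.6, 0.6, 0.9)}
weights = (0.5, 0.3, 0.2)
best = route(followers, weights)
```

With these numbers `ana` beats both other candidates, `carol` beats `bob`, and the question is routed to `ana`. Because $p_{f_2,f_1} = 1 / p_{f_1,f_2}$, each pair only needs to be compared once.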
  28. Conclusion
     • What distinguishes our research
       – We focus on a successful network
       – We treat the problem from a new perspective
       – We deal with a recent and interesting problem
  29. Future Works
     • The model has already been implemented
     • We are investigating whether our heuristics are coherent
     • We will investigate
       – whether the model's recommendations are accurate
       – whether directing questions is more effective
       – which importance factor matters most
  30. Thank You
     • Any questions?
