45   © 2011 Alexander Troussov
we are talking about consumability of centrality measurementsproduced by network flow methods like these       (DEMO)46   ...
Key difference between SNA and other approaches to social science     Social sciences usually have focus     on attributes...
Key difference between SNA and other approaches to social science     SNA focus on relationships     between actors     “S...
Prominence     The study of structural properties of networks and their interplay with the processes taking     place on t...
Centrality: Eigenvector Centrality     Eigenvector centrality was introduced by Phillip Bonacich in 1987     “Googles work...
Centrality and the network flow methods     Most of the centrality measurement are based on the network flow process, “tha...
Master Equation                            Numerical Solution     Bonacich Power Centrality, Eigenvector Centrality, Googl...
Master Equation                    Numerical Solution                                                       Computation   ...
It is great to have “the right master equation”!What is the shape of a hanging chain?          – What is the shape of a ha...
It is great to have “the right master equation”!What is the shape of a hanging chain?      What is the shape of a hanging ...
It is great to have “the right master equation”!What is the shape of a hanging chain?             What is the shape of a h...
“Plotting geometric arrangements and forces acting on small segments” evolved into       – Finite difference method       ...
Numerical Solution                                NO Master Equation     “Integrating” evolved into …       – Well, in fin...
Leibniz, Huygens, and Johann Bernoulli knew geometry and mechanics. We dont know     "geometry" and "mechanics” of techno-...
Recommender systems and global/local ranking     Link analysis is frequently employed for ranking and navigation     Graph...
Graphics:   http://strangemaps.wordpress.com/2007/02/07/72-the-world-as-seen-from-new-yorks-9th-avenue/61                 ...
Global Ranking (like Google’s PageRank) –a view on the network from external point - modern, “Copernican” approachSource: ...
Local Ranking – is needed for recommenders – should rely on Ego-centered Ptolemaic view (actually, Poly-Centered, see next...
POLY-CENTRICPoly-Centric   In physical space – navigation               is from one point to another.               In app...
.     Graph-based recommender systems should     recommend             “Important” objects (nodes)     which are also loca...
Web and Communities     Communities in Social Sciences: A tribe learning to survive, a group of engineers working on simil...
Community detection … but What is a Community?     Are you Russian? Yes. Are you Irish? Yes. Are you mathematician? Yes. A...
New methods for community detection are needed  Multiple membership   – Are you Russian? Yes. Are you Irish? Yes. Are you ...
An example of clustering around a node using propagation69                                                         © 2011 ...
70   © 2011 Alexander Troussov
Future work in local dynamic clustering     Troussov et al “Vectorised Spreading Activation” 2010 theorize that the future...
Conways Game of Life72                      © 2011 Alexander Troussov
Conways Game of Life73                      © 2011 Alexander Troussov
Conways Game of Life74                      © 2011 Alexander Troussov
Logic-inspired VSA     Finite difference approximations to differential equations were one of precursors of cellular     a...
VSA & Marker propagation – combining ranking with clustering                                                       My Univ...
VSA & Clustering (Cont.)77                         © 2011 Alexander Troussov
VSA & Clustering (Cont.)78                         © 2011 Alexander Troussov
VSA & Clustering (Cont.)79                         © 2011 Alexander Troussov
My University     An Expert                 A topic                 I’m interested in80                 © 2011 Alexander T...
Tasks / Methods Various terminology in various domains (for instance, from the point of view of IM many tasks falls into t...
Tasks     Avenues to deep socio-semantic analytics and the possibility of high-     quality functionalities for techno-soc...
“Three steps away”   ?      John B.            Axel P.      Dan B.          Tim B.                         Why recommender...
John and Tim – Recommender computes that this is a strong connection because of multiple ways of connections Shortest Path...
John and DerekRecommender computesthat such type ofconnectivity is a weakconnection85 85                      © 2011 Alexa...
Tasks: Generalisation Across Domains - Whom is Claudia connected with?       All of these people                          ...
Ranking              2          1       387                    © 2011 Alexander Troussov
Ranking              3          1       288                    © 2011 Alexander Troussov
Ranking          1       2              …              …89                    © 2011 Alexander Troussov
Nepomuk Recommender     NEPOMUK (Networked Environment for Personalized, Ontology-based Management of Unified     Knowledg...
Nepomuk Recommender (Cont.)     Troussov et al “Social Context as Machine Processable Knowledge” presented the     archite...
Nepomuk     Representing and modeling this ontology as a multidimensional network allows to augment     the ontology on th...
Pile UI93        © 2011 Alexander Troussov
Nepomuk use case: activity management                           A user started to work on a new project CID.              ...
Nepomuk use case: activity management using IBM recommendercodenamed “Galaxy”              Galaxy (IBM hybrid recommender)...
Nepomuk use case: activity management     Galaxy can spot what the user might miss:     “This web page might be relevant t...
Thank you !              © 2011 Alexander Troussov
  1. 1. Alexander Troussov, Ph.D., IBM Dublin Software Lab. 16th of April 2011, Mathlingvo Seminar, St. Petersburg State University, Russia. Graph-based methods to exploit “weak” knowledge. © 2011 Alexander Troussov
  2. 2. About AT
     – IBM Ireland Center for Advanced Studies – Chief Scientist
     – IBM LanguageWare group – the Architect
     – National Geophysical Data Center, Boulder, CO, USA – Visiting scientist – fuzzy-logic-based search engine for searching large databases when the exact search parameters are hard to define
     – Observatoire de la Côte d’Azur, Nice, France – Visiting scientist – numerical simulation in stochastic physics
     – Institute of Physics of the Earth (Russian Academy of Sciences) and the International Institute for Earthquake Prediction Theory and Mathematical Geophysics, Moscow, Russia – Lead Researcher – R&D in geophysics and geoinformatics
     – System programming at the Institute of Precise Mechanics, Moscow
     – PhD in Mathematics from Lomonosov Moscow State University
  3. 3. Natural Language Understanding is Inferencing (?)
     From a computational point of view, natural language understanding is inferencing.
     – A text which mentions Malahide is probably about Canada (??)
     Malahide (Canada 2006 Census population 8,828) is a township in Elgin County, Ontario, Canada.
     Source: Troussov et al., MITACS, Canada, 2010
  4. 4. Inferencing
     Terms are ambiguous, and our knowledge is never “the truth, the whole truth, and nothing but the truth”:
     – Malahide, Co. Dublin
     – Malahide is a township in Elgin County, Ontario, Canada
     – Paradis Gisenyi Malahide is a hotel in Rwanda
     Solution (Troussov et al., MITACS, Canada, 2010): propagation from multiple concepts. For instance, the initial seed for the activation propagation consists of two nodes in a geographical taxonomy, Malahide (Ontario) and Malahide (Co. Dublin), together with the other concepts mentioned in the text.
     • A text which mentions Malahide and Europe is a little more likely to be about Ireland than about Canada
     • A text which mentions Malahide and Clontarf is more likely to be about Ireland than about Canada
     • …
     • A cohesive, coherent text which mentions Malahide, Mulhuddart, Lansdowne, Clontarf and Donabate is almost certainly about Dublin
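The multi-seed idea on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny taxonomy, the `SENSES` table, the equal split of activation between the senses of an ambiguous term, and the upward-only propagation with a fixed `decay`. It is not the method of the cited MITACS paper, only a toy showing how extra mentions tip the balance between two readings of "Malahide".

```python
# Hypothetical toy taxonomy: child -> parent. Both branches have the same depth
# so that neither country is favoured by the shape of the tree.
PARENT = {
    "Malahide (Co. Dublin)": "Co. Dublin",
    "Clontarf": "Co. Dublin",
    "Donabate": "Co. Dublin",
    "Co. Dublin": "Leinster",
    "Leinster": "Ireland",
    "Malahide (Ontario)": "Elgin County",
    "Elgin County": "Ontario",
    "Ontario": "Canada",
}

# Hypothetical mapping from a term mention to its candidate concept nodes.
SENSES = {
    "Malahide": ["Malahide (Co. Dublin)", "Malahide (Ontario)"],
    "Clontarf": ["Clontarf"],
    "Donabate": ["Donabate"],
}


def activate(mentions, decay=0.5):
    """Seed every sense of every mention and push activation up the taxonomy."""
    activation = {}
    for mention in mentions:
        senses = SENSES[mention]
        for node in senses:
            # An ambiguous term shares one unit of activation among its senses.
            signal = 1.0 / len(senses)
            while node is not None:
                activation[node] = activation.get(node, 0.0) + signal
                node = PARENT.get(node)  # None once the root is reached
                signal *= decay          # the signal decays on every link
    return activation


alone = activate(["Malahide"])
with_context = activate(["Malahide", "Clontarf", "Donabate"])
print(alone["Ireland"], alone["Canada"])
print(with_context["Ireland"], with_context["Canada"])
```

With "Malahide" alone the two countries receive the same activation; adding unambiguous Dublin place names raises only the Irish branch.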
  5. 5. [Diagram labels: Knowledge (Lexico-Semantic Resource); Text; Relevancy]
  6. 6. Text – Semantic Network
     [Diagram: term mentions in the TEXT are mapped to concepts in the NETWORK OF CONCEPTS, where the “focus” concept is then found.]
  7. 7. NLU as inferencing
     The concept of a car is relevant to a text. Car IS-A “on-land travel” (?). Therefore “on-land travel” is somewhat relevant to the text, …
  8. 8. Text – Semantic Network
     [Diagram: term mentions in the TEXT are mapped to concepts in the NETWORK OF CONCEPTS, where the “focus” concept is then found.]
  9. 9. Demo – 2 1 Spreading Activation.pdf
  10. 10. Agenda
     – Introduction
     – Building Semantic Model
     – SA Research Challenges
        – Why SA
        – Reliability of inferencing
        – What is the purpose of graph operations
     – Centrality, network flow methods
     – Zoo of algorithms
     – Nepomuk Recommender
  11. 11. Text – Semantic Network
     [Diagram: term mentions in the TEXT are mapped to concepts in the NETWORK OF CONCEPTS, where the “focus” concept is then found.]
  12. 12. Spreading Activation Methods
  13. 13. There is an increased need for a new generic and formal understanding of spreading activation as a class of algorithms rather than a particular algorithm with many parameters
     Spreading activation (also known as spread of activation) is a method for searching associative networks, neural networks or semantic networks. The method is based on the idea of quickly spreading an associative relevancy measure over the network.
     Our goal is to give an expanded introduction to the method. We will demonstrate, and describe in sufficient detail, that this method can be applied to very diverse problems and applications.
     We present the method as a general framework. First we present it as a very general class of algorithms on large (or very large) so-called multidimensional networks, which serve as the mathematical model.
     Source: Troussov, Levner, Bogdan, Judge, Botvich, “Spreading activation methods”
  14. 14. We present spreading activation in a generic form, as a set of methods suitable for mining multidimensional networks with oriented weighted links. These graph-mining methods might produce results similar to those achieved by soft clustering and fuzzy inferencing.
     The input object is a function on the nodes of the network, and the spread of activation is a technique which “spreads” this function through the network links. The result of the spreading activation is a new function on the nodes. The properties of that function strongly depend on the original function and on the parameters of the spreading activation.
     For instance, when the underlying network is a network of ontological concepts, the parameters governing the spread might be chosen so that the original function is “smoothed” and the resulting function can be interpreted as a “conceptual” summary of the initial non-zero valued nodes.
  15. 15. Origin of Spreading Activation Methods
     In neurophysiology, interactions between neurons are modeled by way of activation which propagates from one neuron to another via connections called synapses, which transmit information using chemical signals.
     The first spreading activation models were used in cognitive psychology to model the processes of memory retrieval (Collins, A.M. & Loftus, E.F., 1975; Anderson, J., 1983).
     This framework was later exploited in Artificial Intelligence (AI) as a processing framework for semantic networks and ontologies, and applied to Information Retrieval (Crestani, F., 1997; Aleman-Meza, Halaschek, Arpinar, & Sheth, 2003; Rocha, C., Schwabe, D. & Poggi de Aragao, M., 2004; …) as the result of a direct transfer of information retrieval ideas from cognitive sciences to AI.
  16. 16. Notation
     A multidimensional network can be modeled as a directed graph, which is a pair G = (V, E), where
     – V is the set of vertices v_i
     – E is the set of edges e_j (although in oriented graphs edges are referred to as arcs)
     – init: E → V is the mapping which provides the initial nodes of arcs
     – term: E → V is the mapping which provides the terminal nodes of arcs
     – imp is the importance value of arcs and nodes. For instance, imp(v), where the node v is a geographical location, might be the population; imp(e) might be the number of phone calls from person init(e) to person term(e).
     – w are the “weights”, for instance a sigmoidal function of imp. w(e_j) = 0 means that the arc e_j is effectively ignored; w(e_j) = 1 means that the activation of init(e_j) strongly affects the activation of term(e_j). For instance, when the nodes represent “words”, synonym links might be assigned the value 1.
     – F(E) is the “activation” function, usually a real-valued function on the nodes of the network.
  17. 17. Generic description of the spreading activation methods (SAM) framework
     1. Initialisation. Sets the parameters of the algorithm, the network, and the initial F(E) as a list of non-zero valued nodes V_n.
     2. Iterations (each iteration is one pulse of SAM)
        – a. List Expansion. The list is expanded to include neighbors (both the neighbors reached by following outgoing links and the neighbors which have links to the nodes in the list). Newly added nodes receive a zero level of activation.
        – b. Recomputation. The value at each node in the list is recomputed based on the values of the function on the nodes which have links to the given node, and on the types of the connections.
        – c. List Purging. The list is purged: nodes with values below a threshold are excluded.
        – d. Conditions Check. Checks the conditions for breaking the iterations, such as the maximum number of iterations to be performed.
     3. Output. The list of nodes (the value of the function after the spread of activation), ranked according to their F values.
  18. 18. Generic description of recomputation phase
     We have the list of nodes V_n.
     1. Input/Output Through Links Computation
        – For each node v we compute the input signal to each arc e such that init(e) = v. When the signal (“activation”) passes through a link e, the activation usually experiences decay by a factor w(e).
     2. Input/Output of Node Activation
        – Before the pulse, the node v has the activation level F(v).
        • Through incoming links v gets more activation. By dissipating the activation through outgoing links, the node v might lose activation.
     3. Computation of the New Level of Activation
        – A new value F(v) is computed based on F(v), Input(v), and Output(v).
  19. 19. Generic description of recomputation phase
     1. Input/Output Through Links Computation.
     For each node v we compute the input signal to each arc e such that init(e) = v. This computation can be based on the value F(v), the outdegree of the node, etc. For instance, if the node v has n outgoing arcs of the same type, each arc e might get the input signal:
        I(e) = F(init(e)) · (1 / outdegree(v)**beta)
     where beta might be equal to 1. It could also be less than 1, in which case the node v will propagate more activation to its neighbors than it has.
     When the signal (“activation”) passes through a link e, the activation usually experiences decay by a factor w(e):
        O(e) = I(e) · w(e)
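A small worked example may help with the role of beta. The sketch below evaluates the slide's two formulas for a single node; the function names `link_signals` and `total_sent` are hypothetical helpers introduced only for this illustration.

```python
def link_signals(activation, outdegree, beta=1.0, weight=1.0):
    """Return (I(e), O(e)) for one outgoing arc of a node with the given F(v)."""
    i_e = activation / outdegree ** beta  # I(e) = F(init(e)) / outdegree(v)**beta
    o_e = i_e * weight                    # O(e) = I(e) * w(e)
    return i_e, o_e


def total_sent(activation, outdegree, beta=1.0):
    """Sum of I(e) over all outgoing arcs of the node."""
    return outdegree * link_signals(activation, outdegree, beta)[0]


# A node with F(v) = 1 and four outgoing arcs:
print(total_sent(1.0, 4, beta=1.0))  # 1.0: the node hands out exactly what it has
print(total_sent(1.0, 4, beta=0.5))  # 2.0: with beta < 1 it hands out more
```

In general the total sent is F(v) · n**(1 − beta) for n outgoing arcs, which exceeds F(v) whenever beta < 1 and n > 1.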
  20. 20. Generic description of input/output phase
     2. Input/Output of Node Activation
     Before the pulse, the node v has the activation level F(v).
     Through incoming links v gets more activation:
        Input(v) = Σ O(e) over all links e such that init(e) ∈ V_n, term(e) = v.
     By dissipating the activation through outgoing links, the node v might lose activation:
        Output(v) = Σ I(e) over all links e such that init(e) = v, term(e) ∈ V_n.
  21. 21. Generic description of recomputation phase
     3. Computation of the New Level of Activation
     A new value F(v) is computed based on F(v), Input(v), and Output(v), for example:
        Fnew(v) = F(v) + Input(v)
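The framework of slides 17 to 21 can be put together in a short Python sketch. It is one possible reading under stated assumptions, not a reference implementation: the network is a plain dictionary of weighted arcs, all arcs are of one type, the stopping condition is a fixed number of pulses, and the update is the example rule Fnew(v) = F(v) + Input(v), so Output(v) is not subtracted. The function name and parameters are hypothetical.

```python
def spread_activation(arcs, seed, pulses=3, threshold=0.01, beta=1.0):
    """arcs: {(init, term): w}; seed: {node: initial activation F}."""
    out_arcs, in_arcs = {}, {}
    for (src, dst), w in arcs.items():
        out_arcs.setdefault(src, []).append((dst, w))
        in_arcs.setdefault(dst, []).append((src, w))

    F = dict(seed)                        # 1. initialisation
    for _ in range(pulses):               # 2d. stop after a fixed number of pulses
        # 2a. list expansion: add neighbours in both directions with F = 0
        for v in list(F):
            for neighbour, _w in out_arcs.get(v, []) + in_arcs.get(v, []):
                F.setdefault(neighbour, 0.0)
        # 2b. recomputation: Fnew(v) = F(v) + Input(v)
        new_F = {}
        for v in F:
            incoming = 0.0
            for src, w in in_arcs.get(v, []):
                if src in F:
                    i_e = F[src] / len(out_arcs[src]) ** beta  # I(e)
                    incoming += i_e * w                        # O(e) = I(e) * w(e)
            new_F[v] = F[v] + incoming
        # 2c. list purging: drop weakly activated nodes
        F = {v: a for v, a in new_F.items() if a >= threshold}
    # 3. output: nodes ranked by activation
    return sorted(F.items(), key=lambda item: item[1], reverse=True)


arcs = {("a", "c"): 1.0, ("b", "c"): 1.0, ("a", "d"): 1.0}
print(spread_activation(arcs, {"a": 1.0, "b": 1.0}, pulses=1))
```

After one pulse the node "c", which is reached from both seeds, is ranked first, which matches the intuition that the most activated nodes are those receiving activation from multiple sources.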
  22. 22. SAM and Methods of Numerical Simulation in Physics
     Spreading activation algorithms were introduced in the 1990s; however, the same iterative methods were used long before in numerical simulation in physics, mechanics, chemistry and the engineering sciences. The major distinctions between these algorithms and what is now called spreading activation are:
     – a) in physics, such algorithms usually work on a regular mesh (so that the local topology of the graph is encoded into the formulas of the recomputation stage)
     – b) in physics, initial conditions (the initial activation) are usually assigned to all nodes of the mesh, and algorithms for efficient graph traversal are not needed. For instance, steps 2a (List Expansion) and 2c (List Purging) in the generic description of the SAM framework might be skipped.
     For instance, one-dimensional heat transfer equations might be numerically simulated on a one-dimensional mesh by iterative methods. On each iteration the recomputation stage is based on the formula below:
        Fnew(v) = ( F(RightNeighbor(v)) + F(LeftNeighbor(v)) ) / 2
     Using a different formula, one can simulate the behavior of an oscillating string (although this will require storing three values at each node: the position, mass and velocity of the material point corresponding to the node).
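The averaging formula above is easy to try on a small mesh. The following Python sketch applies it to the interior nodes of a one-dimensional list; holding the two end nodes fixed is an assumption added here (the slide does not say how boundaries are treated), and the function names are hypothetical.

```python
def heat_pulse(values):
    """One pulse: each interior node becomes the mean of its two neighbours."""
    new = list(values)  # the end nodes keep their values (assumed fixed boundaries)
    for i in range(1, len(values) - 1):
        new[i] = (values[i + 1] + values[i - 1]) / 2
    return new


def simulate(values, pulses):
    """Apply the recomputation formula for the given number of pulses."""
    for _ in range(pulses):
        values = heat_pulse(values)
    return values


print(heat_pulse([0.0, 0.0, 4.0, 0.0, 0.0]))     # heat moves to the neighbours
print(simulate([1.0, 0.0, 0.0, 0.0, 1.0], 200))  # settles at the boundary level
```

No list expansion or purging is needed because every node of the mesh is part of the computation from the start, as the slide notes.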
  23. 23. SAM and Methods of Numerical Simulation in Physics
     Using the same iterative algorithm, with one set of parameters one can emulate heat transfer; with another set of parameters the same algorithm will show the behavior of oscillating strings. But the phenomena of heat propagation and string oscillation are quite different (for instance, heat propagation might lead to “thermal death”, the state of equilibrium where the level of activation is the same for all nodes, while oscillation might continue forever).
     Our illustration concerns only the basics, while real modeling might be much more complicated. For instance, heat transfer might lead to combustion, where after reaching some level of activation a node generates more “heat” than it gets from neighboring nodes.
  25. 25. Spreading Activation as a Graph-Mining Technique
     The technique of SAM is quite polymorphic. On this slide we interpret the results of spreading activation in terms of graph mining.
     – First of all, after running SAM the most activated nodes will be those which get activation from multiple sources or, in other words, those which minimize the “distance” to the initially activated nodes. These nodes might therefore be considered potential centroids of strong clusters induced by the initial activation. Since a partitioning of the nodes according to these clusters is not immediately available (and is not needed in many applications), SAM algorithms might be considered methods of soft clustering.
     – On the other hand, the most activated nodes are those which are connected to the initial conditions by particular types of directed links (arcs with large weights). Therefore we might consider SAM an efficient scheme for computing fuzzy inferencing. For such applications, replacing the single-valued function F by a vector function might be useful.
     We conclude by noting that SAM algorithms might be used for soft clustering and fuzzy inferencing on networks.
  26. 26. [Diagram with node labels in Greek: Γαλλία (France), Παρίσι (Paris), Ναπολέων (Napoleon), Αλέξανδρος (Alexander). Legend: People; Geographical artifacts. Relations: • Friends • Part of, Instance of, Subclass • Created]
  27. 27. [The same diagram with node labels in English: France, Russia, Paris, Moscow, Napoleon, Alexander, Borodino, Kutuzov, Meeting: Battle of Austerlitz, Meeting: Battle of Borodino, Project: Invasion of Russia]
  28. 28. Diagram on the previous slide … What does it represent? How can it be used?
  29. 29. [Diagram node labels: France, Russia, Paris, Moscow, Napoleon, Alexander, Borodino, Kutuzov, Meeting: Battle of Austerlitz, Meeting: Battle of Borodino, Project: Invasion of Russia]
     How could this diagram be used?
     1. A network flow process could show the nodes most relevant to the pair “Napoleon” & “Meeting”
        – Selection WHO: whom to invite
        – Other nodes explain the recommendations
     2. When Napoleon opens an email or a web page containing W&P, he will be advised that the content of this resource is relevant to his project “Invasion of Russia”
  30. 30. Diagram on the previous slide … What does it represent?
     Data from Facebook, data from Napoleon’s Lotus Notes calendar, the structure of a Wiki, a network of collocations or relations between the entities in W&P, …
     – The proliferation of Web 2.0 and Enterprise 2.0 technologies has led to the emergence of massive networks connecting people and various digital artifacts. These networks can be treated as “weak” knowledge, which nevertheless might be used for recommendations and even for such traditional applications as knowledge-based text processing.
     Or an instantiation of an ontology related to W&P (War and Peace) by Leo Tolstoy
     – In which case we would probably know that Napoleon is the emperor of France, that Paris is the capital (not an instantiation of a subclass) of France, etc. An ontology provides conceptualization and allows inferencing, but these advantages per se are useless without tedious manual work to encode the rules for using this additional knowledge. The knowledge encoded in the topology of the multidimensional network, in contrast, is ready to use, provided that the methods are tolerant to errors and inconsistencies in the data, i.e. that they are methods of “soft mathematics”: fuzzy inferencing, soft clustering, …
  31. 31. Social Context = Knowledge ?
     A New Mathematical Model of Horse Racing: assume, without loss of generality, that each horse in the race is modelled by a wooden ball of radius Ri.
     A horse = a ball? ☺
  32. 32. Representing social context as knowledge allows us to benefit from the experience of knowledge-based applications.
  33. 33. For instance, the social context modeled as a network is not much different from the semantic networks which are formed from concepts represented in ontologies, and it is possible to use such networks for knowledge-based text processing. Representing social context as knowledge allows us to draw on the experience of such a mature R&D area as knowledge-based text processing.
  34. 34. How to model the social context
     As multidimensional networks
     – The primary source: network models of instantiations of techno-social systems
     As “Knowledge”
     – Represented as objects, clauses, XML, graphs, or some combination of these
  35. 35. The primary source – network models of techno-social systems
     [Diagram: arcs labeled Invited, Joined, Created]
     Log files of techno-social systems (like Facebook or IBM’s Lotus Connections) keep track of who did what. Triples could be aggregated into a network.
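The aggregation of log triples into a network can be sketched in a few lines of Python. The log format (actor, action, object) and the names in the sample log are hypothetical; real log files of such systems will differ. The arc counts play the role of the importance value imp(e) from the Notation slide.

```python
from collections import Counter


def aggregate(triples):
    """Turn (actor, action, object) log records into nodes and counted typed arcs."""
    nodes = set()
    arcs = Counter()
    for actor, action, obj in triples:
        nodes.update((actor, obj))
        arcs[(actor, action, obj)] += 1  # repeated events strengthen the arc
    return nodes, arcs


# Hypothetical log records, using the arc types shown on the slide.
log = [
    ("Ann", "Created", "Project wiki"),
    ("Ann", "Invited", "Bob"),
    ("Bob", "Joined", "Project wiki"),
    ("Ann", "Invited", "Bob"),
]
nodes, arcs = aggregate(log)
print(sorted(nodes))
print(arcs[("Ann", "Invited", "Bob")])
```

Keeping the action as part of the arc key preserves the link types, so that people and digital artifacts end up in one multidimensional network.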
  36. 36. Examples of Graph Models: Folksonomies – Tripartite Hypergraph
     Social bookmarking systems (Del.icio.us, …)
     – Where to keep my bookmarks?
     – Users (actors), resources, tags
     In social bookmarking systems users describe bookmarks by keywords called tags. The structure behind these social systems, called a folksonomy, can be viewed as a tripartite hypergraph of actor, tag and resource nodes.
     – Three types of first-class citizens, and hyperplanes
     – If the hyperplanes were made of rubber, they could be shrunk to a node, so the hyperplanes would also be first-class citizens
     Advantages of the network models (see next slide)
     – Extensibility
     – Ease of merging heterogeneous information
     Source (hypergraphs): see Jäschke et al., “Logsonomy – A Search Engine Folksonomy”, Media ICWSM 2008, AAAI Press (2008)
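The "rubber hyperplane" remark can be made concrete with a small sketch: each tag assignment (user, tag, resource), i.e. one hyperedge of the tripartite hypergraph, is shrunk into a node of its own and linked to its three members, which gives an ordinary graph. The function name, the node encoding as (kind, name) pairs, and the sample bookmarks are assumptions for illustration.

```python
def shrink_hyperedges(assignments):
    """Replace each (user, tag, resource) hyperedge by a node linked to its members."""
    edges = []
    for index, (user, tag, resource) in enumerate(assignments):
        hyper_node = ("assignment", index)  # the shrunk hyperedge, now a node
        edges.append((hyper_node, ("user", user)))
        edges.append((hyper_node, ("tag", tag)))
        edges.append((hyper_node, ("resource", resource)))
    return edges


# Hypothetical bookmarks: who tagged which resource with which tag.
bookmarks = [
    ("ann", "graphs", "http://example.org/a"),
    ("bob", "graphs", "http://example.org/b"),
]
edges = shrink_hyperedges(bookmarks)
nodes = {node for edge in edges for node in edge}
print(len(edges), len(nodes))
```

The two assignments share the tag node ("tag", "graphs"), so the resulting graph connects the two users through it, and it can be merged with other networks by simply adding more typed nodes and edges.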
  37. 37. Inferencing – “Soft methods” could provide reliable inferencing
     For instance, the social context modeled as a network is not much different from the semantic networks which are formed from concepts represented in ontologies, and it is possible to use such networks for knowledge-based text processing. Representing social context as knowledge allows us to draw on the experience of such a mature R&D area as knowledge-based text processing.
  38. 38. Natural Language Understanding is Inferencing (?)
     From a computational point of view, natural language understanding is inferencing.
     – A text which mentions Malahide is probably about Canada (??)
     Malahide (Canada 2006 Census population 8,828) is a township in Elgin County, Ontario, Canada.
     Source: Troussov et al., MITACS, Canada, 2010
  39. 39. Inferencing
     Terms are ambiguous, and our knowledge is never “the truth, the whole truth, and nothing but the truth”:
     – Malahide, Co. Dublin
     – Malahide is a township in Elgin County, Ontario, Canada
     – Paradis Gisenyi Malahide is a hotel in Rwanda
     Solution (Troussov et al., MITACS, Canada, 2010): propagation from multiple concepts. For instance, the initial seed for the activation propagation consists of two nodes in a geographical taxonomy, Malahide (Ontario) and Malahide (Co. Dublin), together with the other concepts mentioned in the text.
     • A text which mentions Malahide and Europe is a little more likely to be about Ireland than about Canada
     • A text which mentions Malahide and Clontarf is more likely to be about Ireland than about Canada
     • …
     • A cohesive, coherent text which mentions Malahide, Mulhuddart, Lansdowne, Clontarf and Donabate is almost certainly about Dublin
     Such a rapid “phase transition” from uncertainty to certainty is similar to the transition at the percolation threshold.
  40. 40. From Uncertainty to Certainty in Inferencing: phase transitions as a function of seed size, in analogy to those in percolation
     In (semantic) networks with high local density, the reliability of inferencing from a single concept is almost never sufficient. Reliability can be low when inferencing starts from a small number of seed concepts, but inferencing becomes very reliable once the number of initial seed concepts reaches a certain level (which could be explained by combinatorics).
     [Plot axes: Reliability of inferencing; Number of nodes in the seed]
  41. 41. And could be explained by combinatorics. [Figure: approximate probability of at least two people sharing a birthday, as a function of the number of people.] In probability theory, the birthday problem, or birthday paradox, concerns the probability that, in a set of randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 366 (ignoring February 29 births). But, perhaps counter-intuitively, 99% probability is reached with a mere 57 people, and 50% probability with 23 people.
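The birthday figures quoted on this slide can be checked directly. The sketch below assumes 365 equally likely birthdays (no leap day); the function name `p_shared_birthday` is chosen here for illustration.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday (uniform birthdays)."""
    if n > days:
        return 1.0  # pigeonhole principle
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

for n in (10, 23, 57, 366):
    print(n, round(p_shared_birthday(n), 4))
```

The sharp rise of this curve over a small range of n is the combinatorial effect the slide appeals to.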
  42. 42. Simulation. The network (such as a taxonomy of geographical locations) is a tree of 20,000 nodes. The text is modeled as a list of 100 terms, each of which is ambiguous and could be mapped onto 8 network nodes. When such a mapping happens, we consider that the node (the geographical location it represents) could be relevant to the text. We look for clusters: groups of N nodes, each of which is mentioned in the text, such that the graph distance between each pair of nodes in the cluster is less than three. Such graph structures have a low probability of occurring by chance for small N (N = 1 or 2), and this probability decreases sharply to zero for bigger N; correspondingly, our certainty that the graph structure signifies the topicality of the text increases to 1.0. – A text which mentions Malahide, Mulhuddart, Lansdowne, Clontarf and Donabate is almost certainly about Dublin (Ireland). Source: F. Darena and A. Troussov, 2010
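A simulation of this kind might be set up as follows. This is a rough sketch, not the code of Darena and Troussov: the tree shape (a random recursive tree), the uniform choice of the "mentioned" nodes, and all function names are assumptions. It relies on one fact about trees: a group of nodes with pairwise distance below three always fits inside one node's closed neighbourhood (the node plus its direct neighbours), so scanning neighbourhoods is sufficient.

```python
import random

def random_tree(n, rng):
    """Random recursive tree: node i > 0 attaches to a uniformly chosen earlier node."""
    neighbours = [[] for _ in range(n)]
    for i in range(1, n):
        p = rng.randrange(i)
        neighbours[i].append(p)
        neighbours[p].append(i)
    return neighbours

def largest_chance_cluster(n_nodes=20_000, n_terms=100, ambiguity=8, seed=0):
    """Size of the largest group of randomly 'mentioned' nodes with pairwise distance < 3."""
    rng = random.Random(seed)
    neighbours = random_tree(n_nodes, rng)
    # Simplification: every term maps onto `ambiguity` distinct random nodes.
    mentioned = set(rng.sample(range(n_nodes), n_terms * ambiguity))
    return max(
        (c in mentioned) + sum(nb in mentioned for nb in neighbours[c])
        for c in range(n_nodes)
    )

def chance_of_cluster(size, trials=20, **kwargs):
    """Fraction of random trials in which a cluster of at least `size` nodes appears."""
    hits = sum(largest_chance_cluster(seed=t, **kwargs) >= size for t in range(trials))
    return hits / trials
```

Calling `chance_of_cluster(N)` for increasing N gives an estimate of how quickly accidental clusters become rare; the exact numbers depend on the assumed tree shape.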
  43. 43. Processes in Networks. How do we study the Earth? – By looking at the results of the propagation of waves through the Earth. [Figure: propagation of a seismic wave in the ground and the effect of the presence of a land mine.] Similarly, one can study networks by network flow methods: introducing processes in which something flows from node to node across the edges.
  44. 44. Processes: Used goods – trail. Money – walk. Gossip – replication rather than transference (trails rather than walks). E-mail – diffusion by replication. Attitudes – spread through replication rather than transfer. Infection – spreads like gossip, but does not re-infect. Packages – usually the shortest route possible. Relevancy in semantic networks. Trust – shortest path or volume?
  45. 45.
  46. 46. We are talking about the consumability of centrality measurements produced by network flow methods like these (DEMO)
  47. 47. Key difference between SNA and other approaches to social science: social sciences usually focus on attributes of individual actors.
  48. 48. Key difference between SNA and other approaches to social science: SNA focuses on relationships between actors. “Social network analysis reflects a shift from the individualism common in the social sciences towards a structural analysis” (Garton et al., Studying Online Social Networks). Structuralism is an approach to the human sciences that attempts to analyze a specific field (for instance, mythology) as a complex system of interrelated parts: the linguists Roman Jakobson and Nikolai Trubetzkoy, the anthropologist Lévi-Strauss ~ Complex systems. Sociogram: – Jacob Levy Moreno (1889-1974) was a leading Austrian-American psychiatrist and psychosociologist, thinker and educator, the founder of psychodrama, and the foremost pioneer of group psychotherapy. Among Moreno’s primary contributions to sociometrics was the sociogram: a method of representing individuals as points on a graph and using lines and arcs to represent the relationships between them. Graphics from Prof. Hendrik Speck’s tutorial at the 5th Karlsruhe Symposium for Knowledge Management in Theory and Praxis, 2007
  49. 49. Prominence. The study of the structural properties of networks and their interplay with the processes taking place on the network has been one of the main problems of recent years in the field of complex network analysis. A primary use of graph theory in social network analysis is to identify “important” actors. Centrality and prestige concepts seek to quantify graph-theoretic ideas about an individual actor’s prominence within a network by summarizing structural relations among the graph nodes. An actor’s prominence reflects its greater visibility to the other network actors (an audience). An actor’s prominent location takes account of the direct sociometric choices made and choices received (outdegrees and indegrees), as well as the indirect ties with other actors. The two basic prominence classes: – Centrality: the actor has high involvement in many relations, regardless of send/receive directionality (volume of activity) – Prestige: the actor receives many directed ties, but initiates few relations (popularity > extensivity). Source: Wasserman & Faust, “Social Network Analysis” (W&F)
  50. 50. Centrality: Eigenvector Centrality. Eigenvector centrality goes back to Phillip Bonacich (1972); his power centrality, discussed below, was introduced in 1987. “Google's workhorse search engine ranking algorithm, PageRank, is actually a variant on an SNA concept - Bonacich Power Centrality. – Bonacich (1987) hypothesized that someone's power in society depends on the power of his or her social contacts. Bonacich formalized this mathematically: $c_i = B\,(c_1 R_{i1} + c_2 R_{i2} + \dots + c_n R_{in})$, where $c_i$ is the person in question, $B$ is the magnitude of the effect, and $R_{ij}$ is the strength of the relationship between the person in question, $i$, and each of the other people, $j$, under consideration. If $B = 1$, the formula becomes eigenvector centrality, of which PageRank is a variant. Now, Page, et al. (1998) do not cite Bonacich, I am not claiming that they stole the idea - I am merely stating that a social network analyst appears to me to have been the first to think up the concept”. Solomon Messing, http://www.stanford.edu/~messing/RforSNA.html
  51. 51. Centrality and the network flow methods. Most centrality measures are based on a network flow process “that focuses on the outcomes for nodes in a network where something is flowing from node to node across the edges” (Borgatti and Everett, 2006). We interpret this “something” as a relevancy measure; for instance, the initial seed input value marks the nodes of interest in the network. Propagating the relevancy measure through outgoing links allows us to compute the relevancy measure for other network nodes and to rank these nodes dynamically according to it. The same paradigm could be used to address centrality measurement in social network analysis: centralisation of the network can be achieved when we assume that all the nodes are equally important, and iteratively recompute the relevancy measure based on the connections between nodes.
  52. 52. Master Equation, Numerical Solution. Bonacich Power Centrality, Eigenvector Centrality, Google’s PageRank – “Google's workhorse search engine ranking algorithm, PageRank, is actually a variant on an SNA concept - Bonacich Power Centrality. Bonacich (1987) hypothesized that someone's power in society depends on the power of his or her social contacts. Bonacich formalized this mathematically: $c_i = B\,(c_1 R_{i1} + c_2 R_{i2} + \dots + c_n R_{in})$, where $c_i$ is the person in question, $B$ is the magnitude of the effect, and $R_{ij}$ is the strength of the relationship between the person in question, $i$, and each of the other people, $j$, under consideration. If $B = 1$, the formula becomes eigenvector centrality, of which PageRank is a variant. Now, Page, et al. (1998) do not cite Bonacich, I am not claiming that they stole the idea - I am merely stating that a social network analyst appears to me to have been the first to think up the concept”. Solomon Messing, http://www.stanford.edu/~messing/RforSNA.html
  53. 53. Master Equation, Numerical Solution, Computation. The master equation easily leads us to a numerical solution.
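One common way to get from the master equation $c_i = B\,(c_1 R_{i1} + \dots + c_n R_{in})$ to numbers is power iteration: start with all actors equally important and repeatedly recompute each score from the neighbours' scores. The sketch below is a generic illustration, not code from the talk; rescaling after every round stands in for the constant $B$, and the 4-actor matrix `R` is hypothetical.

```python
def eigenvector_centrality(R, iterations=100):
    """Power iteration for c_i = B * sum_j R[i][j] * c[j], rescaled every round.

    R is a square list of lists of non-negative tie strengths.
    """
    n = len(R)
    c = [1.0] * n  # start by treating all actors as equally important
    for _ in range(iterations):
        new_c = [sum(R[i][j] * c[j] for j in range(n)) for i in range(n)]
        top = max(new_c)
        if top == 0:
            return new_c  # network without ties: nothing to propagate
        c = [x / top for x in new_c]  # rescale so the largest score is 1
    return c

# Hypothetical network: actor 0 is tied to everyone, actors 1 and 2 also to each other.
R = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(R)
print([round(s, 3) for s in scores])
```

Actor 3 has only one tie, but that tie is to the best-connected actor, so its score is lower than the others without being negligible: power depends on the power of one's contacts.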
  54. 54. It is great to have “the right master equation”! What is the shape of a hanging chain? – What is the shape of a hanging chain when supported at its ends and acted on only by its own weight? Plotting geometric arrangements and forces acting on small segments of the chain; integrating the results.
  55. 55. It is great to have “the right master equation”! What is the shape of a hanging chain? What is the shape of a hanging chain when supported at its ends and acted on only by its own weight? • Galileo: “This chain will assume the form of a parabola”, $y = x^2$. Plotting geometric arrangements and forces acting on small segments of the chain; integrating the results.
  56. 56. It is great to have “the right master equation”! What is the shape of a hanging chain? What is the shape of a hanging chain when supported at its ends and acted on only by its own weight? • Galileo: “This chain will assume the form of a parabola”, $y = x^2$ • But the shape is different: $y = \frac{a}{2}\left(e^{x/a} + e^{-x/a}\right)$, which was established later by applying calculus. Plotting geometric arrangements and forces acting on small segments of the chain; integrating the results. “In 1669, Jungius disproved Galileo's claim that the curve of a chain hanging under gravity would be a parabola (MacTutor Archive). The curve is also called the alysoid and chainette. The equation was obtained by Leibniz, Huygens, and Johann Bernoulli in 1691 in response to a challenge by Jakob Bernoulli”. http://mathworld.wolfram.com/Catenary.html [Figures: Leibniz's solution (left), Huygens' illustration (right)]
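A quick numerical comparison shows why the parabola was a tempting guess. The sketch below evaluates the catenary formula from this slide next to the parabola that has the same lowest point and the same curvature there, $y = a + x^2/(2a)$; this matching parabola and the function names are choices made for the illustration.

```python
import math

def catenary(x, a):
    """Hanging chain: y = (a/2) * (e^(x/a) + e^(-x/a)), i.e. a * cosh(x/a)."""
    return (a / 2) * (math.exp(x / a) + math.exp(-x / a))

def parabola(x, a):
    """Parabola with the same lowest point and curvature: y = a + x^2 / (2a)."""
    return a + x * x / (2 * a)

for x in (0.1, 1.0, 3.0):
    print(f"x={x}: catenary={catenary(x, 1.0):.4f}, parabola={parabola(x, 1.0):.4f}")
```

Near the lowest point the two curves are almost indistinguishable; further out the catenary rises much faster.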
  57. 57. “Plotting geometric arrangements and forces acting on small segments” evolved into – Finite difference method • In mathematics, finite-difference methods are numerical methods for approximating the solutions to differential equations using finite difference equations to approximate derivatives. – Stencil • In mathematics, especially in the areas of numerical analysis concentrating on the numerical solution of partial differential equations, a stencil is a geometric arrangement of a nodal group that relates to the point of interest by using a numerical approximation routine. Stencils are the basis for many algorithms to numerically solve partial differential equations.
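As a small illustration of a stencil, the sketch below approximates derivatives with the standard three-point central differences and uses them to check that the catenary with a = 1, y = cosh(x), satisfies the hanging-chain equation y'' = sqrt(1 + y'^2). The function names and the step size `h` are assumptions made for the example.

```python
import math

def first_derivative(f, x, h=1e-3):
    """Central difference for f'(x): stencil (-1, 0, 1) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-3):
    """Central difference for f''(x): stencil (1, -2, 1) / h^2."""
    return (f(x - h) - 2 * f(x) + f(x + h)) / (h * h)

x = 0.5
lhs = second_derivative(math.cosh, x)
rhs = math.sqrt(1 + first_derivative(math.cosh, x) ** 2)
print(lhs, rhs)
```

Both sides agree to within the discretisation error of the stencil, which shrinks as `h` is reduced (until rounding errors take over).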
  58. 58. Numerical Solution, NO Master Equation. “Integrating” evolved into … – Well, in financial mathematics solutions are tuned on “stencils”. Numerical solutions are known. The master equation is not known, and is not interesting to know. “The master equation is not known” – this is OK. – But we need to be aware of emergent effects in complex systems: learning how to do something right on a small scale doesn’t necessarily imply that we’ll do the right things on a bigger scale.
  59. 59. Leibniz, Huygens, and Johann Bernoulli knew geometry and mechanics. We don't know the “geometry” and “mechanics” of techno-social systems (and we don’t even know the “geometry” and “mechanics” of semantic networks, social networks, …), but we can create small “nodal arrangements” modeling multidimensional networks (for instance, folksonomies), apply known and novel numerical algorithms, and utilize state-of-the-art knowledge to decide which algorithms provide better results. The next step is to check whether good properties of the numerical solutions on the micro-level hold true on the meso-level. Source: Troussov at MITACS Workshop in Vancouver, Canada, 2010
  60. 60. Recommender systems and global/local ranking. Link analysis is frequently employed for ranking and navigation. Graph-based recommender systems should recommend “important” objects (nodes, links, subgraphs) which are also – close enough to the initial points of interest (query, focus, initial seed), for instance in physical space. Global ranking ~ PageRank. Breadth-first search (BFS)? Local ranking!? Recommending a suitable restaurant near New York’s 9th Avenue (next slide), or the music you might like, the advertisement you should see, etc.
  61. 61. Graphics: http://strangemaps.wordpress.com/2007/02/07/72-the-world-as-seen-from-new-yorks-9th-avenue/
  62. 62. Global Ranking (like Google’s PageRank) – a view of the network from an external point – the modern, “Copernican” approach. Source: NOAA
  63. 63. Local Ranking – is needed for recommenders – should rely on an ego-centered, Ptolemaic view (actually poly-centered, see next slide). Ego-centered or “personal” networks provide Ptolemaic views of networks from the perspective of the persons (egos) at their centers. Graphics: http://strangemaps.wordpress.com/2007/02/07/72-the-world-as-seen-from-new-yorks-9th-avenue/
  64. 64. Poly-Centric. In physical space, navigation is from one point to another. In applications to virtual spaces, navigation is not simply browsing from a single object to another, but dealing with several objects at the same time. For instance, to get better results in Google we add terms, we remove terms, … To compute the recommendation “Whom to invite to the meeting”, one can start navigation from two objects: the user whom the recommendation is for and the meeting in question.
  65. 65. Graph-based recommender systems should recommend “important” objects (nodes) which are also located close to the initial points of interest (query, initial seed). One of the leading approaches in recommenders is to “filter” the results of global ranking (link analysis) according to their proximity to the query. In this paper we introduce novel algorithms which could replace the two-step procedure mentioned above with a single step: local ranking, which simultaneously computes proximity and importance.
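To make the idea of a single-step local ranking concrete, the sketch below uses a well-known stand-in technique, random walk with restart (personalised PageRank), started from several seeds at once. It is not the novel algorithm the slide refers to; the graph, the node names and the `restart` value are hypothetical.

```python
def local_rank(neighbours, seeds, restart=0.15, iterations=50):
    """Random walk with restart: scores mix importance with proximity to the seeds."""
    nodes = list(neighbours)
    seed_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(seed_mass)
    for _ in range(iterations):
        nxt = {n: restart * seed_mass[n] for n in nodes}
        for n in nodes:
            out = neighbours[n]
            if not out:
                # Node without links: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * rank[n] / len(seeds)
                continue
            share = (1 - restart) * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank

# Hypothetical network: "hub" has the most links but is far from the two seeds.
toy_graph = {
    "user": ["meeting", "alice"],
    "meeting": ["user", "alice", "bob"],
    "alice": ["user", "meeting"],
    "bob": ["meeting", "hub"],
    "hub": ["bob", "x", "y", "z"],
    "x": ["hub"], "y": ["hub"], "z": ["hub"],
}
ranking = local_rank(toy_graph, {"user", "meeting"})
for node, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
    print(f"{node:8s} {score:.3f}")
```

Starting from two seeds at once ("user" and "meeting") reflects the poly-centric view: "alice", who is close to both, outranks the better-connected but distant "hub".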
  66. 66. Web and Communities. Communities in social sciences: a tribe learning to survive, a group of engineers working on similar problems, … Communities in computer sciences: any empirically found group of people. “Recent advances in digital technologies invite consideration of organizing as a process that is accomplished by global, flexible, adaptive, and ad hoc networks that can be created, maintained, dissolved, and reconstituted with remarkable alacrity”. Prof. N. Contractor
  67. 67. Community detection … but what is a community? Are you Russian? Yes. Are you Irish? Yes. Are you a mathematician? Yes. Are you a practitioner? Yes. – Communities easily overlap: multiple membership and fuzzy belonging. At the same time, some communities SHOULD be kept separate – Remember “Strange Case of Dr Jekyll and Mr Hyde” (Robert Louis Stevenson, 1886). • How Google failed to understand an essential property of real-world social networks • By testing their social service inside a single context (Google employees only), the developers failed to notice that in real life people participate in multiple contexts (family, work, friends, etc.) that they work actively to keep separate. The reasons for wanting to keep these groups separate can range from wanting to keep an illicit affair secret from your spouse to political activists in oppressive regimes wanting to keep certain connections secret from the government. Another important reason to keep our communities separate is that we often play different roles, and communicate differently. http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html
  68. 68. New methods for community detection are needed. Multiple membership – Are you Russian? Yes. Are you Irish? Yes. Are you a mathematician? Yes. Are you a practitioner? Yes. … Fuzzy belonging – We don’t know the social structures behind on-line “communities”: members of an on-line community don’t necessarily have the sense of identity that members of real-life social communities have; on-line communities could be project teams or networks of knowledge, … High performance and scalability (agglomerative, local, …) – Clustering as simple partitioning is ruled out because of multiple membership – Clustering as partitioning is not possible in real time for many business applications • IBM Intranet: 400K employees, 10K on-line communities (the biggest with 23K members), ... Contextualisation of community detection – Collaborative filtering systems provide recommendations based on the detection of like-minded users. But the user of a techno-social system whom the prediction is for could be a “Mathematician”, “Irish”, etc., or a kind of Dr. Jekyll / Mr. Hyde person (see next slide).
  69. 69. An example of clustering around a node using propagation
  70. 70.
  71. 71. Future work in local dynamic clustering. Troussov et al., “Vectorised Spreading Activation” (2010), theorize that the future development of spreading activation (SA) methods might be driven by “physics-inspired” and “logic-inspired” algorithms – SA algorithms have roots in the numerical simulation of various physical phenomena, particularly by finite difference methods. – On the other hand, the iterative procedure of SA is essentially the same as the procedure that determines the new state of a cell in cellular automata such as Conway’s Game of Life. Although cellular automata usually operate on rectangular (cubic, etc.) grids, the extension to arbitrary networks is feasible. ~ Marker propagation, MajorClust, the Chinese Whispers graph clustering algorithm, …
  72. 72. Conway’s Game of Life
  73. 73. Conway’s Game of Life
  74. 74. Conway’s Game of Life
  75. 75. Logic-inspired VSA. Finite difference approximations to differential equations were one of the precursors of cellular automata (Stephen Wolfram, “A New Kind of Science”) and of the method of spreading activation (Troussov et al. 2009). The iterative computational procedures in cellular automata are the same as in SA. This identity of the computational procedures makes it possible to develop VSA algorithms with hybrid operations over the components of the activation vector. – For instance, “physical” operations could be responsible for the propagation of the activation around the initial seeds; the level of the activation indicates the relevancy of the nodes to the initial seeds. – “Logical” operations could propagate markers, which indicate the potential belonging of nodes to clusters. Such hybrid operations combine ranking with clustering, and are computationally efficient on massive networks, since the major time-consuming operation, the retrieval of nodes, serves both the “physical” and the “logical” operations. The clustering does not involve partitioning of the whole network.
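A hybrid update of this kind might look as follows. This is a toy sketch, not the VSA algorithm of Troussov et al.: the update rules, the `decay` and `threshold` parameters, and the four-node chain are assumptions chosen only to show one neighbour retrieval serving both a numeric ("physical") and a set-valued ("logical") component.

```python
def hybrid_step(neighbours, activation, markers, decay=0.5, threshold=0.1):
    """One synchronous update of a two-component node state.

    Physical part: each node receives a decayed share of its neighbours' activation.
    Logical part: each node inherits cluster markers from sufficiently active neighbours.
    """
    new_activation, new_markers = {}, {}
    for node, nbrs in neighbours.items():
        received = 0.0
        inherited = set(markers[node])
        for nb in nbrs:  # a single pass over the neighbours serves both operations
            received += decay * activation[nb] / len(neighbours[nb])
            if activation[nb] >= threshold:
                inherited |= markers[nb]
        new_activation[node] = activation[node] + received
        new_markers[node] = inherited
    return new_activation, new_markers

# Hypothetical chain a - b - c - d with two seeds carrying different markers.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
act0 = {"a": 1.0, "b": 0.0, "c": 0.0, "d": 1.0}
mark0 = {"a": {"A"}, "b": set(), "c": set(), "d": {"D"}}

act1, mark1 = hybrid_step(chain, act0, mark0)
act2, mark2 = hybrid_step(chain, act1, mark1)
print(mark1["b"], mark2["b"])
```

After the second step the middle nodes carry both markers, which illustrates overlapping (non-partitioning) cluster membership alongside a relevancy score.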
  76. 76. VSA & Marker propagation – combining ranking with clustering [Figure labels: My University; An Expert; A topic I’m interested in]
  77. 77. VSA & Clustering (Cont.)
  78. 78. VSA & Clustering (Cont.)
  79. 79. VSA & Clustering (Cont.)
  80. 80. [Figure labels: My University; An Expert; A topic I’m interested in]
  81. 81. Tasks / Methods. Various terminology in various domains (for instance, from the point of view of IM many tasks fall into the category of hidden knowledge discovery)
      Multidimensional network point of view (A.T.): tasks | Techno-Social Systems | Networks Theory and Graph Theory terminology
      Centralisation | Recommender systems, PageRank etc., Expertise location | Random walks, Eigenvector centrality
      Local topology | Recommender systems, Link prediction | Motifs
      Ad hoc generalisation across dimensions | Expertise location, Recommender systems | Clustering
  82. 82. Tasks. Avenues to deep socio-semantic analytics, and the possibility of high-quality functionalities for techno-social systems (like recommending people to invite into your social network), hinge on the availability of engines which are able to provide hidden knowledge discovery, such as: • the structural importance of nodes • discovering a new relation in a network (for instance, concluding from the strength of multiple connectivity between the nodes of a social network that Dr. Jekyll is related to Mr. Hyde) • ad hoc generalisation across dimensions, for instance the ability to detect that a particular person might serve as a representative of a community or as an expert on a particular topic (an example of such generalisation is the expression frequently attributed to Louis XIV, “L’État, c’est moi” (“I am the State”)).
  83. 83. “Three steps away”? [Figure labels: John B., Axel P., Dan B., Tim B.] Why did the recommender decide that this three-steps-away connection is a strong connection?
  84. 84. John and Tim – The recommender computes that this is a strong connection because there are multiple ways of connecting them. Shortest path vs. volume of traffic. [Figure labels: Friends-of-Friends; Interest; Workplace]
  85. 85. John and Derek – The recommender computes that this type of connectivity is a weak connection.
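The contrast between the two cases, shortest path versus volume, can be illustrated by counting how many distinct short routes connect two people. The sketch below is only an illustration of that idea and not the recommender's actual scoring; the network and the intermediate node names are hypothetical.

```python
def count_paths(neighbours, source, target, max_len):
    """Count simple paths from source to target that use at most max_len edges."""
    def walk(node, remaining, visited):
        if node == target:
            return 1
        if remaining == 0:
            return 0
        return sum(
            walk(nxt, remaining - 1, visited | {nxt})
            for nxt in neighbours[node]
            if nxt not in visited
        )
    return walk(source, max_len, {source})

# Hypothetical network: Tim and Derek are both three steps away from John.
people = {
    "John": ["friend_1", "friend_2", "colleague", "acquaintance"],
    "friend_1": ["John", "club"],
    "friend_2": ["John", "club"],
    "colleague": ["John", "office"],
    "club": ["friend_1", "friend_2", "Tim"],
    "office": ["colleague", "Tim"],
    "Tim": ["club", "office"],
    "acquaintance": ["John", "stranger"],
    "stranger": ["acquaintance", "Derek"],
    "Derek": ["stranger"],
}
print(count_paths(people, "John", "Tim", 3))
print(count_paths(people, "John", "Derek", 3))
```

Both targets are at the same graph distance from John, but several independent routes lead to Tim and only one to Derek, which is the kind of evidence a flow-based method accumulates.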
  86. 86. Tasks: Generalisation Across Domains – Whom is Claudia connected with? All of these people. [Figure labels: Dirk, Martin, Claudia, Elaine, Researcher, John, Hanna]
  87. 87. Ranking [Figure: nodes labelled with ranks 2, 1, 3]
  88. 88. Ranking [Figure: nodes labelled with ranks 3, 1, 2]
  89. 89. Ranking [Figure: nodes labelled with ranks 1, 2, …, …]
  90. 90. Nepomuk Recommender. NEPOMUK (Networked Environment for Personalized, Ontology-based Management of Unified Knowledge) is an open-source software specification concerned with the development of a social semantic desktop that enriches and interconnects data from different desktop applications using semantic metadata stored as RDF. Initially, it was developed in the EU 6th Framework integrated project Nepomuk (2006-2008), with a budget of 17 million euros, of which 11.5 million was funded by the European Union.
  91. 91. Nepomuk Recommender (Cont.) Troussov et al., “Social Context as Machine Processable Knowledge”, presented the architecture of the hybrid recommender system in the activity-centric environment Nepomuk-Simple (EU 6th Framework Project NEPOMUK). “Real” desktops usually have piles of things on them, where the users (consciously or unconsciously) have grouped together items which are related to each other or to a task. The so-called “Pile” UI used in Nepomuk-Simple imitates this type of data and metadata organisation, which helps to avoid premature categorisation and reduces the retention of useless documents. Metadata describing the user data are stored in the Nepomuk personal information management ontology (PIMO). Proper recommendations, such as recommendations of additional items to add to the pile, should apparently be based on the PIMO and on the textual content of the items in the pile. Although methods of natural language processing for information retrieval could be useful, the most important type of textual processing is that which allows concepts in PIMO to be related to the processed texts. Since PIMO changes over time, this type of natural language processing can’t be performed as preprocessing of all textual context related to the user. Hybrid recommendation needs on-the-fly textual processing with the ability to aggregate the current instantiation of PIMO with the results of textual processing.
  92. 92. Nepomuk. Representing and modeling this ontology as a multidimensional network makes it possible to augment the ontology on the fly with new information, such as the “semantic” content of the textual information in user documents. Recommendations in Nepomuk-Simple are computed on the fly by graph-based methods operating on the unified multidimensional network of concepts from the personal information management ontology, augmented with concepts extracted from the documents pertaining to the activity in question. Troussov et al. 2008 classify Nepomuk-Simple recommendations into two major types. – The first type is the recommendation of additional items for the pile, when the user is working on an activity. – The second type arises, for instance, when the user is browsing the Web; Nepomuk-Simple can recommend that the current resource might be relevant to one or more activities performed by the user. In both cases there is a need to operate with Clouds (fuzzy sets of PIMO nodes): Clouds describe the topicality of documents in terms of PIMO, and the pile itself is a Cloud.
  93. 93. Pile UI
  94. 94. Nepomuk use case: activity management. A user starts to work on a new project, CID. Using the Nepomuk SSD, she collects a “pile” of resources she needs while working on the project (MS-Word documents, contacts, etc.) by dragging and dropping resources from her desktop and by linking resources from e-mail (Mozilla Thunderbird) and web browser (Firefox) applications.
  95. 95. Nepomuk use case: activity management using the IBM recommender codenamed “Galaxy”. Galaxy (IBM hybrid recommender) analyses the pile content and linkage structure as a multidimensional network of concepts extracted from documents and links between concepts, projects, project participants, meetings, document authors, …, and provides handy recommendations of resources she might possibly need.
  96. 96. Nepomuk use case: activity management. Galaxy can spot what the user might miss: “This web page might be relevant to your CID activity”
  97. 97. Thank you!