The Network Data Structure in Computing

Transcript

  • 1. The Network Data Structure in Computing. Marko A. Rodriguez, Los Alamos National Laboratory, Vrije Universiteit Brussel, http://cnls.lanl.gov/~marko
  • 2. About me.
    • Marko Antonio Rodriguez.
    • Bachelor of Science in Cognitive Science from U.C. San Diego.
    • Minor in the Arts in Computer Music from U.C. San Diego.
    • Master of Science in Computer Science from U.C. Santa Cruz.
    • Visiting Researcher at the Center for Evolution, Complexity, and Cognition at the Free University of Brussels.
    • Ph.D. in Computer Science from U.C. Santa Cruz.
    • Researcher at the Los Alamos National Laboratory since 2005.
  • 3. Research trends.
    • MESUR: Metrics from Scholarly Usage of Resources. (http://www.mesur.org)
    • Neno/Fhat: A Semantic Network Programming Language and Virtual Machine Architecture. (http://neno.lanl.gov)
    • CDMS: Collective Decision Making Systems. (http://cdms.lanl.gov)
  • 4. What is a network?
    • A network is a data structure that is used to connect vertices/nodes/dots by means of edges/links/lines.
    • Networks are everywhere.
      • Social: friendship, trust, communication, collaboration.
      • Technological: web-pages, communication, software dependencies, circuits.
      • Scholarly: journals, authors, articles, institutions.
      • Natural: protein interaction, neural, food web.
  • 5. The undirected network.
    • There is the undirected network of common knowledge.
      • Sometimes called an undirected single-relational network.
      • e.g. vertex i and vertex j are “related”.
    • The semantics of the edge determines the network type.
      • e.g. friendship network, collaboration network, etc.
    (Figure: vertices i and j joined by an undirected edge.)
  • 6. Example undirected network. (Figure; vertices: Herbert, Marko, Aric, Ed, Zhiwu, Alberto, Jen, Johan, Luda, Stephan, Whenzong.)
  • 7. The directed network.
    • Then there is the directed network of common knowledge.
      • Sometimes called a directed single-relational network.
      • For example, vertex i is related to vertex j, but j is not related to i.
    (Figure: a directed edge from vertex i to vertex j.)
  • 8. Example directed network. (Figure; vertices: Muskrat, Bear, Fish, Fox, Meerkat, Lion, Human, Wolf, Deer, Beetle, Hyena.)
  • 9. The semantic network.
    • Finally, there is the semantic network.
      • Sometimes called a directed multi-relational network.
      • For example, vertex i is related to vertex j by the semantic s, but j is not related to i by the semantic s.
    (Figure: a directed edge labeled s from vertex i to vertex j.)
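The three network types described so far differ only in what an edge stores. The following is a minimal sketch under that reading; the container choices and helper names are assumptions made for illustration, and i, j, and s are the placeholder names used on the slides.

```python
# Hypothetical toy data; the vertex and label names are placeholders.

# Undirected single-relational network: an edge is an unordered pair.
undirected = {frozenset({"i", "j"})}

# Directed single-relational network: an edge is an ordered pair (tail, head).
directed = {("i", "j")}

# Semantic (directed multi-relational) network: an edge is a labeled
# ordered pair, i.e. a triple (tail, label, head).
semantic = {("i", "s", "j")}


def related_undirected(a, b):
    """True if a and b are joined by an edge, in either order."""
    return frozenset({a, b}) in undirected


def related_directed(a, b):
    """True only if there is an edge from a to b."""
    return (a, b) in directed


def related_semantic(a, label, b):
    """True only if there is an edge from a to b carrying this label."""
    return (a, label, b) in semantic
```

With this data, `related_undirected("j", "i")` holds, while `related_directed("j", "i")` and `related_semantic("j", "s", "i")` do not, which mirrors the "j is not related to i" wording above.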
  • 10. Example semantic network. (Figure; vertices: SantaFe, Marko, NewMexico, Ryan, California, UnitedStates, LANL, Cells, Atoms, Oregon, Arnold; edge labels: livesIn, worksWith, cityOf, originallyFrom, stateOf (twice), locatedIn, hasLab, madeOf (twice), researches, southOf, hasResident, governerOf, northOf.)
  • 11. Google’s PageRank.
    • PageRank
      • Used to rank web-pages that are connected by citation (hyper-link).
    Note: this image was stolen off the web from somewhere.
  • 12. The components to calculate a stationary probability distribution.
    • Take a single “random walker”.
    • Place that random walker on any random vertex in the network.
    • At every time step, the random walker transitions from its current node to an adjacent node in the network (i.e., it takes a random outgoing edge from its current node).
    • Anytime the random walker is at a node, increment a “times visited” counter by 1.
    • Let this algorithm run for an “infinite” amount of time.
    • Normalize the “times visited” counters.
      • That is your centrality vector.
    (Figure: vertex a with a “times visited” counter of 1 and a normalized value of 0.0123.)
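The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's code: the four-vertex network, the function name, and the fixed step count standing in for an "infinite" run are all assumptions. It also assumes every vertex has at least one outgoing edge, so the walker never gets stuck.

```python
import random
from collections import Counter

# Hypothetical directed network: vertex -> vertices reachable by one
# outgoing edge. The edge set is an assumption made for this sketch.
GRAPH = {
    "a": ["b"],
    "b": ["c", "d"],
    "c": ["a", "b"],
    "d": ["c"],
}


def random_walk_centrality(graph, steps, seed=0):
    """Estimate the stationary distribution by counting vertex visits."""
    rng = random.Random(seed)
    current = rng.choice(sorted(graph))       # place the walker on a random vertex
    visits = Counter({v: 0 for v in graph})
    for _ in range(steps):
        visits[current] += 1                  # increment the "times visited" counter
        current = rng.choice(graph[current])  # take a random outgoing edge
    total = sum(visits.values())
    return {v: visits[v] / total for v in graph}  # normalize the counters


centrality = random_walk_centrality(GRAPH, steps=200_000)
```

For this particular network the estimates should settle near 1/6 for a and d and near 1/3 for b and c; a longer walk tightens the estimate.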
  • 13. Random walker example. a c b d 0 0 0 0
  • 14. Random walker example. a c b d 1 0 0 0
  • 15. Random walker example. a c b d 1 0 1 0
  • 16. Random walker example. a c b d 1 0 1 1
  • 17. Random walker example. a c b d 1 1 1 1
  • 18. Random walker example. a c b d 1 1 2 1
  • 19. Random walker example. a c b d 1 2 2 1
  • 20. Random walker example. a c b d 2 2 2 1
  • 21. Random walker example. a c b d 2 2 3 1
  • 22. Random walker example. a c b d 2 2 3 2
  • 23. Random walker example. a c b d 2 3 3 2
  • 24. Random walker example. a c b d 2 3 4 2
  • 25. Random walker example. a c b d 66785 133310 133321 66784
  • 26. Random walker example. a c b d 0.167 0.333 0.333 0.167
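The normalized counters approximate a stationary distribution, which can also be computed without simulation. The sketch below uses power iteration on a hypothetical four-vertex directed network. The edge set is an assumption: it is one network consistent with the order in which the counters increment on the slides, since the drawn edges do not survive in this transcript. It also normalizes the raw counters from slide 25 for comparison.

```python
# Hypothetical 4-vertex directed network (an assumption for this sketch).
GRAPH = {
    "a": ["b"],
    "b": ["c", "d"],
    "c": ["a", "b"],
    "d": ["c"],
}


def stationary_distribution(graph, iterations=200):
    """Power iteration: repeatedly push probability mass along out-edges."""
    prob = {v: 1.0 / len(graph) for v in graph}  # start from a uniform guess
    for _ in range(iterations):
        nxt = {v: 0.0 for v in graph}
        for v, targets in graph.items():
            share = prob[v] / len(targets)       # split mass evenly over out-edges
            for t in targets:
                nxt[t] += share
        prob = nxt
    return prob


exact = stationary_distribution(GRAPH)

# Raw "times visited" counters from slide 25, in the order shown there.
counts = [66785, 133310, 133321, 66784]
normalized = [c / sum(counts) for c in counts]
```

For this assumed network the exact values are 1/6, 1/3, 1/3, and 1/6, and the normalized slide counters agree with them to about three decimal places.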
  • 27. Breather.
  • 28. Example semantic network. (Same figure as slide 10.)
  • 29. What is the Semantic Web?
    • The figurehead of the Semantic Web initiative, Tim Berners-Lee, describes the Semantic Web as
      • “... an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
    • Perhaps not the best definition: it implies a particular application space, namely the “web metadata and intelligent agents” space.
    • My definition is that the Semantic Web is
      • “a distributed, standardized semantic network data model, a URG (Uniform Resource Graph). It’s a uniform way of graphing resources.”
  • 30. What is a resource?
    • Resource = Anything.
      • Anything that can be identified.
    • The Uniform Resource Identifier (URI):
      • <scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]
        • http://www.lanl.gov
        • urn:uuid:550e8400-e29b-41d4-a716-446655440000
        • urn:issn:0892-3310
        • http://www.lanl.gov#MarkoRodriguez
          • Prefix it to make it easier on the eyes: lanl:MarkoRodriguez
    • The Semantic Web
      • “first identify it, then relate it!”
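To make the URI syntax and the prefix shorthand concrete, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The `PREFIXES` table and the `expand` and `shorten` helpers are hypothetical names introduced for illustration; only the lanl: prefix and the example URIs come from the slide.

```python
from urllib.parse import urlsplit

# Hypothetical prefix table; only the lanl: prefix appears on the slide.
PREFIXES = {"lanl": "http://www.lanl.gov#"}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as lanl:MarkoRodriguez into a full URI."""
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


def shorten(uri, prefixes=PREFIXES):
    """Replace a known namespace with its prefix; leave other URIs unchanged."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri


# Split two of the slide's URIs into scheme, authority, path, and fragment.
for uri in ["http://www.lanl.gov#MarkoRodriguez", "urn:issn:0892-3310"]:
    parts = urlsplit(uri)
    print(parts.scheme, parts.netloc, parts.path, parts.fragment)
```

The first URI splits into the scheme `http`, the authority `www.lanl.gov`, and the fragment `MarkoRodriguez`. The URN has only a scheme (`urn`) and a path (`issn:0892-3310`).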
  • 31. The technologies of the Semantic Web.
    • Resource Description Framework (RDF): The foundation technology of the Semantic Web. RDF is a highly-distributed, semantic network data model. In RDF, URIs and literals (e.g. ints, doubles, strings) are related to one another in triples.
      • <lanl:marko> <lanl:worksWith> <lanl:jhw>
      • <lanl:jhw> <lanl:wrote> <lanl:LAUR-07-2028>
      • <lanl:LAUR-07-2028> <lanl:hasTitle> “Web-Based Collective Decision Making Systems”^^<xsd:string>
    • RDF Schema (RDFS): The ontology is to the Semantic Web as the schema is to the relational database.
      • “Anything of rdf:type lanl:Human can lanl:drive anything of rdf:type lanl:Car.”
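Because RDF is a set of triples, a semantic network can be held in memory as a set of 3-tuples and queried by pattern. The sketch below stores the three triples from this slide as plain Python tuples. The tuple representation and the `match` helper are assumptions made for illustration, not the API of any RDF library.

```python
# The three triples from the slide, written as (subject, predicate, object).
# Plain tuples are an assumption of this sketch, not an RDF library API.
TRIPLES = {
    ("lanl:marko", "lanl:worksWith", "lanl:jhw"),
    ("lanl:jhw", "lanl:wrote", "lanl:LAUR-07-2028"),
    ("lanl:LAUR-07-2028", "lanl:hasTitle",
     '"Web-Based Collective Decision Making Systems"^^xsd:string'),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Two-hop walk: what did the people lanl:marko works with write?
written = [
    doc
    for _, _, colleague in match(TRIPLES, s="lanl:marko", p="lanl:worksWith")
    for _, _, doc in match(TRIPLES, s=colleague, p="lanl:wrote")
]
```

Here `written` contains the single resource `lanl:LAUR-07-2028`, reached by following a lanl:worksWith edge and then a lanl:wrote edge.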
  • 32. RDF and RDFS.
    • Ontology: lanl:isEating has rdfs:domain lanl:Human and rdfs:range lanl:Food.
    • Instance: lanl:marko lanl:isEating lanl:cookie, where lanl:marko is of rdf:type lanl:Human and lanl:cookie is of rdf:type lanl:Food.
    • RDF is not a syntax. It’s a data model. Various syntaxes exist to encode RDF, including RDF/XML, N-Triples, TriX, N3, etc.
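The relationship between the ontology and instance layers can be sketched as a check of instance triples against declared domains and ranges. Two assumptions apply. First, the triples below are a reading of the slide's figure labels, whose edge layout does not survive in this transcript. Second, treating rdfs:domain and rdfs:range as constraints is a simplification that follows the "schema" analogy; RDFS itself uses them to infer types rather than to reject data. The helper names are hypothetical.

```python
# Instance and ontology triples as read from the slide's figure labels.
# The exact edge layout is an assumption of this sketch.
INSTANCE = {
    ("lanl:marko", "rdf:type", "lanl:Human"),
    ("lanl:cookie", "rdf:type", "lanl:Food"),
    ("lanl:marko", "lanl:isEating", "lanl:cookie"),
}
ONTOLOGY = {
    ("lanl:isEating", "rdfs:domain", "lanl:Human"),
    ("lanl:isEating", "rdfs:range", "lanl:Food"),
}


def types_of(resource, triples):
    """Collect every rdf:type asserted for a resource."""
    return {o for s, p, o in triples if s == resource and p == "rdf:type"}


def violations(instance, ontology):
    """List instance triples whose subject or object lacks the declared type."""
    domain = {s: o for s, p, o in ontology if p == "rdfs:domain"}
    range_ = {s: o for s, p, o in ontology if p == "rdfs:range"}
    bad = []
    for s, p, o in sorted(instance):
        if p in domain and domain[p] not in types_of(s, instance):
            bad.append((s, p, o))
        elif p in range_ and range_[p] not in types_of(o, instance):
            bad.append((s, p, o))
    return bad
```

The instance data above passes the check, whereas adding a triple in which lanl:cookie is the subject of lanl:isEating would be reported, because lanl:cookie is not typed as lanl:Human.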
  • 33. PageRank in a semantic network? lanl:marko lanl:p1 lanl:wrote lanl:johan lanl:wrote ? lanl:chuck lanl:hasFriend lanl:Article rdf:type rdf:type lanl:Human rdf:type rdf:type ? ?
  • 34. Components of a grammar-based walker.
    • A walker.
      • Discrete element.
    • A grammar.
      • An abstract representation of the legal paths the walker may take.
        • e.g. “you can traverse a lanl:friendOf edge from a lanl:Human to another lanl:Human.”
        • Also includes rules: “increment a counter”, “don’t ever return to this vertex”.
    • A data set that respects the ontological “expectations” of the grammar.
  • 35. Grammar-based PageRank example.
    • Figure labels: lanl:marko, lanl:johan, lanl:chuck, lanl:p1, lanl:Human, lanl:Article; edges lanl:wrote (twice), lanl:hasFriend, rdf:type (four times).
    • Grammar: “Take only a lanl:wrote out-edge to a resource of rdf:type lanl:Article. Then take a lanl:wrote in-edge to a resource of rdf:type lanl:Human. Increment only lanl:Humans. Make sure that the lanl:Human reached is not the same as the lanl:Human seen before. Repeat infinitely.”
    • Counters: 0 0 0
  • 36. Grammar-based PageRank example (same figure and grammar). Counters: 1 0 0
  • 37. Grammar-based PageRank example (same figure and grammar). Counters: 1 0 0
  • 38. Grammar-based PageRank example (same figure and grammar). Counters: 1 0 1
  • 39. Grammar-based PageRank example (same figure and grammar). Counters: 1 0 1
  • 40. Grammar-based PageRank example (same figure and grammar). Counters: 2 0 1
  • 41. Grammar-based PageRank example (same figure and grammar). Counters: 2 0 1
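The grammar quoted on these slides can be sketched as a constrained walk over triples. This is a minimal sketch under stated assumptions, not the presenter's implementation. The triples are read off the figure labels, the endpoints of the lanl:hasFriend edge are a guess, the function names are hypothetical, and a fixed step count stands in for "repeat infinitely".

```python
import random
from collections import Counter

# Triples read off the slide's figure labels. The endpoints of the
# lanl:hasFriend edge are an assumption; they are not recoverable here.
TRIPLES = {
    ("lanl:marko", "lanl:wrote", "lanl:p1"),
    ("lanl:johan", "lanl:wrote", "lanl:p1"),
    ("lanl:marko", "lanl:hasFriend", "lanl:chuck"),
    ("lanl:marko", "rdf:type", "lanl:Human"),
    ("lanl:johan", "rdf:type", "lanl:Human"),
    ("lanl:chuck", "rdf:type", "lanl:Human"),
    ("lanl:p1", "rdf:type", "lanl:Article"),
}


def has_type(resource, rdf_type):
    return (resource, "rdf:type", rdf_type) in TRIPLES


def coauthor_walk(start, steps, seed=0):
    """Walk Human -wrote-> Article <-wrote- Human, counting only Humans."""
    rng = random.Random(seed)
    humans = sorted(s for s, p, o in TRIPLES
                    if p == "rdf:type" and o == "lanl:Human")
    visits = Counter({h: 0 for h in humans})
    current = start
    for _ in range(steps):
        visits[current] += 1  # increment only lanl:Human resources
        # Rule 1: take only a lanl:wrote out-edge to a lanl:Article.
        articles = sorted(o for s, p, o in TRIPLES
                          if s == current and p == "lanl:wrote"
                          and has_type(o, "lanl:Article"))
        if not articles:
            break  # the grammar offers no legal move from here
        article = rng.choice(articles)
        # Rule 2: take a lanl:wrote in-edge to a different lanl:Human.
        authors = sorted(s for s, p, o in TRIPLES
                         if o == article and p == "lanl:wrote"
                         and has_type(s, "lanl:Human") and s != current)
        if not authors:
            break
        current = rng.choice(authors)
    return visits


visits = coauthor_walk("lanl:marko", steps=1000)
```

On this data the walker alternates between lanl:marko and lanl:johan, so lanl:chuck's counter stays at 0. That matches the slides, where one of the three counters never increments.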
  • 42. Grammars create implicit relationships. lanl:marko lanl:p1 lanl:wrote lanl:johan lanl:wrote lanl:chuck lanl:hasFriend lanl:Article rdf:type rdf:type lanl:Human rdf:type rdf:type lanl:hasCoauthor
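The implicit relationship on this slide can be materialized directly: two humans who wrote the same article are joined by a derived lanl:hasCoauthor edge. In the sketch below the function name is hypothetical, and emitting the derived edge in both directions is an assumption, since the transcript does not show the direction of the edge in the figure.

```python
from itertools import permutations

# Explicit lanl:wrote edges from the slide's figure labels.
WROTE = {
    ("lanl:marko", "lanl:wrote", "lanl:p1"),
    ("lanl:johan", "lanl:wrote", "lanl:p1"),
}


def implied_coauthors(triples):
    """Derive lanl:hasCoauthor edges from authors who share an article."""
    authors_by_article = {}
    for author, predicate, article in triples:
        if predicate == "lanl:wrote":
            authors_by_article.setdefault(article, set()).add(author)
    return {
        (a, "lanl:hasCoauthor", b)
        for authors in authors_by_article.values()
        for a, b in permutations(sorted(authors), 2)  # ordered pairs, a != b
    }


coauthors = implied_coauthors(WROTE)
```

The result is a single-relational network over humans only, which is the kind of structure the usual network analysis algorithms expect.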
  • 43. Conclusions.
    • Many systems can be represented as a network.
    • The semantic network is a more expressive, though less studied, data model.
    • The grammar technique can be used to port many of the common network analysis algorithms to the semantic network domain.
  • 44. Related publications.
    • Rodriguez, M.A., Watkins, J.H., Bollen, J., Gershenson, C., “Using RDF to Model the Structure and Process of Systems”, International Conference on Complex Systems, Boston, Massachusetts, LAUR-07-5720, October 2007.
    • Rodriguez, M.A., Bollen, J., Van de Sompel, H., “A Practical Ontology for the Large-Scale Modeling of Scholarly Artifacts and their Usage”, 2007 ACM/IEEE Joint Conference on Digital Libraries, pages 278-287, Vancouver, Canada, ACM/IEEE Computing, doi:10.1145/1255175.1255229, LA-UR-07-0665, June 2007.
    • Rodriguez, M.A., “Social Decision Making with Multi-Relational Networks and Grammar-Based Particle Swarms”, 2007 Hawaii International Conference on Systems Science (HICSS), pages 39-49, Waikoloa, Hawaii, IEEE Computer Society, ISSN: 1530-1605, doi:10.1109/HICSS.2007.487, LA-UR-06-2139, January 2007.
    • Rodriguez, M.A., “A Multi-Relational Network to Support the Scholarly Communication Process”, International Journal of Public Information Systems, volume 2007, issue 1, pages 13-29, ISSN: 1653-4360, LA-UR-06-2416, March 2007.
    • Rodriguez, M.A., “Mapping Semantic Networks to Undirected Networks”, LA-UR-07-5287, August 2007.
    • Rodriguez, M.A., Watkins, J.H., “Grammar-Based Geodesics in Semantic Networks”, LA-UR-07-4042, June 2007.
    • Rodriguez, M.A., Bollen, J., “Modeling Computations in a Semantic Network”, LA-UR-07-3678, May 2007.
    • Rodriguez, M.A., “General-Purpose Computing on a Semantic Network Substrate”, LA-UR-07-2885, April 2007.
    • Rodriguez, M.A., “Grammar-Based Random Walkers in Semantic Networks”, Knowledge-Based Systems, Elsevier, LA-UR-06-7791, in press, 2007.