The Structure of Computer Science Knowledge Network
  • Transcript

    • 1. The Structure of the Computer Science Knowledge Network. Manh Cuong Pham, Ralf Klamma. Information Systems and Database Technology, RWTH Aachen, Germany. ASONAM 2010, Odense, Denmark, August 09, 2010
    • 2. Agenda
      • Introduction
      • SNA as a knowledge discovery method
      • Data sets: DBLP and CiteSeerX
      • Network visualization
      • Venue ranking
      • Conclusions and Outlook
    • 3. Introduction
      • Digital libraries (in computer science)
        • DBLP, ACM DL, IEEE Xplore, CiteSeerX, etc.
        • Digital media for scientific knowledge conservation
          • Publications
          • Venues
        • Development of research communities & research areas
        • Knowledge discovery: Citation analysis, usage-analysis, etc.
        • Digital libraries in Web 2.0: Mendeley, ResearchGate, etc.
      • Problems
        • Structure of computer science knowledge
        • Existing research fields
        • The interconnection between fields
      [Figures: VLDB community in 2006 (DBLP); VLDB community in 1990 (DBLP)]
    • 4. Motivations
      • Scientometrics
        • Unit of analysis: journals
        • Knowledge mapping: building, visualizing and analyzing the knowledge network
        • Methods:
          • Citation analysis [Boyack 2005]
          • Content analysis
          • Log-data (usage data) analysis [Bollen 2009]
        • Data sets:
          • Journal Citation Reports (JCR)
          • Science Citation Index (SCI)
          • Social Science Citation Index (SSCI), etc.
      • Problem
        • Computer science conferences
    • 5. Our Approach
      • Combination of large-scale digital libraries
        • DBLP
        • CiteSeerX
      • Citation analysis
        • Bibliographic coupling at venue level (conferences, journals)
        • Similarity measures
      • SNA as a knowledge discovery method
        • Visual analytics
        • Cluster analysis
        • SNA measures: PageRank, betweenness, hub and authority scores, etc.
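To make the PageRank measure named above concrete, here is a minimal sketch of a power-iteration PageRank on a small weighted, directed graph. The slides do not say which implementation or parameters were used; the function name, the damping factor of 0.85, the iteration count, and the toy edge list are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over (source, target, weight) triples.

    Assumes every weight is positive. The damping factor and iteration
    count are illustrative defaults, not values taken from the slides.
    """
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    n = len(nodes)

    # Total outgoing weight per node, used to normalise each node's votes.
    out_weight = {node: 0.0 for node in nodes}
    for src, _, w in edges:
        out_weight[src] += w

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src, dst, w in edges:
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical toy graph: an edge (X, Y, w) means "X cites Y w times".
toy_edges = [("A", "C", 1.0), ("B", "C", 1.0), ("C", "A", 1.0)]
scores = pagerank(toy_edges)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the node cited by both others ends up first in `ranking`, which is the intuition behind reading a high PageRank as prestige.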
    • 6. Data Sets
      • DBLP (http://www.informatik.uni-trier.de/~ley/db/)
        • 788,259 author names
        • 1,226,412 publications
        • 3,490 venues (conferences, workshops, journals)
      • CiteSeerX (http://citeseerx.ist.psu.edu/)
        • 7,385,652 publications (including publications in reference lists)
        • 22,735,240 citations
        • Over 4 million author names
      • Combination
        • Canopy clustering [McCallum 2000]
        • Result: 864,097 matched pairs
        • On average, a venue cites 2,306 times and is cited 2,037 times
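The combination step above names canopy clustering as the matching technique. The sketch below shows the general idea on publication titles: a cheap distance groups records into canopies, and only records sharing a canopy would then go through a more expensive comparison. The slides give no detail on the features or thresholds used, so the token-set Jaccard distance, the `loose` and `tight` values, and the function names are assumptions. As a simplification, each canopy is drawn only from the records still in the working list.

```python
import re


def tokens(title):
    """Lowercased word set used as a cheap representation of a title."""
    return frozenset(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def canopies(titles, loose=0.7, tight=0.3):
    """Group title indices into canopies using two distance thresholds.

    loose: records within this distance of the centre join its canopy.
    tight: records within this distance are removed from the working list.
    Both values are illustrative assumptions.
    """
    if tight > loose:
        raise ValueError("tight threshold must not exceed loose threshold")
    token_sets = [tokens(t) for t in titles]
    remaining = list(range(len(titles)))
    result = []
    while remaining:
        center = remaining[0]
        dist = {i: jaccard_distance(token_sets[center], token_sets[i])
                for i in remaining}
        result.append([i for i in remaining if dist[i] <= loose])
        # The centre has distance 0 to itself, so the list always shrinks.
        remaining = [i for i in remaining if dist[i] > tight]
    return result


# Hypothetical records standing in for entries from the two libraries.
sample_titles = [
    "The Structure of the Computer Science Knowledge Network",
    "Structure of Computer Science Knowledge Network",
    "Efficient clustering of high-dimensional data sets",
]
groups = canopies(sample_titles)
```

Here the two near-identical titles fall into one canopy and the unrelated title into another, so a detailed string comparison would only be needed for the first pair.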
    • 7. Network Creation and Pre-processing
      • Knowledge network
        • Aggregate bibliographic coupling counts at venue level
        • Undirected graph G(V, E), where V: venues, E: edges weighted by cosine similarity
        • Threshold:
        • Clustering: density-based algorithm [Newman 2004, Clauset 2004]
        • Network visualization: force-directed paradigm [Fruchterman 1991]
      • Knowledge flow network
        • Aggregate bibliographic coupling counts at venue level
        • Threshold: citation counts >= 50
        • Domains from Microsoft Academic Search (http://academic.research.microsoft.com/)
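The knowledge network construction above can be sketched as follows: each venue is represented by a vector of how often it cites each target, venue pairs are scored by the cosine similarity of those vectors, and only pairs above a threshold become edges. This is one plausible reading of the slide, not the authors' code. The threshold value is not preserved in the transcript, so the 0.5 used here is a placeholder, and the venue names and counts are invented toy data.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two sparse count vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_network(reference_counts, threshold):
    """Return {(venue_a, venue_b): similarity} for pairs at or above threshold.

    reference_counts maps a venue to {cited_target: count}. The threshold
    is a caller-supplied assumption; the slide's value is not available.
    """
    edges = {}
    for a, b in combinations(sorted(reference_counts), 2):
        similarity = cosine(reference_counts[a], reference_counts[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges


# Hypothetical toy data: three venues and the targets they cite.
toy_counts = {
    "VenueA": {"p1": 2, "p2": 1},
    "VenueB": {"p1": 2, "p2": 1},
    "VenueC": {"p3": 4},
}
network = build_network(toy_counts, threshold=0.5)
```

With this toy input only the VenueA and VenueB pair survives the cut, because VenueC shares no cited target with the others and has similarity 0.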
    • 8. Knowledge Network: the Visualization
    • 9. Knowledge Network: Clustering
    • 10. Interdisciplinary Venues: Top Betweenness Centrality
    • 11. High Prestige Series: Top PageRank
    • 12. Conclusions and Future Research
      • SNA helps to gain insight into the structure of computer science knowledge
      • Knowledge network in computer science
        • Highly clustered, large clusters form the core of computer science research
        • Research fields are interconnected
        • Interdisciplinary venues
      • Outlook
        • More digital libraries should be integrated: ACM, IEEE, CEUR-WS.org, etc.
        • Usage analysis
        • Dynamic analysis of knowledge network
    • 13. Questions? http://bosch.informatik.rwth-aachen.de:5080/AERCS/
