CiteSearch:
Multi-faceted Fusion Approach to Citation Analysis
Kiduk Yang and Lokman Meho
  • These are old figures. Also, total counts seem suspect.
    Old figures:
    Table 2: 3019 for WoS and 2564 for Scopus are total counts, not unique counts.
    Table 8: 5259 total, 5018 unique for all document types (where does 5,680 come from?)
  • N=5307
    A+B+C=1104 (20.8%)
    D+E+F=1642 (30.9%)
    A+D=438
    C+E=721
    B+F=1587
    1. CiteSearch: Multi-faceted Fusion Approach to Citation Analysis
       Kiduk Yang and Lokman Meho
       Web Information Discovery Integrated Tool Laboratory
       Keimyung University, Korea
       American University of Beirut, Lebanon
       October 27, 2010
    2. CiteSearch: What, Why, & How
       • Goal
          • Quality Assessment of Scholarly Publications
       • Motivation
          • Lack of comprehensive citation database
          • Limitations of conventional citation analysis
             • One-dimensional assessment
             • Misleading evaluation
       • Approach
          • Multi-faceted, Fusion-based Citation Analysis
             • Combine data from multiple citation databases
             • Assess quality using various quality evaluation measures
    3. CiteSearch Study: Overview
       • Objectives
          • Investigate current citation analysis environment
          • Test the viability of CiteSearch system
       • Method
          • Search citation databases and compare the results
       • Setup
          • Study sample: publications of 15 SLIS faculty members (approx. 1,100 publications)
          • Databases used: Google Scholar, Scopus, Web of Science
          • Citation sources: journals and conference papers in 1996-2005
    4. Citation Databases (data as of 2006)
       • Breadth of coverage
          • Web of Science: 36M records; 8,700 titles; journals (240 open access) & conference papers
          • Scopus: 28M records; 15,000 titles; journals (500 open access) & conference papers
          • Google Scholar: 500M records; titles unknown; 30+ document types
       • Coverage years
          • Web of Science: A&HCI: 1975-; SCI: 1900-; SSCI: 1956-
          • Scopus: 1996-present (with cited references); 1966-present (without cited references)
          • Google Scholar: unknown
       • Subject area
          • Web of Science: all; Scopus: all; Google Scholar: all
       • Data collection
          • WoS: 100 hours
          • Scopus: 200 hours
          • GS: over 3,000 hours
    5. Scopus and WoS: Citation Count
       • Scopus vs. WoS
          • 14.0% (278) more citations by Scopus
             • More comprehensive coverage by Scopus (15,000 vs. 8,700 periodicals)
       • Scopus + WoS
          • Scopus increases WoS citations by 35% (710)
          • WoS increases Scopus citations by 19.0% (432)
          • Relatively low overlap (58%) and high uniqueness (42%)
       • Venn diagram values
          • Scopus: 2,301
          • Web of Science: 2,023
          • Both: 58% (1,591)
          • Scopus only: 26% (710)
          • WoS only: 16% (432)
          • Scopus ∪ WoS: 2,733
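The overlap and uniqueness percentages on this slide follow from three counts (each source's total and the number of citations found by both). The sketch below is one way to reproduce them; the function name `fusion_stats` and its dictionary keys are illustrative choices, not part of CiteSearch.

```python
def fusion_stats(count_a, count_b, overlap):
    """Summarize how two citation sources overlap, given raw counts.

    count_a, count_b: unique citations found by source A and source B.
    overlap: citations found by both sources.
    """
    only_a = count_a - overlap
    only_b = count_b - overlap
    union = only_a + only_b + overlap
    return {
        "union": union,
        # Shares of the combined (union) set.
        "overlap_pct": 100 * overlap / union,
        "only_a_pct": 100 * only_a / union,
        "only_b_pct": 100 * only_b / union,
        # Relative gain each source contributes to the other.
        "a_adds_to_b_pct": 100 * only_a / count_b,
        "b_adds_to_a_pct": 100 * only_b / count_a,
    }


# Counts from the slide: Scopus 2,301; WoS 2,023; found by both 1,591.
stats = fusion_stats(count_a=2301, count_b=2023, overlap=1591)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

With the slide's counts, the union comes out to 2,733 and the rounded percentages match the figures reported above.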
    6. Impact of Scopus By Research Area
       • Varies significantly between research areas
    7. Impact of Scopus on Faculty Members Relative Ranking
       • Scopus significantly alters the relative ranking of those faculty members that appear in the middle of the rankings
    8. Scopus + WoS: Citation Count By Document Type (Conference Papers Only)
       • Scopus: 359
       • WoS: 229
       • Both: 18% (92)
       • Scopus only: 54% (267)
       • WoS only: 28% (137)
       • Scopus ∪ WoS: 496
    9. Scopus + WoS: Summary of Results
       • Coverage
          • Varies greatly between research areas
             • Increase in citations ranges from 5% to 99% by combining results from both databases
          • Scopus has a much better coverage of conference proceedings
             • Overlap: 18%
             • Scopus only: 54%
             • WoS only: 28%
       • Ranking by citation count
          • Relative ranking of faculty members changes significantly for those in the middle
    10. Google Scholar Citations By Document Type
    11. Citations By Language
    12. Impact of GS By Research Area
    13. Impact of GS on Faculty Members Relative Ranking
       • GS does not significantly alter the rankings of faculty members
    14. GS vs. Scopus∪WoS
       • GS increases WoS∪Scopus citations by 93% (2,552)
       • Scopus∪WoS increases GS citations by 26% (1,104)
       • GS identifies 53% (or 1,448) more citations than WoS∪Scopus
       • GS has much better coverage of conference proceedings
          • 1,849 by GS vs. 496 by Scopus∪WoS
       • GS has over twice as many unique citations as Scopus∪WoS
          • 2,552 vs. 1,104, respectively
       • Venn diagram values
          • Google Scholar: 4,181
          • Scopus∪WoS: 2,733
          • Both: 31% (1,629)
          • GS only: 48% (2,552)
          • Scopus∪WoS only: 21% (1,104)
          • GS ∪ Scopus∪WoS: 5,285
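As a quick consistency check, the percentages on this slide can be recomputed from its own counts. The symbols $G$ (Google Scholar) and $U$ (Scopus∪WoS) are shorthand introduced here for the worked arithmetic only.

```latex
\begin{align*}
|G \cup U| &= 2{,}552 + 1{,}629 + 1{,}104 = 5{,}285 \\
\text{gain of } G \text{ over } U &= \frac{|G \setminus U|}{|U|} = \frac{2{,}552}{2{,}733} \approx 93\% \\
\text{gain of } U \text{ over } G &= \frac{|U \setminus G|}{|G|} = \frac{1{,}104}{4{,}181} \approx 26\% \\
\text{extra citations found by } G &= \frac{|G| - |U|}{|U|} = \frac{4{,}181 - 2{,}733}{2{,}733} = \frac{1{,}448}{2{,}733} \approx 53\%
\end{align*}
```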
    15. CiteSearch Study: GS + Scopus + WoS
       • Google Scholar: 4203
       • Scopus: 2308
       • WoS: 2025
       • GS ∪ Scopus ∪ WoS: 5307
       • Venn diagram region values, in slide order: 4.3% (230); 18.3% (970); 48.3% (2561); 11.7% (617); 8.2% (435); 3.8% (204); 5.3% (282)
    16. GS + Scopus∪WoS: Summary of Results
       • Coverage
          • Varies greatly between research areas
             • 23% to 144% increase by combining GS & Scopus∪WoS
             • 5% to 98% increase by combining Scopus & WoS
          • GS has strong coverage in CS & IS
             • HCI, IR, computational linguistics, social informatics
          • Scopus∪WoS has stronger coverage in LS
             • Bibliometrics, collection development, information policy
          • GS provides significantly better coverage of non-English materials
             • GS (7%); Scopus (1%); WoS (1%)
       • Ranking
          • No significant changes in relative ranking of faculty members
    17. Findings
       • Scopus, WoS, and GS complement rather than replace each other
       • GS can be useful in showing evidence of broader international impact than could possibly be shown through Scopus and WoS
       • GS can be very useful for citation searching purposes; however, it is not conducive to large-scale comparative citation analyses
       • Scopus significantly alters the relative citation ranking of scholars as measured by Web of Science; GS does not
    18. Conclusions
       • Multiple sources of citations should be used to generate accurate citation counts and rankings
          • Citation databases complement one another
          • Small overlap between sources may significantly influence relative ranking
       • Multi-faceted citation analysis is needed
          • Citation coverage varies by research area, document type, language
       • CiteSearch can greatly facilitate citation analysis
          • Enormous effort is required to
             • Refine search strategy
             • Parse search results
             • Eliminate noise (duplicate citations)
             • Extract & normalize citation metadata
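To make the "eliminate noise" and "normalize metadata" steps concrete, here is a minimal sketch of merging citation records from several databases and collapsing duplicates. It assumes each record is a dictionary with `title` and `year` fields and that a normalized title plus year is a good enough duplicate key; both are simplifying assumptions for illustration, and the sample records are made up. The slides do not describe how CiteSearch actually matches records.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def citation_key(record):
    """Hypothetical duplicate key: normalized title plus publication year."""
    return (normalize_title(record["title"]), record["year"])


def merge_citations(*sources):
    """Merge (source_name, records) pairs, tracking which sources found each citation."""
    merged = {}
    for source_name, records in sources:
        for record in records:
            entry = merged.setdefault(
                citation_key(record),
                {"title": record["title"], "year": record["year"], "sources": set()},
            )
            entry["sources"].add(source_name)
    return list(merged.values())


# Made-up records: the first two differ only in case and punctuation.
scopus = [{"title": "Citation Analysis: A Review", "year": 2004}]
wos = [
    {"title": "citation analysis - a review", "year": 2004},
    {"title": "Another Paper", "year": 2005},
]
merged = merge_citations(("Scopus", scopus), ("WoS", wos))
for entry in merged:
    print(entry["title"], entry["year"], sorted(entry["sources"]))
```

Keeping the set of sources on each merged record is what makes the per-database overlap and uniqueness counts possible later on.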
    19. CiteSearch System: Work-in-Progress
       • Federated Citation Search
          • To compile comprehensive & usable citation data
             1. Query multiple citation databases
             2. Filter out noise
                • e.g., invalid, duplicate citations
             3. Extract & normalize metadata
                • bibliographical metadata (e.g., title, author, year, source, etc.)
                • citation metadata (e.g., doctype, subject, language, etc.)
       • Multi-faceted Citation Analysis
          • To produce multi-faceted quality/impact assessment measures that
             • account for variance in citation quality (e.g., weighted citation counts, CiteRank)
             • consider various facets of evaluation metric (e.g., document type, language)
             • accommodate different aspects of quality assessment (e.g., H-Index, Mentor-Index)
             1. Compute citation-based quality scores (CQS) for each publication
             2. Compute CQS for authors, schools, publishers using publication CQS
             3. Compute CQS for each publication weighted by author/school/publisher scores
             4. Compute CQS for authors, schools, publishers using weighted publication CQS
             5. Repeat steps 3 and 4 until convergence
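The five CQS steps describe a fixed-point iteration. The slides do not give the scoring formulas, so the sketch below fills them in with assumptions that are purely illustrative: the unweighted publication score is the raw citation count, an author's score is the mean score of their publications, and each incoming citation is weighted by 1 plus the citing author's score relative to the top author. Only authors are modeled (not schools or publishers), each publication has one author, and every citing publication is assumed to be in the same collection.

```python
def aggregate_by_author(publications, pub_score):
    """Author score = mean score of that author's publications (an assumption)."""
    totals, counts = {}, {}
    for pub_id, info in publications.items():
        author = info["author"]
        totals[author] = totals.get(author, 0.0) + pub_score[pub_id]
        counts[author] = counts.get(author, 0) + 1
    return {author: totals[author] / counts[author] for author in totals}


def iterate_cqs(publications, tol=1e-9, max_iter=100):
    # Step 1: unweighted publication score = raw citation count.
    pub_score = {p: float(len(info["cited_by"])) for p, info in publications.items()}
    # Step 2: author scores from publication scores.
    author_score = aggregate_by_author(publications, pub_score)
    for _ in range(max_iter):
        top = max(author_score.values()) or 1.0  # avoid dividing by zero
        # Step 3: re-score publications, weighting each citation by its citing author.
        new_pub = {
            p: sum(1.0 + author_score[publications[c]["author"]] / top
                   for c in info["cited_by"])
            for p, info in publications.items()
        }
        # Step 4: author scores from the weighted publication scores.
        new_author = aggregate_by_author(publications, new_pub)
        delta = max(abs(new_author[a] - author_score[a]) for a in author_score)
        pub_score, author_score = new_pub, new_author
        # Step 5: stop once author scores no longer change.
        if delta < tol:
            break
    return pub_score, author_score


# Made-up collection: p1 is cited by p2 and p3; p2 is cited by p3.
publications = {
    "p1": {"author": "A", "cited_by": ["p2", "p3"]},
    "p2": {"author": "B", "cited_by": ["p3"]},
    "p3": {"author": "C", "cited_by": []},
}
pub_score, author_score = iterate_cqs(publications)
print(pub_score)
print(author_score)
```

The `max_iter` cap is there because this particular weighting is not guaranteed to converge on every citation graph; a production system would need a scoring rule with a convergence argument behind it.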
    20. CiteSearch System: Architecture
    21. End