Fusion of Libraries and the Web: Subject-based Information Retrieval in the Web 2.0 Era


The 2008 annual meeting of the Committee on Japanese Materials (CJM), Council on East Asian Libraries (CEAL), Association for Asian Studies (AAS), Hyatt Regency Atlanta, Georgia, USA, April, 2008.


  1. Fusion of Libraries and the Web: Subject-based Information Retrieval in the Web 2.0 Era
     Digital Library Research Division, Information Technology Center, University of Tokyo
     Yoji Kiyota (Assistant Professor)
  2. Agenda
     1. What is happening in the information retrieval world in the Web 2.0 era?
     2. Toward Lib 2.0: what is needed to enhance the value of libraries?
     3. A solution: fusion of folksonomy and taxonomy
     4. Application to reference services: Littel Navigator
  3. Agenda (as on slide 2)
  4. History of information access methods
     • Until the 20th century
       – To research in libraries
       – To go to bookstores
       – To ask someone
       – To ask consultation services (librarians, government offices, consultants, …)
       – To look up Web directories, BBS, databases, …
     • From the 21st century, in addition:
       – To google (Web search engines)
       – To access Q&A sites, SNS, folksonomy, …
     The so-called Web 2.0 paradigm
  5. What is Web 2.0?
     Definition by Tim O’Reilly (2005)
     • The Web as platform
     • Harnessing collective intelligence
     • Data is the next Intel Inside
     • End of the software release cycle
     • Lightweight programming models
     • and more…
     Just a buzzword? No, if we choose one of the definitions
  6. What is Web 2.0?
     Definition by Tim O’Reilly (2005)
     • The Web as platform
     • Harnessing collective intelligence
       – Improvements to web search engines
       – Folksonomy: organization by the wisdom of crowds
     • Data is the next Intel Inside
     • End of the software release cycle
     • Lightweight programming models
  7. Possibility of collective intelligence as an information access method
     • The broadest gateway to information
       – Any keyword gets hits on Google
       – The best way to find cues?
     • Diversity of information
       – Exceeds the diversity of mass media
     • Self-organization of information
       – Improvements to web search engines
         • PageRank: democracy on the Web
       – Folksonomy: meta-tagging based on the wisdom of crowds
  8. Folksonomy = “folks” + “taxonomy”
     Meta-tagging by ordinary people
     • Web services spreading since 2005: Flickr, del.icio.us, YouTube, …
     • Every participant assigns tags to content from his or her own viewpoint
     • As a result, diverse tags are assigned to each piece of content
     • Example approaches by libraries
       – Ann Arbor District Library “Social OPAC”
  9. Photos searched by “B747” using Flickr
     [Screenshot labels: tags assigned to each content; related tags (co-occurrence)]
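The “related tags (co-occurrence)” label on this slide can be illustrated with a small sketch. This is not Flickr's actual algorithm: the photo data and the function name `related_tags` are made up for the example, and relatedness is assumed to mean simply "how often a tag appears on the same photo as the query tag".

```python
from collections import Counter

# Hypothetical tag sets for four photos (illustrative data, not from Flickr).
PHOTO_TAGS = [
    {"b747", "airplane", "airport"},
    {"b747", "airplane", "jal"},
    {"b747", "airport", "narita"},
    {"airplane", "sky"},
]


def related_tags(query, photo_tags):
    """Count the tags that co-occur with `query` on the same photo."""
    counts = Counter()
    for tags in photo_tags:
        if query in tags:
            # Every other tag on this photo co-occurs once with the query.
            counts.update(tags - {query})
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(related_tags("b747", PHOTO_TAGS))
```

Because every participant tags from a personal viewpoint, raw counts like these are noisy; a real service would presumably normalize them, which this sketch leaves out.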
  10. [Image-only slide; no text in the transcript]
  11. [Image-only slide; no text in the transcript]
  12. Wikipedia categories
      • A multilingual encyclopedia edited by ordinary people
      • Categories are assigned to each article
        – Categories are regarded as folksonomy tags
        – Multi-level folksonomy
        – Multiple hypernyms (parent categories)
      Example article: Price
        “Price in economics and business is the assigned numerical monetary value of a good, service or asset. The concept of price is central to microeconomics where it is one of the most important variables in resource allocation theory (also called price theory). …”
        Categories: [marketing] [economics] [market]
      [Diagram: category hierarchy above “price”, with the nodes society, social studies, economy, labor, industry, business, skills, commerce, business science, logistics, marketing, economics, market, price]
  13. Two classification paradigms
      Folksonomy                              Taxonomy
      • Bottom-up approach                    • Top-down approach
      • Multiple hypernyms (parents)          • Only one hypernym (parent)
      Classification suitable                 Classification suitable
      for Web resources                       for library materials
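The structural difference between the two paradigms (a term may sit under several broader categories in a folksonomy, but under exactly one in a taxonomy) can be sketched as a check on a parent map. The category names come from the “Price” example and the NDC codes from the classification diagram in this deck; the dictionary layout and the function names are assumptions made for illustration.

```python
# Child -> list of parents. The layout is an assumption for illustration.
FOLKSONOMY = {
    "price": ["marketing", "economics", "market"],  # several parents
}
TAXONOMY = {
    "453.2": ["453"],  # each NDC class has exactly one broader class
    "453": ["450"],
    "450": ["400"],
}


def is_strict_hierarchy(parents):
    """True if every node has at most one parent (a tree-like taxonomy)."""
    return all(len(parent_list) <= 1 for parent_list in parents.values())


def ancestors(node, parents):
    """Return every node reachable by following parent links upward."""
    seen = set()
    stack = [node]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(is_strict_hierarchy(FOLKSONOMY), is_strict_hierarchy(TAXONOMY))
print(sorted(ancestors("453.2", TAXONOMY)))
```

In the taxonomy the ancestors form a single chain, which is what makes shelf browsing possible; in the folksonomy they fan out, which suits Web resources that belong to several topics at once.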
  14. Limitations of folksonomy
      Browsing-oriented
      • Floods of information without validation
      • Prefers new information to old information
        – How many web pages survive 10 years? cf. web.archive.org
      • Shallow organization
        – Meta noise of folksonomy
      • Lack of methods for evaluation and validation
      Can libraries complement the limitations of folksonomy?
  15. Agenda (as on slide 2)
  16. Toward Lib 2.0
      Fusion of libraries and the Web
      • To provide access to library holdings
        – Guiding users from the Web to libraries
        – Enables validation of information on the Web
      • To provide viewpoints on information
        – Various methods, including reference books, dictionaries, …
      The role of library classification systems is very important!
  17. Library classification systems (gaps between ideal and reality)
      • Classification (UDC, LC, DDC, …)
        – Ideal: browsing organized bookshelves is useful for information retrieval
        – Reality: lack of browsing methods online (cf. OPAC)
      • Subject headings (LCSH, …)
        – Ideal: concept-based search of catalogs
        – Reality: lack of flexibility
      A high hurdle for ordinary people?
  18. A hint: pathfinders
      • First developed at the MIT library (in the 1970s)
      • The most in-demand information resources at the beginning of information retrieval?
      • Manually created by librarians
        – Low coverage
      Can we create pathfinders automatically?
  19. Possibilities of automating pathfinder creation
      • Templates are available
      • Subject headings and library classifications play the major role in pathfinder generation
        ↓
        Possible if subject headings can be estimated for query keywords
      • Databases of reference books, elementary books, …
      A problem: too few subject headings!
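The template-based generation described on this slide could look roughly like the sketch below, assuming a subject heading and a classification code have already been estimated for the query. The template wording, the section names, and the function `make_pathfinder` are hypothetical and do not reproduce any particular library's pathfinder format.

```python
from string import Template

# Hypothetical pathfinder template; the section names are assumptions.
PATHFINDER_TEMPLATE = Template(
    "Pathfinder: $heading\n"
    "Browse the shelves at: $classification\n"
    "Reference books: $reference\n"
    "Elementary books: $elementary\n"
)


def make_pathfinder(heading, classification, reference, elementary):
    """Fill the template once a subject heading has been estimated."""
    return PATHFINDER_TEMPLATE.substitute(
        heading=heading,
        classification=classification,
        # An empty list joins to "", so fall back to a visible marker.
        reference=", ".join(reference) or "(none found)",
        elementary=", ".join(elementary) or "(none found)",
    )


# Placeholder titles; a real system would look them up in book databases.
print(make_pathfinder("Seismology", "453", ["Reference book A", "Reference book B"], []))
```

The hard part is not filling the template but supplying `heading` and `classification` for an arbitrary keyword, which is the shortage of subject headings that the slide points out.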
  20. Agenda (as on slide 2)
  21. A solution: fusion of Wikipedia categories and library classifications
      Expansion of classifications using Wikipedia
      • Wikipedia: folksonomy-based organization
        – The most organized information resource on the Web
        – Compatibility with other Web resources
      • Library classifications: taxonomy-based organization
        – Reflect the knowledge structure of human history
        – Powerful tools for in-demand information resources
      Automated subject induction from query keywords through Wikipedia categories and subject headings -> recommendation of useful information
  22. [Diagram: Wikipedia categories linked to library classification (NDC) and subject headings (BSH), using the Great Hanshin Earthquake as the example]
      • Library classification (NDC): Social sciences (300) > Economics (330) > Economic history (332) > Economic history - Japan (332.1); Society (360) > Social welfare (369); Natural sciences (400) > Earth sciences (450) > Seismology (453) > 453.2; Technique (500) > Architecture (520) > Architecture structure (524)
      • Library resources and subject headings shown: List of earthquakes in Japan, Encyclopedia of earthquake hazard management, Journals on earthquakes, Dictionary of Economics, Prediction of earthquakes, Seismic hazard, Earthquake construction, Hazard history
      • Wikipedia categories shown: Economic history, Economic history of Japan, Hazards, History of hazard management, History of earthquakes, Earthquake, Earthquakes as seismic waves, Heisei era, Influences over hazard management, Impacts over economy of Japan, Great Hanshin Earthquake
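The subject induction shown in this diagram can be sketched as a breadth-first walk up the Wikipedia category graph that stops at categories linked to a library classification. This is a minimal illustration, not the deck's actual implementation: the labels and NDC codes are taken from the diagram, but the edges in `CATEGORY_PARENTS`, the links in `CATEGORY_TO_NDC`, and the function `induce_subjects` are assumptions.

```python
from collections import deque

# Hypothetical fragment of the Wikipedia category graph (node -> parents).
# The labels appear in the diagram; the edges are assumed for illustration.
CATEGORY_PARENTS = {
    "Great Hanshin Earthquake": ["History of earthquakes", "Economic history of Japan"],
    "History of earthquakes": ["Earthquake"],
}

# Assumed links from Wikipedia categories to NDC classes.
CATEGORY_TO_NDC = {
    "Earthquake": "453",
    "Economic history of Japan": "332.1",
}


def induce_subjects(keyword, max_depth=3):
    """Walk up the category graph; return {NDC code: induction path}."""
    found = {}
    visited = {keyword}
    queue = deque([(keyword, [keyword])])
    while queue:
        node, path = queue.popleft()
        code = CATEGORY_TO_NDC.get(node)
        if code is not None and code not in found:
            found[code] = path  # breadth-first, so this is a shortest path
        if len(path) > max_depth:
            continue  # do not climb more than max_depth steps from the keyword
        for parent in CATEGORY_PARENTS.get(node, []):
            if parent not in visited:
                visited.add(parent)
                queue.append((parent, path + [parent]))
    return found


for ndc, path in induce_subjects("Great Hanshin Earthquake").items():
    print(ndc, " <- ".join(reversed(path)))
```

Keeping the whole path rather than only the matched class lets the interface show why a theme was suggested, and the depth limit keeps very general categories from matching every query.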
  23. [Diagram: from the Web to libraries]
      • Start point of retrieval: information resources on the Web (folksonomy)
      • Wikipedia
      • Library classification and subject headings
      • Labels on the links: reliability of information retrieval; integration for deeper retrieval
      • Information resources in libraries: elementary books, reference books, journals, past literature
  24. Agenda (as on slide 2)
  25. What is Littel Navigator?
      A search engine for hints for information retrieval = a pathfinder generator
      • Fusion of various information sources
        – Subject headings, library classifications, reference books
        – Web sites, Wikipedia, ...
      • Automated theme induction from any keyword
        – Traversal of the Wikipedia network
      • Navigation from vague keywords to specific materials
        – Stored history of user inputs
  26. Hanshin-Awaji Great Earthquake (1995)
      Input keywords related to what you want to know
  27. Induced themes related to the keywords
      [Screenshot labels: information resources related to earthquake; induction paths of the themes]
      • earthquake <- history of earthquake
      • economy <- history of economy <- history of Japan’s economy
      • seismology <- earthquake <- history of earthquake
      • disaster <- history of disaster prevention <- history of earthquake
  28. Reports for the 10th anniversary of the Hanshin-Awaji earthquake, Kobe University
  29. A journal for seismology (volumes in a library, OPAC)
  30. Meta search of various databases
  31. A search result of Google Scholar (articles on the Hanshin-Awaji Earthquake)
  32. A search result of JapanKnowledge (encyclopedia)
  33. Associated keywords
  34. History of Information Retrieval
  35. Conclusion
      • Fusion of library classification systems and the Web 2.0 paradigm enables “a search engine of hints for information retrieval”
        – Various viewpoints for information retrieval
        – Navigation from Web resources to reliable resources
      • They complement each other
        – Web: any keyword can be used as a cue
        – Library: reliability and organization
