[Power point presentation]


  1. 1. Data Mining – another tool for librarians
     George Beckett
     Health Sciences Library, Memorial University of Newfoundland
     May 14, 2006
     CHLA/ABSC 2006 Conference
  2. 2. This talk is about …
     • Data mining and visualization techniques to aid the user in identifying relevant citations or interesting relationships in the sea of information
     • Exploring illustrative applications
     • Role of librarians
     • Strengths and weaknesses of these techniques
     • It is NOT about analysis of library operations
  3. 3. The challenge
     Rapidly growing search results from database and web searches that make effective assessment and ranking of results difficult
     PLUS
     Searchers untrained in searching techniques who want it fast and easy! The Google generation?
  4. 4. Search results ascendant!
     Search                       PubMed hits   Google hits
     Heel pain                    976           8,370,000
     Breast cancer treatment      78,578        58,800,000
     Health care policy Canada    2,341         203,000,000
  5. 5. Roles for Librarians
     • Librarians must be able to teach their clients to use data mining and visualization tools as they become more common and needed
     • Understanding these tools, their strengths and weaknesses, is needed for mastery of effective information retrieval
  6. 6. Definitions …
     • Data mining - "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data" [1] and "The science of extracting useful information from large data sets or databases" [2]
     • Text mining, also known as intelligent text analysis, text data mining or knowledge-discovery in text (KDT), refers generally to the process of extracting interesting and non-trivial information and knowledge from unstructured text [3]
     • Web mining and web usage mining is the application of data mining techniques to discover usage patterns from the Web in order to better understand and serve the needs of users or Web-based applications [4]
  7. 7. Definitions … (2)
     • Information visualization is the use of interactive, visual representations of abstract data to aid cognition [5]
     • Federated searching consists of transforming a query and broadcasting it to a group of disparate databases with the appropriate syntax, merging the results collected from the databases, presenting them in a succinct and unified format with minimal duplication, and allowing the library patron to sort the merged result set by various criteria [6]
  8. 8. www.clusty.com
     • Web federated search engine
     • Uses clustering to organize results
     • Uses text visualization techniques to make clustered results more accessible
     • Can act as a front end to selected sources such as PubMed using keyword searching
  9. 15. PUBMED only
  10. 16. www.kosmix.com
     • Web information search engine in alpha version
     • Concentrates on health but will expand coverage to other areas
     • Uses data mining and basic visualization to categorize and display topics
  11. 19. RefViz™
     • Bibliographic citation analysis tool - $150 US
     • Clusters citations and uses graphic visualizations to organize information
     • Useful for identifying major concept areas and possible linkages between concepts
     • Can customize analysis weightings for specific subject areas
     • Prefers working with citations with abstracts
  12. 25. Wrapping up …
     • Data mining is useful but is not a gold-plated solution to all searching problems
     • Most useful for:
        • generalist searchers who do not use or know about specialized database search features
        • a simplified front end to other search engines
        • organizing results in federated searching
        • detecting relationships in search results
        • in-depth analysis of a static set of search results
  13. 26. Wrapping up … (2)
     • Not as effective as database-specific tools at refining a search
     • The underlying search and retrieval process is often not obvious
     • Data mining is a set of skills and techniques that librarians must understand and be able to teach to their clients
  14. 27. Links of interest
     • Clusty.com - http://clusty.com
     • Kosmix.com - http://www.kosmix.com
     • RefViz.com - http://www.refviz.com
     • Grokker.com - http://www.grokker.com
     • Katy Börner - http://ella.slis.indiana.edu/~katy/research
     • Kdnuggets - http://www.kdnuggets.com
     • Visual Search option on EBSCOhost research databases
  15. 28. References
     1. W. Frawley, G. Piatetsky-Shapiro, C. Matheus, “Knowledge Discovery in Databases: An Overview”, AI Magazine, Fall 1992 (213-228).
     2. D. Hand, H. Mannila, P. Smyth, Principles of Data Mining. MIT Press, Cambridge, MA, 2001.
     3. “Text mining”, Wikipedia, The Free Encyclopedia. 15 Mar 2006 <http://en.wikipedia.org/wiki/Text_mining>
     4. “Web mining”, Wikipedia, The Free Encyclopedia. 15 Mar 2006 <http://en.wikipedia.org/wiki/Web_mining>
     5. “Information visualization”, Wikipedia, The Free Encyclopedia. 15 Mar 2006 <http://en.wikipedia.org/wiki/Information_visualization>
     6. “Federated search”, Wikipedia, The Free Encyclopedia. 15 Mar 2006 <http://en.wikipedia.org/wiki/Federated_search>
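
The definition of federated searching on slide 7 describes a small pipeline: transform the query, broadcast it, merge the results, remove duplicates, and sort. The sketch below illustrates those steps in Python and is not taken from the talk. Everything in it is an assumption made for illustration: the two in-memory "databases", their query syntaxes (`heel AND pain` versus `+heel +pain`), the `Citation` and `Source` types, and the title-based duplicate check. Real databases have their own APIs and far messier records.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Citation:
    title: str
    year: int
    source: str


@dataclass
class Source:
    name: str
    translate: Callable[[List[str]], str]     # query terms -> this source's syntax
    search: Callable[[str], List[Citation]]   # run the translated query


def boolean_and(terms: List[str]) -> str:
    """Hypothetical syntax for source A: terms joined with AND."""
    return " AND ".join(terms)


def plus_prefix(terms: List[str]) -> str:
    """Hypothetical syntax for source B: each term prefixed with +."""
    return " ".join("+" + t for t in terms)


def make_search(records, parse):
    """Build a toy search function over an in-memory list of citations."""
    def search(query: str) -> List[Citation]:
        wanted = parse(query)
        return [r for r in records if all(t in r.title.lower() for t in wanted)]
    return search


def normalize(title: str) -> str:
    """Duplicate-detection key: lowercase, letters and digits only."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def federated_search(terms, sources, sort_key=lambda c: -c.year):
    merged = {}
    for src in sources:
        query = src.translate(terms)          # transform the query
        for hit in src.search(query):         # broadcast it to each source
            # merge, keeping the first copy of each duplicate title
            merged.setdefault(normalize(hit.title), hit)
    return sorted(merged.values(), key=sort_key)  # patron-chosen ordering


# Illustrative records only; not real search results.
RECORDS_A = [Citation("Heel pain in runners", 2004, "A"),
             Citation("Breast cancer treatment", 2005, "A")]
RECORDS_B = [Citation("Heel Pain in Runners.", 2004, "B"),
             Citation("Managing heel pain", 2006, "B")]

SOURCE_A = Source("A", boolean_and, make_search(
    RECORDS_A, lambda q: [t.lower() for t in q.split(" AND ")]))
SOURCE_B = Source("B", plus_prefix, make_search(
    RECORDS_B, lambda q: [t.lstrip("+").lower() for t in q.split()]))

if __name__ == "__main__":
    for c in federated_search(["heel", "pain"], [SOURCE_A, SOURCE_B]):
        print(c.year, c.title, f"(from {c.source})")
```

Passing a different `sort_key`, for example `lambda c: c.title.lower()`, corresponds to the patron re-sorting the merged set by another criterion. Matching duplicates on a normalized title is deliberately crude; it is one reason the merged view can hide how each underlying database actually searched.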
