Qatar Digital Library Project Workshop



The first workshop on the "Qatar Digital Library Project" was held at Qatar University on May 20, 2013.

This project is part of the National Priorities Research Program (NPRP) and is funded by the Qatar National Research Fund (QNRF).

The project is managed by Dr. Edward Fox, the Lead Principal Investigator (LPI), from Virginia Tech and Dr. Mohammed Samaka, the Co-LPI, from the Department of Computer Science and Engineering at Qatar University. It also involves digital library experts such as Dr. C. Lee Giles from Pennsylvania State University and Dr. Richard Furuta from Texas A&M University. The consultants are Dr. John Impagliazzo from Hofstra University in New York, Dr. Susan Lukesh, Carole Thompson, and Robert Laws. The researchers are Myrna Tabet and Asad Nafees from Qatar University, Tarek Kanan from Virginia Tech, and Hamed Alhoori from Texas A&M University.

This workshop is the first in a series of workshops and seminars that present the project and train faculty, students, librarians, and members of the Qatari digital community who are interested in joining the project and expanding the national collections and services.


Published in: Education, Business

Qatar Digital Library Project Workshop

  1. Qatar Digital Library Project
     Monday, 20 May 2013
     HTTP://WWW.TAMU.EDU/
     HTTP://WWW.PSU.EDU/
     HTTP://WWW.QU.EDU.QA/
     HTTP://WWW.VT.EDU/
  2. Qatar Digital Library (QDL) Initiative
     Workshop #1
     Monday, 20 May 2013, 08:45 to 14:00
     Auditorium Room 117, New Library
     Qatar University, Doha, Qatar
  3. Introductions
     Dr. Mohammed Samaka
     College of Engineering
     Qatar University, Doha, Qatar
     Project Co-Lead Principal Investigator
     08:55 – 09:05
  4. Qatar Digital Library Project
     • Global explosion of information:
       o More than 30,000 peer-reviewed research journals exist worldwide
       o 2.5 million articles published per year
     • Knowledge society requires:
       o Deep awareness and access to the best content
       o Real-time research results to assist and improve strategic decision-making
  5. Qatar Digital Library Project Team
     Qatar University, Qatar:
       Mohammed Samaka (Ph.D., Co-Lead PI)
       Myrna Tabet
       Khalid AbualSaud
       Asad Nafees
       Sumaya Ali S A Al-Maadeed (Ph.D., Key Investigator)
     Virginia Tech, USA:
       Edward Fox (Ph.D., Lead-PI)
       Tarek Kanan
     Penn. State University, USA:
       C. Lee Giles (Ph.D., PI)
     Texas A&M, USA:
       Richard Furuta (Ph.D., PI)
       Hamed Alhoori
     Consultants:
       John Impagliazzo (Ph.D., Key Investigator)
       Susan Lukesh (Ph.D.)
       Carole Thompson
       Robert Laws
     This project was made possible by NPRP Grant # 4-029-1-007 from the Qatar National Research Fund (a member of Qatar Foundation).
  6. Project Mission
     Transform the use of information in Qatar, moving toward a knowledge society, in accord with the Qatar National Vision 2030.
  7. Project Objectives/Aims
     A. Research and prototype digital library systems and infrastructure for Qatar, focusing initially on Qatari information related to government and scholarly activities.
        Leverage the crawling engine from Penn State's SeerSuite software infrastructure, and extend it beyond its current focus on English to support Arabic-English collections, and to cover a broad range of scholarly disciplines, and all types of government information.
     B. Research and build the digital library community in Qatar, supporting digital library use, services, collection development, tailored systems, and advancing toward a Knowledge Society.
        Study scholarly activities, and engage in community building in Qatar, so DLs can be tailored to specific domains and to the unique needs of Qatar. Through workshops, a consulting center at the proposed Institute, and collaborative efforts with libraries and museums in Qatar, we will identify particular needs and uses, and tailor collections, systems, and services, to lead toward the Qatari Knowledge Society.
  8. Welcoming Comments
     Dr. Rashid Alammari
     Dean, College of Engineering
     Qatar University, Doha, Qatar
     09:05 – 09:15
  9. Acknowledgment
     This workshop and presentation are due to partial support from a grant from the Qatar National Research Fund (QNRF) through its National Priority Research Program (NPRP), Number 4-029-1-007.
     09:15 – 09:15
  10. Participants Submit Completed QDL Surveys
  11. Overview of Digital Libraries
      Dr. Edward Fox
      Department of Computer Science
      Virginia Tech, Blacksburg, Virginia USA
      Project Lead Principal Investigator
      09:15 – 09:40
  12. Philosophy & Message
      • Collaboration: Local, National, Regional, Global
      • Empowerment: Uploading, Sharing, Open Access
      • Research: Computing, Digital libraries, Info. retrieval, …
      • Education: DL curriculum, Graduate: ETDs, Ugrad: Ensemble
  13. Outline
      • Acknowledgements
      • Digital Libraries
      • NDLTD (electronic theses / dissertations)
      • Digital Library Curriculum Project
      • Ensemble (Pathway in US NSDL)
      • Crisis, Tragedy & Recovery Network (CTRnet)
      • Saudi Digital Library - SDL
  14. Acknowledgements
      • Mentors (Licklider & Kessler 1967-71 MIT, Salton 1978-1983 Cornell)
      • QNRF, Qatar University, NSF and other sponsors
      • Students, colleagues, co-investigators
      • Virginia Tech: Computer Science, Digital Library Research Lab, Information Technology
      • Collaborators on local, national, and international projects
  15. DLs: Objectives in 1991
      • World Lit.: 24hr / 7day / from desktop
      • Integrated “super” information systems: 5S: Table of related areas and their coverage
      • Ubiquitous, Higher Quality, Lower Cost
      • Education, Knowledge Sharing, Discovery
      • Disintermediation -> Collaboration
      • Universities Reclaim Property
      • Interactive Courseware, Student Works
      • Scalable, Sustainable, Usable, Useful
  16. DL Overview: Why of Global Interest?
      • National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
      • Knowledge and information are essential to economic and technological growth, education
      • DL - a domain for international collaboration
        o wherein all can contribute and benefit
        o which leverages investment in networking
        o which provides useful content on Internet & WWW
        o which will tie nations and peoples together more strongly and through deeper understanding
  18. Libraries of the Future
      JCR Licklider, 1965, MIT Press
      Diagram labels: World, Nation, State, City, Community
  19. Information Life Cycle
      Diagram labels: Authoring, Modifying, Organizing, Indexing, Storing, Retrieving, Distributing, Networking, Retention / Mining, Accessing, Filtering, Using, Creating
  20. Digital Library Content
      Content types (diagram labels): Articles, Reports, Books; Text Documents; Speech, Music; Video; Audio; (Aerial) Photos; Geographic Information; Models; Simulations; Software, Programs; Genome (human, animal, plant); Bio Information; 2D, 3D, VR, CAT; Images and Graphics
  21. Content-Based Information Retrieval
  22. Digital Objects (DOs)
      • “Born digital”
        o Created digitally
      • Digitized version of “real” object
        o Is the DO version the same, better, or worse?
        o Suggestion for documents: structured + rendered
      • Surrogate for “real” object
        o Scanned versions
        o 3D models
  23. Institutional Repositories
      • “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.”
      • Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA
  24. Goals of Institutional Repositories
      (by Stevan Harnad, University of Southampton)
      • Self Archiving of Institutional Research
        o Theses and Dissertations (VTLS NDLTD Project)
        o Article preprints and post prints
        o Internal documents and map
      • Management of digital collections
      • Preservation of materials – decentralized approach
      • Housing of teaching materials
      • Electronic publishing of journals, books, posters, maps, audio, video and other multimedia objects
      Adapted from Slide by V. Chachra, VTLS
  25. NDLTD:
      • Networked Digital Library of Theses and Dissertations (NDLTD)
      • N D Ltd or “Noodle TD”
      • Vision: Every thesis and dissertation in the world is:
        o Devised to take advantage of the most helpful electronic publishing methods
        o Shared globally and easily found
        o Supported by a suite of digital library services to aid authors, researchers, learners, universities
        o Preserved and migrated permanently
  26. What are we doing?
      • Aiding universities and nations to enhance graduate education, publishing, preservation (data sets next!), and Intellectual Property Rights efforts
      • Helping improve the availability and content of (electronic) theses and dissertations (ETDs)
      • Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)
  27. Why ETD? Short Answer
      • For Students:
        o Gain knowledge and skills for the Information Age
        o Richer communication (digital information, multimedia, …)
      • For Universities:
        o Easy way to enter the digital library field and benefit thereby
      • For the World:
        o Global digital library – large, useful, many services
      • General:
        o Save time and money
        o Increased visibility for all associated with research results
  29. Curriculum Module Template
      1. Module name
      2. Scope
      3. Learning objectives
      4. 5S characteristics of the module (streams, structures, spaces, scenarios, society)
      5. Level of effort required (in-class and out-of-class time required for students)
      6. Relationships with other modules (flow between modules)
      7. Prerequisite knowledge/skills required (what the students need to know prior to beginning the module; completion optional; complete only if prerequisite knowledge/skills are not included in other modules)
      8. Introductory remedial instruction (the body of knowledge to be taught for the prerequisite knowledge/skills required; completion optional)
      9. Body of knowledge (theory + practice; an outline that could be used as the basis for class lectures)
      10. Resources (required readings for students; additional suggested readings for instructor and students)
      11. Exercises / Learning activities
      12. Evaluation of learning objective achievement (graded exercises or assignments)
      13. Glossary
      14. Additional useful links
      15. Contributors (authors of module, reviewers of module)
  30. DL Curriculum Framework
      Semester 1: DL collections: development/creation
      Semester 2: DL services and sustainability
      Diagram labels (COURSE STRUCTURE, CORE DL TOPICS, RELATED TOPICS), in extracted order:
      • Digitization, Storage, Interchange
      • Digital objects, Composites, Packages
      • Metadata, Cataloging, Author submission
      • Naming, Repositories, Archives
      • Spaces (conceptual, geographic, 2/3D, VR)
      • Architectures (agents, buses, wrappers/mediators), Interoperability
      • Services (searching, linking, browsing, etc.)
      • Intellectual property rights mgmt., Privacy, Protection (watermarking)
      • Archiving and preservation, Integrity
      • Documents, E-publishing, Markup
      • Info. needs, Relevance, Evaluation, Effectiveness
      • Thesauri, Ontologies, Classification, Categorization
      • Bibliographic information, Bibliometrics, Citations
      • Routing, Filtering, Community filtering
      • Search & search strategy, Info seeking behavior, User modeling, Feedback
      • Info summarization, Visualization
      • Multimedia streams/structures, Capture/representation, Compression/coding
      • Content-based analysis, Multimedia indexing
      • Multimedia presentation, rendering
  35. www.nsdl.org
  36. Ensemble, Pathway in NSDL
      • National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL
      • National Science Digital Library
      • Many projects, largest now called … Pathways
  37. NSDL Connects
      Users: students, educators, life-long learners
      Content: structured learning materials; large real-time or archived datasets; audio, images, animations; primary sources; digital learning objects (e.g. applets); interactive (virtual, remote) laboratories; ...
      Tools: search; refer; validate; integrate; create; customize; publish; share; notify; collaborate; ...
      This slide from Lee Zia
  38. Ensemble: www.computingportal.org
  39. Crisis, Tragedy, and Recovery (CTR)
      • Human tragedies that result from man-made and natural events affect humans and communities significantly.
      • During and after a tragic event, there are a series of needs that have to be addressed.
        o Compounded by communication failures and a confusing plethora of data and information
  40. CTR stakeholders
  42. Saudi Digital Library - about SDL
      • … wide spreading of scientific blocks or groupings
      • … linking between academic and research communities.
      • … supporting these scientific groupings at the national level, where it provides
        o sophisticated information services
        o … digital information resources in various forms,
        o … accessible to faculty staff, researchers and students
        o …
  43. Saudi Digital Library - about SDL
      • The digital library includes:
        o The largest gathering of e-books in the Arab world. More than 100,000 e-books in full text in various scientific specializations
        o More than 300 global publishers such as Elsevier, Springer, Pearson, Wiley, Taylor & Francis, McGraw-Hill, Yale University, Oxford University, Harvard University
  45. Saudi Digital Library (translated from Arabic)
      • Our era is marked by the spread of scientific blocs or groupings of all kinds that link academic and research communities. The Saudi Digital Library, part of the National Center for E-Learning and Distance Learning at the Ministry of Higher Education in the Kingdom of Saudi Arabia, is one of the most prominent forms of support for such scientific groupings at the national level. It provides advanced information services and makes digital information resources in their various forms available to faculty members, researchers, and graduate and undergraduate students at Saudi universities and other higher education institutions.
      • The Saudi Digital Library is the largest academic gathering of information resources in the Arab world. It holds more than 114,000 scholarly references covering all academic disciplines and updates this content continuously, which builds a large body of accumulated knowledge over the long term. The library has contracted with more than 300 international publishers. In 2010 it won the Arab Federation for Libraries and Information award for distinguished projects in the Arab world.
      • The library gives all Saudi universities a single umbrella under which it negotiates with publishers on legal and financial matters. Grouping together under one umbrella saves considerable money and effort and lets the universities obtain more benefits and rights from publishers.
  46. Project Aims / Non-Textual Content
      Dr. John Impagliazzo
      Emeritus, Computer Science Department
      Hofstra University, Hempstead, New York USA
      Project Consultant and Key Investigator
      09:40 – 09:50
      Additional Content Contributions by Carole Thompson and Robert Laws, Project Consultants
  47. Project Aims (1 of 5)
      Aim #1: Research and prototype digital library systems and infrastructure for Qatar, focusing initially on Qatari information related to government and scholarly activities.
      Aim #2: Research and build the digital library community in Qatar, supporting digital library use, services, collection development, tailored systems, and advancing toward a knowledge society.
  48. Project Aims (2 of 5)
      Regarding Aim 1:
      • Leverage Penn State’s SeerSuite software infrastructure
      • Implement novel advanced systems on the proposed equipment
      • Extend SeerSuite beyond its current focus on English to support Arabic-English collections and cross-language discovery
      • Extend the effort to cover a broad range of scholarly disciplines: computing, chemistry, …
      • Support all types of government information
  49. Project Aims (3 of 5)
      Regarding Aim 1 (continued):
      • Demonstrate how deep analysis of digital objects and collections provides superior capabilities beyond those in commercial systems
      • Obtain pages, reports, and other information from all branches of the government through websites as well as other databases and other accessible online venues
      • Collect information related to education and museums
      • Focus on automatic or semi-automatic collection development
      • Cover key aspects of Qatari information currently available
  50. Project Aims (4 of 5)
      Regarding Aim 2:
      • Study scholarly activities (by surveys and at your locations)
      • Identify particular needs and uses
      • Tailor DL content to specific domains and to the unique needs of Qatar
      • Establish a consulting center at the QDL Institute
      • Collaborate in efforts with libraries and museums in Qatar
      • Engage in community building in Qatar (join us!)
  51. Project Aims (5 of 5)
      Regarding Aim 2 (continued):
      • Tailor collections, systems, and services to lead toward the Qatari Knowledge Society
      • Extend work on social networks to collect and utilize data, allowing personalized as well as group and agency tailoring.
      • Include key communities such as citizens, educators, scholars, and students
      • Partner with the new digital librarians to add other collections, especially covering Qatari culture and heritage
      • Identify collections with key metadata
  52. The Need to Safeguard Culture
      Maulana Khan (2001), Doha Conference of Ulama on Islam and Cultural Heritage
      “Every group or community has its own particular culture and has the absolute right to safeguard that culture”
      Maulana Khan. Proceedings of the Doha Conference of ‘Ulama on Islam and Cultural Heritage. Doha, Qatar. December 30–31, 2001. p. 66. New York: UNESCO, 2005. Retrieved Nov. 15, 2010.
  53. Non-Textual Collections (1 of 2)
      • Initial focus of QDL on automatic or semi-automatic collection development
      • Project also includes supplemental collections consisting of non-automatic material.
      • Sampling would complete the spectrum of data collection useful in supporting research and the long-term interests of Qatar.
      • Project also seeks to develop a collection that demonstrates the various aspects of content preservation
      • Research effort seeks to explore non-text artifacts to:
        o Enhance the textual research and collection
        o Preserve the heritage and culture of the country and its people
        o Offer a basis for related research
  54. Non-Textual Collections (2 of 2)
      • Examples of the sampling include areas such as:
        o Qatari Art, Literature, Music
        o Qatari Sports, Politics
        o Qatari Education (historical and contemporary perspectives)
        o Qatari Museum Collections
      • Appropriate metadata would:
        o Document each item
        o Serve as examples of the varying types of description
        o Serve as a basis to build new digital libraries in Qatar
      • Sampling of some of these materials would:
        o Serve the people of Qatar
        o Become a model for other efforts
        o Demonstrate the potential for future research
  55. Non-Textual Examples (1 of 3)
      • Images of landmarks in Qatar
        o Buildings
        o Mosques
      • Oral Histories
        o Over 2000 CDs
        o Need to document them
        o Make them available for preservation
      • Literature
        o Ministry of Culture, Arts, and Heritage (MoC) making an effort
        o Al Malik
  56. Non-Textual Examples (2 of 3)
      • Qatari Art
        o Qatari Artist Directory
        o Names of Artists
        o Examples of their Work
      • Work of Dr. Wafa Al-Hamad
      • Exhibits now at QU, New Library Exhibition Hall
        o Need to make examples of these collections
      • Sports in Qatar
        o Document exhibits and activities of historical interest
        o Falconry
        o Camel Racing
        o Horse racing
      Talal Nayef Al Qasim
  57. Non-Textual Examples (3 of 3)
      • Music of Qatar
        o Qatari Musical Band Concert images (MoC)
        o Qatar Philharmonic Orchestra
        o Document the Development of Music in Qatar
        o Information from the QF Music Academy at Katara
      • Education
        o Qatar University (State University)
        o Hamad bin Khalifa University (Education City)
      • Media Collections
        o Al Jazeera
        o Newspapers
  58. QDL and Librarian / Corporate / Government Perspectives
      Myrna Tabet
      Library Services
      Qatar University, Doha, Qatar
      Project Research Associate
      09:50 – 10:00
  59. Qatar Digital Library (QDL)
      • Importance, Benefits, and Content
      • Preservation of Culture and History of Qatar
      • Scholarly, Governmental, Institutional, and Corporate Viewpoints
  60. Importance of the QDL Project (1 of 2)
      • Users have:
        o Become sophisticated and adept at using technology
        o Higher expectations of service provided by libraries
      • Role of (digital) librarians:
        o Changing as a consequence of the digital shift
        o New ways of providing information
  61. Importance of the QDL Project (2 of 2)
      • Digital libraries can:
        o Assist in the transformation of data into information
        o Help in building a knowledge-based society
      • Governments aim to:
        o Provide access to relevant information for their citizens
      • Nationwide digital library community can work together toward these goals
  62. DL Benefits to Library Users (1 of 2)
      • Vastly more information at your fingertips
      • Access 24/7, from anywhere, anytime
      • Rapidly updated: Current + Historical information
      • Information sharing and collaboration
  63. DL Benefits to Library Users (2 of 2)
      • New forms of access
        o Multilingual / multimedia
        o Hypermedia (Linked data, Text, Images, Audio, Video)
      • Improved preservation with:
        o Metadata
        o Information exchange protocols
  64. Significance to Librarians, Corporations, and Governmental Agencies (1 of 2)
      • The need to preserve cultural and historical heritage =>
        o Collections of fragile and precious artifacts =>
        o Libraries, museums, and archives developing digital collections =>
        o Users from all over the world accessing and studying
      • SeerSuite (crawler, search engine)
        o Collect from the Web and from curated collection of artifacts
        o Images, audio, and text for browsing and searching
        o Extractions of tables and references / citations
        o Machine learning and artificial intelligence (AI) – beyond commercial options
  65. Significance to Librarians, Corporations, and Governmental Agencies (2 of 2)
      • A one stop search of:
        o Information about Qatar
        o Information to preserve the culture of Qatar
      • Indexing, analysis, and retrieval of:
        o Resources, reports, statistics, and other types of information
        o Information in the Arabic language as well as in English
  66. Available Content (1 of 2)
      • Materials captured:
        o Local scholarly, cultural
        o Governmental documents
      • Metadata, data, and many types of documents (including full text)
      • Free and open as well:
        o Freely accessible for anyone to use
        o Available for authorized users due to:
          • Licenses
          • Cost issues
  67. Available Content (2 of 2)
      • Main resources:
        o First appeared in digital form
        o Often referred to as being ‘born’ digital
      • At a later stage the project will include:
        o Digital versions of material already existing in print
        o Multimedia (image, audio, video) forms
      • Surveys and studies will:
        o Guide collection strategies and priorities
        o Satisfy the needs of the Qatari community
  68. Selected Digital Library References
      • Lesk, M. (2005). Understanding digital libraries, 2nd ed. San Francisco, CA: Morgan Kaufmann.
      • Tedd, L. & Large, A. (2005). Digital libraries: Principles and practice in a global environment. München: K.G. Saur.
      • Witten, I., Bainbridge, D. & Nichols, D. (2010). How to build a digital library, 2nd ed. Burlington, MA: Morgan Kaufmann, Elsevier.
  69. QDL and Researcher Perspectives
      Hamed Alhoori
      Department of Computer Science & Engineering
      Texas A&M University, College Station, Texas USA
      Project Research Associate
      10:00 – 10:10
  70. Introduction
      Inadequacy of literature reviews (Boote, D.N., et al., 2005)
  71. Introduction – Objective
      Understand and support the dynamic information needs, information-seeking behavior, information use, and other scholarly activities of researchers, scientists, engineers, scholars and students in Qatar.
  72. Introduction - Research Questions
      • How do researchers currently search, select, and manage their information sources?
      • What difficulties are researchers facing during the literature review process?
      • How have social reference management and recommendation systems, used in scholarly communities, influenced the research process?
      • How can scientific impact be measured better for each discipline using multi-dimensional metrics?
      • What are the current scholarly research needs?
  73. Related Studies
      • New patterns of searching (Hallmark, J., 2004).
      • Difficulty locating information (George, C., 2006).
      • Not aware of or familiar with some of the services and do not consult librarians (Kuruppu, P.U., 2006).
      • Limitations
  74. Methodology
      • Qualitative and quantitative research methods
      • Statistical hypothesis testing techniques
  75. Initial Results (1)
      • (Alhoori, et al., 2011) - acceptance rate 9%
      • 164 researchers participated in the study
        o 25 faculty members, 5 postdocs, 84 doctoral students, 28 master's students, 22 undergraduate students.
        o 131 male and 33 female
        o Participants were from 13 different disciplines from Texas A&M University – College Station
  76. Initial Results (2)
      • Differences in reading habits
      • Difficulties locating their needs
      • Getting lost
      • Repeated results
      • Printing articles and using folders to organize
      • Notes
      • Research updates
      • Social reference management - lack of awareness, accuracy
  77. Initial Results (3)
      • Saving methods
      • Significant relationship between
        o Saving methods and collaboration
        o Saving methods and retrieving articles
      • Researchers’ satisfaction
      • Search differences
      • Research interests
      • Publication overload (78%)
  78. References
      • Boote, D.N., Beile, P. Scholars Before Researchers: On the Centrality of the Dissertation Literature Review in Research Preparation. Educational Researcher. 34, 3-15 (2005).
      • Hallmark, J. Access and Retrieval of Recent Journal Articles: A Comparative Study of Chemists and Geoscientists. Issues in Science and Technology Librarianship. 40 (2004).
      • George, C., Bright, A., Hurlbert, T., Linke, E.C., St. Clair, G., Stein, J. Scholarly use of information: graduate students’ information seeking behaviour. Information Research. 11, 1-19 (2006).
      • Kuruppu, P.U., Gruber, A.M. Understanding the Information Needs of Academic Scholars in Agricultural and Biological Sciences. The Journal of Academic Librarianship. 32, 609-623 (2006).
      • Alhoori, H., Furuta, R. “Understanding the dynamic scholarly research needs and behavior as applied to social reference management.” International Conference on Theory and Practice of Digital Libraries, TPDL 2011.
  80. BREAK
      10:10 – 10:40
  81. Web Archiving, Crawling, and SeerSuite
      Dr. Edward Fox and Tarek Kanan
      Department of Computer Science
      Virginia Tech, Blacksburg, Virginia USA
      Project Lead Principal Investigator; Project Research Associate
      10:40 – 11:15
  82. Web Archiving and Web Crawling
      Dr. Edward Fox
      Department of Computer Science
      Virginia Tech, Blacksburg, Virginia USA
      Project Lead Principal Investigator
      10:40 – 11:05
  83. Archiving by Assembling Content + Metadata
      • Collect digital objects
        o Digitize / purchase / obtain submissions
        o Catalog each => metadata record
      • Aggregate into a metadata catalog
        o Searchable
        o Browsable
        o Usually free of intellectual property rights concerns
        o Can be shared through the Open Archives Initiative (OAI)
  84. OAI = Technical Umbrella for Practical Interoperability
      … that can be exploited by different communities: Reference Libraries, Publishers, E-Print Archives, Museums
  85. The World According to OAI
      • Service Providers: Discovery, Current Awareness, Preservation
      • Data Providers
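A minimal sketch of how a service provider might ask a data provider for its metadata records, assuming the OAI Protocol for Metadata Harvesting (OAI-PMH), which the slides do not name explicitly. The `ListRecords` verb and the `oai_dc` (Dublin Core) metadata prefix come from that protocol; the repository address and the helper name `build_list_records_url` are hypothetical and used only for illustration.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for one data provider."""
    # 'verb' and 'metadataPrefix' are required by the protocol for ListRecords.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional: restrict to one collection
    if from_date:
        params["from"] = from_date    # optional: only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical data provider endpoint (an assumption, not a real repository).
url = build_list_records_url("https://repository.example.org/oai", from_date="2013-05-20")
print(url)
```

A service provider would fetch this URL, parse the returned XML records, and merge them into its own catalog for discovery or current-awareness services.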
  86. Web Archiving
      • Introduction: Web archiving is the process of
        o gathering up data recorded on the World Wide Web,
        o storing it,
        o ensuring the data is preserved in an archive, and
        o making the collected data available for future research.
      • The Internet Archive and several national libraries initiated Web archiving practices in 1996.
  88. Web Archiving
      • 2001: International Web Archiving Workshop (IWAW):
        o share experiences and exchange ideas.
      • 2003: International Internet Preservation Consortium (IIPC):
        o international collaboration in developing standards and open source tools for the creation of Web archives.
      • Tools: Heritrix, Memento, SiteStory
      • Web growth =>
        o concern with change and loss =>
        o local and national Web archiving initiatives
  89. Web Crawlers
      • A Web crawler is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing.
      • A Web crawler also may be called a Web spider, an ant, or an automatic indexer.
      • Web search engines and some other sites use Web crawling or spidering software to update their Web content or indexes of other sites’ Web content.
      • Web crawlers can copy all the pages they visit for later processing by a search engine that indexes the downloaded pages so that users can search them much more quickly.
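The last point above, indexing downloaded pages so that searches are fast, can be illustrated with an inverted index. This is only a sketch under simple assumptions (whitespace tokenization, lowercase matching, all query terms required); the page texts, URLs, and function names are made up for the example and do not describe how SeerSuite or any search engine actually indexes content.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each term to the set of page URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical crawled pages (URL -> extracted text).
pages = {
    "http://example.org/a": "Qatar digital library workshop",
    "http://example.org/b": "digital archive of newspapers",
    "http://example.org/c": "library of the future",
}
index = build_inverted_index(pages)
print(search(index, "digital library"))
```

Looking up a term in the index avoids rescanning every stored page for each query, which is why indexing makes search quicker.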
  90. Web Crawler
      • A Web crawler starts with a list of URLs to visit, called the seeds.
      • On those pages, it:
        o identifies all the hyperlinks
        o adds them to the list of URLs to visit
        o recursively visits the pages pointed to
        o according to a set of policies.
      • It prioritizes its downloads, since some pages change often.
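The crawl loop described on this slide can be sketched as follows. This is an illustration, not SeerSuite's or Heritrix's crawler: it uses an iterative first-in, first-out queue instead of recursion, its only policy is to stay on one host and stop after a page limit, and it does not prioritize downloads. The `FAKE_WEB` dictionary stands in for real HTTP fetching so the example runs offline; all URLs and names are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, allowed_host, max_pages=10):
    """Visit pages breadth-first from the seeds; return URLs in visit order."""
    frontier = deque(seeds)   # the list of URLs still to visit
    seen = set(seeds)         # URLs already queued, to avoid repeats
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:      # page could not be fetched
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)              # resolve relative links
            if urlparse(link).netloc != allowed_host:
                continue                           # policy: stay on one host
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# A tiny in-memory "web" used instead of network access (an assumption).
FAKE_WEB = {
    "http://qdl.example/": '<a href="/reports">Reports</a> <a href="/news">News</a>',
    "http://qdl.example/reports": '<a href="/">Home</a> <a href="http://other.example/x">External</a>',
    "http://qdl.example/news": '<a href="/reports">Reports</a>',
}

order = crawl(["http://qdl.example/"], FAKE_WEB.get, "qdl.example")
print(order)
```

Replacing `FAKE_WEB.get` with a function that downloads pages, and the queue with a priority queue keyed on how often a page changes, would bring the sketch closer to the prioritized crawling the slide mentions.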
  91. Web Crawlers: Difficulties and Limitations
      • Technical challenges of Web archiving
      • Intellectual property laws.
      • Peter Lyman states that "although the Web is popularly regarded as a public domain resource, it is copyrighted; thus, archivists have no legal right to copy the Web".
      • However, national libraries in many countries have a legal right to copy portions of the Web under an extension of a requirement for legal deposit.
  92. Web Crawlers: Difficulties and Limitations
      • Removal requests: implemented by WebCite, the Internet Archive, or Internet Memory
      • Other Web archives are only accessible from certain locations or have regulated usage.
      • WebCite cites a recent lawsuit against Google's caching, which Google won.
  93. Focused Crawlers
      • For a particular topic or event, to build a Web collection focused in that area
      • Start with URLs of interest, viewed as seeds to grow from
      • Expand in a ‘smart’ way to get all and only what is relevant
      • Use information retrieval / artificial intelligence / machine learning
        o Require ‘knowledge bases’ and/or human training examples
      • Nevertheless, there is a tradeoff between the resulting
        o Recall (i.e., coverage of what is out there)
        o Precision (i.e., freedom from noise in what is collected)
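The recall and precision tradeoff named on this slide can be made concrete with the standard set-based definitions: precision is the share of collected pages that are relevant, and recall is the share of relevant pages that were collected. The page identifiers below are invented for illustration and are not results from the project.

```python
def precision_recall(collected, relevant):
    """Return (precision, recall) for a crawl, given sets of page identifiers."""
    collected, relevant = set(collected), set(relevant)
    hits = collected & relevant   # relevant pages the crawler actually collected
    precision = len(hits) / len(collected) if collected else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical crawl: 4 pages collected, 5 pages truly relevant, 2 in common.
collected = {"p1", "p2", "p3", "p4"}
relevant = {"p2", "p3", "p5", "p6", "p7"}
precision, recall = precision_recall(collected, relevant)
print(precision, recall)
```

A crawler that expands more aggressively tends to raise recall while admitting more noise, which lowers precision; a stricter relevance filter tends to do the opposite.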
  94. 94. SeerSuite: Tarek Kanan, Department of Computer Science, Virginia Tech, Blacksburg, Virginia, USA; Project Research Associate. Monday, 20 May 2013, 11:05 – 11:15
  95. 95. SeerSuite (1 of 4) • Prof. C. Lee Giles, QDL Principal Investigator o Created CiteSeer in 1997 o Associates: Steve Lawrence and Kurt Bollacker at the NEC Research Institute (now NEC Labs) in Princeton, New Jersey, USA o Now at Penn State University, he continues to lead SeerSuite into the “Next Generation”
  96. 96. SeerSuite (2 of 4) • What is SeerSuite? • SeerSuite built by… • SeerSuite supports…
  97. 97. SeerSuite (3 of 4) • SeerSuite was designed to provide a framework that would replace CiteSeer • SeerSuite improves on aspects of the original CiteSeer, with features such as: o Reliability o Robustness o Scalability
  98. 98. SeerSuite (4 of 4) • The motivation behind SeerSuite • SeerSuite design • SeerSuite enables access to: o Extensive document collections o Citations o Author metadata
  99. 99. SeerSuite - CiteSeerX • CiteSeerX shares several components with digital libraries and search engines o Web Interface o Crawlers o Index o Databases
  100. 100. SeerSuite - CiteSeerX • Domain-specific repositories and digital library systems o arXiv for physics o RePEc for economics o Greenstone • CiteSeerX o Closest in design to Google Scholar
  104. 104. SeerSuite - MyCiteSeerX • SeerSuite improves users’ information access. • MyCiteSeerX roles…. • MyCiteSeerX allows users to: o Store queries o Store document portfolios o Tag documents o Monitor and track documents of interest • MyCiteSeerX needs: o Registration o Login
  106. 106. SeerSuite - TableSeer • Tables contain important data • TableSeer analyzes, extracts, and indexes data from tables
  108. 108. QDL Website: Asad Nafees*, Technology Services, Qatar University, Doha, Qatar; Project Research Associate. Monday, 20 May 2013, 11:15 – 11:25. * Presented by Hamed Alhoori
  109. 109. QDL Website
  110. 110. Site Objectives • Provide in-depth information on everything related to the “QDL Project”. This includes objectives, progress, data sets, presentation slides, and interim reports. • Raise awareness and provide material on concepts related to digital libraries, associated systems, and processes. • Create an online identity for the proposed institute with information on activities and opportunities to participate. • Collect feedback and data through surveys and opinion polls.
  111. 111. Primary Audience • Groups of visitors making up the project’s primary audience will be the main focus of our site: o Librarians and libraries in Qatar o Researchers and academics o Government organizations o Non-governmental organizations (such as …)
  112. 112. Secondary Audience • Groups of visitors making up the secondary audience; these are important but not critical: o University / School Students o Teachers / Faculty o Managers o Qatari citizens o Other stakeholders
  113. 113. Content (Published or Under Development) • Information and details about the project. • A list of digital libraries currently available in Qatar. • Information on the digital library systems available so they can set up collections for their end users. • Searchable database of example digital libraries around the world, in English, Arabic, or other languages, filtered by topic or other criteria. • Lessons and tutorials on how to define and create collections.
  114. 114. Content (Published or Under Development) • A ‘Try it now’ function that would allow an end user to go through the steps of setting up a digital library, creating collections, and uploading electronic artifacts. • An online forum for librarians, allowing for an open discussion of issues and ideas that can be used to help organize collections and artifacts. • Information page on how placing content in digital library research archives can help researchers • increase their visibility around the world, • put them in contact with other researchers and collaborators, and • improve their funding opportunities. • Examples and links to peer sites in other countries where government information is successfully archived and accessed from a digital library.
  115. 115. The Survey • We have a survey. • Responses to this survey will help us tailor our efforts to the resources and needs of Qatar. • Provide your contact information; we will keep you informed about the project’s progress over time and about the workshops that are offered. • The survey is available in English and Arabic.
  116. 116. Join the Mailing List • Keep up to date with information about our project by joining our mailing list. • Submit your email and we’ll keep you informed about our new initiatives, upcoming events, seminars, and news.
  117. 117. Audience Participation: Dr. John Impagliazzo, Emeritus, Computer Science Department, Hofstra University, Hempstead, New York, USA; Project Consultant and Key Investigator. Monday, 20 May 2013, 11:25 – 11:45
  118. 118. Audience Participation: Constituent Interest in QDL Project. Monday, 20 May 2013, 11:25 – 11:35
  119. 119. Audience Participation: Q & A Session. Monday, 20 May 2013, 11:35 – 11:45
  120. 120. Global Perspective and QDL Next Steps: Dr. Edward Fox, Department of Computer Science, Virginia Tech, Blacksburg, Virginia, USA; Project Lead Principal Investigator. Monday, 20 May 2013, 11:45 – 11:55
  121. 121. World Digital Library • Free and growing Internet-accessible collection of significant cultural treasures from many countries and cultures • Photographs, books, manuscripts, maps, audio recordings, and films • Searchable in 7 languages: Arabic, English, French, Russian, Chinese, Spanish, and Portuguese • An easy-to-use interface enables: o browsing by place, time, topic, type of item, and contributing institution, or o open-ended search.
  122. 122. World Digital Library and Qatar National Library • The World Digital Library is intended for use by students, scholars, and members of the public. • It is a UNESCO program led by the US Library of Congress in partnership with libraries all over the world. • The Qatar Foundation’s Qatar National Library is one of a number of member libraries from Arabic/Islamic countries that contribute content. • The Qatar Foundation is a major financial sponsor of the World Digital Library.
  123. 123. World Digital Library and British Library • Qatar Foundation, Qatar National Library o Partnership with the British Library to make Arabic science and Gulf history available for worldwide research • Began in July 2012 o to digitize more than half a million pages of historic documents detailing Arab history and culture.
  124. 124. World Digital Library and British Library • The three-year project aims o to transform people’s understanding of the history of the Middle East o using material from the UK’s India Office archive and o medieval Arabic manuscripts on science and medicine. • The work takes place at the British Library, in close cooperation with the Qatar National Library.
  125. 125. QDL Focus. Community in Qatar: • Identify interested stakeholders, to tailor the project to their needs • Train the next generation of digital librarians, archivists, and curators • Partners helping with additional collection development. Advanced Technology for Enhanced Access: • “Low hanging fruit” by crawling the Qatar-related Web • Improved analysis (citations, tables, chemicals, …) • Support for both Arabic and English
  126. 126. Closing Remarks: Dr. Mohammed Samaka, College of Engineering, Qatar University, Doha, Qatar; Project Co-Lead Principal Investigator. Monday, 20 May 2013, 11:55 – 12:00
  127. 127. Submit Completed Workshop Evaluations. Monday, 20 May 2013, 12:00 – 12:00
  128. 128. Lunch. Monday, 20 May 2013, 12:00 – 13:00
  129. 129. Optional Post-workshop Hands-on Session, Room 110. Monday, 20 May 2013, 13:00 – 14:00