Digital libraries with superimposed information - Ph.D. Defense

Slides from my Ph.D. defense that took place on Jan 28, 2011.

Published in: Technology
  • Species identification, analyzing paintings, studying architecture styles, analyzing medical images, etc.
  • Focus on infrastructure to work with marks
  • Results to date: case studies; superimposed applications; SuperIDR; SuperIDR evaluation (longitudinal and classroom-based); metamodel. These answered what this DL contains, how it might be realized, how it compares with traditional methods of doing a task, and, to some extent, how subimages/SI are used in scholarly tasks. Not yet answered: how does an SI-DL support the use of subimages in scholarly tasks? This is an opportunity to analyze the use of subimages in scholarly tasks more deeply.
  • Collect qualitative data, in multiple ways and from multiple sources, on subimage and SuperIDR use in fish ID. Recruit people with interest in fish ID. Have a longer duration of use in a natural setting and in targeted tasks. Have them use SuperIDR on their own (data on use in the wild) and in targeted tasks (opportunity to observe use), so we have data on use that relates to task execution. Study setup: Skype, interviews, etc.
  • 3-week-long study. Setup, pre-study interview (for background information and species ID practices), and training. First 2 weeks: use (diaries) and tasks. 3rd week: use (diaries). Post-study interview on subimage use in species ID and on SuperIDR support of subimage use.
  • Participants: P1, P2, P3, P4, P5, and P6. Undergraduates (P2 and P6): recently took Ichthyology, freshwater species knowledge relatively fresh in mind, transitioning from referring to several sources to internalizing species ID knowledge, did species ID in the classroom, and assisted senior students on 1-2 field projects. Master's students (P1 and P5; recently took Ichthyology): work with a few species, have just started on research projects, and generally use memory or refer to a few books, websites, etc. PhD students (P3 and P4): have many years of experience, have done a lot of species ID in the field and lab, work on select species, and have almost internalized the species ID process. They still need to refer to information for fishes outside the ones they work on, have developed their own styles of species ID, and for the most part use these references to confirm fish identification.
  • Morphological description (shape, pattern, texture), size, color, presence, counts, location, morphological comparisons, multiple parts description, connections/relationships, comparison with other information objects, the information object as a whole (not about parts), and combinations of the aforementioned types
  • Use of subimages is necessary in fish identification. Fish identification activities: learning and identifying species. Learning methods vary, such as notecards, textbooks, identification keys, notes, printed lists, lists of images in digital documents, websites, etc., with a focus on location, habitat, the species' general physical appearance, and distinguishing characteristics/subimages. Species identification is typically a top-down approach: family, genus, species. Distinguishing characteristics/subimages are used at the genus/species level, usually to compare and contrast among very similar species, eliminating choices and then arriving at the species. Fish are typically identified in the field (except while taking a class, where a lot of identification is done in class using jarred specimens), and they need to be identified quickly in order to release them alive (another reason for using distinguishing characteristics). Species vary in appearance, but some characteristics, such as black lines or markings, are preserved, so those might be used in identifying a fish.
  • Used from 3:20 hours to 7:15 hours, across task and non-task sessions. Identification of species using the top-down approach described earlier. Combined search, complex objects as queries. Manual analysis of images is necessary. SuperIDR feedback: it brings together tools to access information and supported subimage use well for learning about a species, since there is a lot of information to browse and learn.
  • Subdocuments. Preserve context. Support multiple ways to describe, organize, access, retrieve, use, and re-use subdocuments and associated information. Support manual as well as automatic ways to work with and process information.
  • Improve CBIR on subimages and combined search: combined query and search, descriptors for this application, treating subimages separately from whole images, transfer learning, and leveraging knowledge of types of subimages/annotations to improve search. Leverage existing collections to study applicability in other domains (Flickr group photo notes). Crowdsourcing social media to study SI use in a social network context and the WWW: how do people use others' tags on photos, others' notes on images, and others' annotations on documents (Kindle books)? What activities do they use it for? Does SI and its use help or impact services (search, etc.)? Participatory SI-DL: when personal and institutional DLs come together, how is SI modeled, considering multiple users, institutions, and uses? How can people share information and services in a reusable and interoperable manner in this participatory DL? What are the dynamics of users and uses in such a DL? Comparison of forms of subdocuments and associated information: Marshall's study of annotations and types, Winget's study of annotations on structured data (musical scores), and subimage/annotation types.
  • Digital libraries with superimposed information - Ph.D. Defense
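The notes above refer to marks, subimages, and preserving context. As an illustration only, here is a minimal Python sketch of how a superimposed layer might address regions of base images. It is not SuperIDR's implementation: the class names (`Mark`, `Annotation`, `SuperimposedLayer`), the rectangular addressing scheme, the author labels, and the file name are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Mark:
    """Hypothetical address of a subimage: a rectangle inside a base image."""
    base_id: str  # identifier of the base (existing) image
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        """Return True if pixel (px, py) lies inside the marked rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Annotation:
    """Superimposed (new) information attached to a mark."""
    mark: Mark
    text: str
    author: str


@dataclass
class SuperimposedLayer:
    """Annotations stored apart from the base images they point into."""
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, mark: Mark, text: str, author: str) -> Annotation:
        annotation = Annotation(mark, text, author)
        self.annotations.append(annotation)
        return annotation

    def for_base(self, base_id: str) -> List[Annotation]:
        """All annotations on one base image, so a subimage keeps its context."""
        return [a for a in self.annotations if a.mark.base_id == base_id]

    def at_point(self, base_id: str, px: int, py: int) -> List[Annotation]:
        """Annotations whose mark covers a given pixel of a base image."""
        return [a for a in self.for_base(base_id) if a.mark.contains(px, py)]


# Hypothetical data: two marks on the same (made-up) base image.
layer = SuperimposedLayer()
layer.add(Mark("darter_01.jpg", 120, 40, 80, 30),
          "orange hue on edge of dorsal fin", "reader_a")
layer.add(Mark("darter_01.jpg", 10, 60, 50, 50),
          "dark spots near the head", "reader_b")
print([a.text for a in layer.at_point("darter_01.jpg", 150, 50)])
```

Keeping the mark separate from the annotation text is one way to leave the base image untouched while still letting a subimage be traced back to the image it came from.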

    1. Digital Libraries with Superimposed Information: Supporting Scholarly Tasks that Involve Fine-grain Information
        • Uma Murthy
        • PhD Defense
        • 28 January 2011
    2. Acknowledgments
        • My family
        • Dr. Edward Fox, Dr. Manuel Pérez-Quiñones, Dr. Ricardo Torres, Dr. Lois Delcambre, Dr. Naren Ramakrishnan, Dr. Eric Hallerman, Lin Tzy Li, Dr. Marcos Goncalves, Yinlin Chen, Nadia Kozievitch, Evandro Ramos, Tiago Falcao, Kapil Ahuja, Dr. John Pitrelli, Dr. Ganesh Ramaswamy, Dr. Andrea Kavanaugh, Dr. Lillian Cassel, Dr. Deborah Tatar, Dr. Donald Orth, Seungwon Yang, Lokeya Venkatachalam, Seonho Kim, Doug Gorton, Ricardo Quintana-Castillo, Monika Akbar, Dave Archer, Susan Price, Rao Shen, Srinivas Vemuri, Xiaoyan Yang, Yonca Haciahmetoglu, Pardha Pyla, Manas Tungare, Sameer Ahuja, Ben Hanrahan, Laurian Vega, Stacy Branham, Tejinder Judge, Rhonda Phillips, Ramya Ravichandar, Hari Pyla, Manjula Iyer, Dr. Noel Greis, Dr. Jack Olin, Venkat Srinivasan, …
        • NSF grants (Superimposed information, Digital Government, DL curriculum, CTR, ECDL), Microsoft tablet PC grant, CS department, and Graduate school
    3. Motivation: many scholarly tasks involve working with subdocuments
    4. Problems
        • Information is heterogeneous, voluminous, distributed across locations, and it is challenging to manage, organize, access, retrieve, and use.
        • Tools/methods (including paper-based and digital) are not well-integrated.
        • Ineffective and inefficient task execution
    5. A digital library = repository of collections and metadata + services
    6. Scenario
        • Find me species that are darters that have a dorsal fin that looks like this, which is connected to another dorsal fin that looks like this, which might have an orange hue on its edge
        • Search for subdocuments, in context of other information, incl. other subdocuments
        • Use it in another task/context
    7. Superimposed information enables working with contextualized subdocuments
        • superimposed (new) information
        • marks
        • base (existing) information
    8. Hypothesis
        • A digital library with superimposed information (SI-DL) provides enhanced support to scholarly tasks that involve working with subdocuments
        • DL + SI provides enhanced support to scholarly tasks with subdocuments
    9. Research questions
    10. Research approach
    11. Research approach - theory
    12. Research approach - practical/user
    13. Review of work done and results
    14. Review of work done and results
    15. Review of work done and results
    16. Review of work done and results
    17. Review of work done and results
    18. Review of work done and results
    19. Review of work done and results
    20. Review of work done and results
    21. Subimage and SuperIDR use – a qualitative study
        • How do people use subimages in fish identification and how does SuperIDR support that use?
        • SuperIDR support for working with subimages in fish identification
        • Contexts and strategies of working with subimages in fish identification
        • Characteristics of subimages and related information
    22. Rationale: Maximize Use of SuperIDR
        • Recruit people with interest in fish ID
        • Have a longer duration of use in natural setting and in targeted tasks
        • Have them use SuperIDR on their own (data on use in the wild) and in targeted tasks (opportunity to observe use)
        • Collect qualitative data, in multiple ways and from multiple sources, on subimage and SuperIDR use in fish ID
    23. Study setup
    24. Study procedures
        • Data collected
        • Interview responses
        • Diary entries
        • Log data of SuperIDR use
        • Screen captures of task execution
        • Spoken thoughts during task execution
        • Species id materials
        • Database image
        • Species id responses
    25. Participants: 3 groups
        • Analyzed participants based on fisheries and fish identification experience, current projects and fish identification practices
        • P2 (male), P5 (female), P6 (male): Relatively less experienced, undergraduates (UG) or recent UG
        • P1 (male), P5 (female): Moderately experienced Master’s students, working on theses and/or teaching/research
        • P3 (male), P4 (female): Highly experienced PhD students, working on research projects
    26. Subimage/annotation characteristics
        • 940 subimages, annotations, most focusing on part of the fish (image)
    27. Morphological description, size, color, presence, counts, location
        • Color
        • Location
    28. Co-presence, morphological comparisons, multiple parts description, connections/relationships, comparison with other information-objects
        • Comparison with other information objects
        • Connections/relationships
    29. Information object as a whole, combination of types
        • Information object as a whole
        • Combination of color and count
    30. Strategies and contexts that suggest subimage use in fish identification
        • In learning methods
        • In identification (top-down approach, compare similar species)
        • To help identify fishes quickly (identify in field versus the lab or the classroom)
        • In fishes of the same species (to deal with variability in appearance)
        • To verify species using manual inspection
    31. Subimage use in SuperIDR
        • Marking and annotating subimages (940 subimages and annotations)
        • Browsing through subimages in species description, subimages in comparison, subimages in search results
        • Text, image, and combined search, complex objects as queries
    32. Subimages in species learning methods
    33. Manually inspecting subimages while comparing similar species
    34. Complex object as a query
    35. “It [SuperIDR] is pulling together different ways of getting to information ... So, not only do I have a taxonomy [and] dichotomous key, but it is also supported by images, many images that I have loaded in myself, that I can compare and contrast right there in the program [SuperIDR]. I can annotate the images, so I know that I kind of looking somewhat into their future [use]. And it kind of just pulls all those tools together, more so than [pulling together] information. It gives me many ways of accessing the same information. The more ways you can come to that information, the better [it is]. Because it is always going to make you more confident about the decision that you are making.” [P1 interview]
        • SI-DL
        • Context
        • “It depends on how distinct that species [is] and how many other species are similar to that species, I guess … I would never trust the result, I guess, 100% … you know, based on just one picture and a little bit of written text. I would always want to pull up other species that are somewhat similar and just do a visual inspection myself to be sure that it just was not some bad [query] image that I used or a bad search term.” [P3 interview]
        • “... It would not work if you said that this fish has dark spots. You know you get hundreds of species with dark spots. But, if you got down to a few species and you need to know how many they have ...” [P1 interview]
        • Manually working with information
    36. Guidelines for design of an SI-DL
    37. Conclusions
        • Working with subdocuments is important and necessary in many scholarly tasks
        • An SI-DL provides enhanced support to such scholarly tasks
        • Treating subdocuments as first-class objects facilitates management, access, retrieval, and use of subdocuments and associated information
        • Contributions
            • Superimposed applications
            • SI-DL definition (metamodel) and prototype (SuperIDR)
            • Findings from user studies on use of SI in scholarly tasks
            • Insights about subimage use in species identification
            • Guidelines for SI-DL design
            • Datasets (images, subimages, annotations)*
    38. Future work
        • Improved CBIR of subimages and improved combined search (e.g. transfer learning)
        • Leverage existing collections to study applicability in other domains
        • Crowdsourcing social media to study SI use in a social network context and the
        • Participatory SI-DL, when personal and institutional DLs come together
        • Comparison of various forms and functions of subdocuments and associated
    39. Contributions and publications
    40. Publications related to this research
        • Published
            • SuperIDR: A Tablet PC Tool for Image Description and Retrieval (WIPTE, 2010)
            • A Teaching Tool for Parasitology: Enhancing Learning with Annotation and Image Retrieval (ECDL, 2010)
            • Superimposed image description and retrieval for fish species identification (ECDL 2009)
            • Species identification: fish images with cbir and annotations (JCDL poster, 2009)
            • Superimposed information architecture for digital libraries (ECDL, 2008)
            • From concepts to implementation and visualization: tools from a team-based approach to IR (SIGIR demo, 2008)
            • Further development of a digital library curriculum: Evaluation approaches and new tools (ICADL, 2007)
            • A superimposed information-supported digital library (JCDL doctoral consortium, 2007)
            • Extending the 5S digital library (DL) framework: From a minimal DL towards a DL reference model (DLF workshop, JCDL, 2007)
            • Enhancing concept mapping tools below and above to facilitate the use of superimposed information (CMC, 2006)
            • Sierra - a superimposed application for enhanced image description and retrieval (ECDL demo, 2006)
            • Using superimposed and context information to find and re-find sub-documents (PIM, 2006)
            • SIMPEL: a superimposed multimedia presentation editor and player (JCDL demo, 2006)
        • Planned
            • A qualitative study on the use of subimages and of SuperIDR – a prototype digital library with superimposed information – in fish species identification (JCDL, 2011)
            • Extending the 5s framework to provide support for cbir, complex objects, and superimposed information (journal paper)
    41. Other published work
        • Pedagogical Enhancements to a Course on Information Retrieval (TLIR, 2011)
        • Sustainability of Bits, not just Atoms (CHI sustainability workshop, 2010)
        • Using an iPhone Application for Diversity Recruitment (ASEE-SE, 2009)
        • Building an ontology for crisis, tragedy and recovery (NKOS 2009)
        • Curatorial Work and Learning in Virtual Environments: A Virtual World Project to Support the NDIIPP Community (JCDL Digital Preservation workshop, 2009)
        • A Methodology and Tool Suite for Evaluation of Accuracy of Interoperating Statistical Natural Language Processing Engines (Interspeech 2008)
        • VizBlog: a discovery tool for the blogosphere. (DigGov 2007)
        • Re-finding from a Human Information Processing Perspective (PIM 2006)
    42. Thank you
        • ?
    43. Back up slides
    44. Photo attributions (Flickr)
        • A digital library by HacksHaven
        • Art History With Chris And Mac 6/9: Manet: Lecture (Mme Manet and Leon) by moonflowerdragon
        • Korean music by Homies In Heaven
        • Old annotations by Lorianne DiSabato
        • Reading Annotation by Rosa Say
    45. SuperIDR architecture
    46. Species learning methods
        • Variability in fishes of same species
    47. Summary of findings of qualitative study
        • 13 types of subimages/annotations from 940 subimages/annotations
        • Subimages are important and necessary in fish identification
        • Identification in a top-down way
        • Learning using multiple methods
        • Context is important
        • Combined search and using a complex object as a query
        • SI-DL: bringing together capabilities
    48. Morphological comparison
    49. Participatory SI-DL [Marchionini, 2010]
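
Slides 31 and 34 mention text, image, and combined search over subimages. The following Python sketch shows one common way such a combined search could be scored: a weighted sum of a text-match score and an image-similarity score. This is a generic illustration, not the method SuperIDR uses; the function names, the Jaccard text score, the cosine similarity over made-up feature vectors, and the `text_weight` parameter are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def text_score(query: str, annotation: str) -> float:
    """Jaccard overlap between the query words and the annotation words."""
    q = set(query.lower().split())
    a = set(annotation.lower().split())
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_search(query_text: str,
                    query_features: Sequence[float],
                    subimages: List[Dict],
                    text_weight: float = 0.5) -> List[Tuple[float, str]]:
    """Rank subimages by a weighted sum of text and image similarity."""
    ranked = []
    for item in subimages:
        score = (text_weight * text_score(query_text, item["annotation"])
                 + (1 - text_weight) * cosine(query_features, item["features"]))
        ranked.append((score, item["id"]))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Hypothetical subimages with toy two-dimensional feature vectors.
subimages = [
    {"id": "a", "annotation": "orange edge on dorsal fin", "features": [1.0, 0.0]},
    {"id": "b", "annotation": "dark spots on tail", "features": [0.0, 1.0]},
]
results = combined_search("orange dorsal fin", [1.0, 0.0], subimages)
print(results[0][1])
```

The weight lets a caller lean toward the annotation text or toward visual similarity; manual inspection of the ranked subimages, which the study participants relied on, would still follow.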
