Digital libraries with superimposed information - Ph.D. Defense
Slides from my Ph.D. defense that took place on Jan 28, 2011.


Published in: Technology

  • Species identification, analyzing paintings, studying architecture styles, analyzing medical images, etc.
  • Focus on infrastructure to work with marks
  • Results to date: case studies; superimposed applications; SuperIDR; SuperIDR evaluation (longitudinal and classroom-based); metamodel. Was able to answer questions about what this DL contains, how it might be realized, how it compares with traditional methods of doing a task, and, to some extent, how subimages/SI are used in scholarly tasks. Not yet answered: how does an SI-DL support the use of subimages in scholarly tasks? Opportunity to analyze the use of subimages in scholarly tasks more deeply.
  • Collect qualitative data, in multiple ways and from multiple sources, on subimage and SuperIDR use in fish ID. Recruit people with an interest in fish ID. Have a longer duration of use, in a natural setting and in targeted tasks. Have them use SuperIDR on their own (data on use in the wild) and in targeted tasks (opportunity to observe use), so we have data on use that relates to task execution. Study setup: Skype, interviews, etc.
  • 3-week-long study. Setup, pre-study interview (for background information and species id practices), and training. First 2 weeks: use (diaries) and tasks. 3rd week: use (diaries). Post-study interview on subimage use in species id and SuperIDR support of subimage use.
  • P1, P2, P3, P4, P5, and P6. Undergraduates (P2 and P6): recently took Ichthyology, so freshwater species knowledge is relatively fresh in mind; transitioning from referring to several sources to internalizing species id knowledge; species id done in the classroom; assisted senior students in 1-2 field projects. Master’s students (P1 and P5, who recently took Ichthyology): work with a few species, just started on research projects, generally use memory or refer to a few books, websites, etc. PhD students (P3 and P4): have many years of experience, have done a lot of species id in the field and lab, work on select species, and have almost internalized the species id process. They still need to refer to information for fishes outside the ones that they work with, and have developed their own styles of species id. For the most part, they use these references to confirm fish identification.
  • Morphological description (shape, pattern, texture), size, color, presence, counts, location, morphological comparisons, multiple parts description, connections/relationships, comparison with other information objects, the information object as a whole (not about parts), combination of the aforementioned types
  • Use of subimages is necessary in fish identification. Fish identification activities: learning and identifying species. Learning methods vary, such as notecards, textbooks, identification keys, notes, printed lists, lists of images in digital documents, websites, etc. The focus is on location, habitat, the species' general physical appearance, and distinguishing characteristics (subimages). Species identification is typically a top-down approach: family, genus, species. Distinguishing characteristics/subimages are used at the genus/species level, usually to compare and contrast very similar species, eliminating choices and then arriving at the species. Fish are typically identified in the field (except while taking a class, where a lot of id is done in class using jarred specimens), and they need to be identified quickly in order to release them alive (another reason for using distinguishing characteristics). Species vary in appearance, but some characteristics are preserved, such as black lines or markings, so those might be used in identifying a fish.
  • Used from 3:20 hours to 7:15 hours, across task and non-task sessions. Identification of species used the top-down approach described earlier. Combined search; complex objects as queries. Manual analysis of images is necessary. SuperIDR feedback: it brings together tools to access information and supports subimage use well for learning about a species, since there is a lot of information to browse and learn.
  • Subdocuments. Preserve context. Support multiple ways to describe, organize, access, retrieve, use, and re-use subdocuments and associated information. Support manual as well as automatic ways to work with and process information.
  • Improve CBIR on subimages and combined search: combined query and search, descriptors for this application, treating subimages separately from whole images, transfer learning, and leveraging knowledge of types of subimages/annotations to improve search. Leverage existing collections to study applicability in other domains (Flickr group photo notes). Crowdsourcing social media to study SI use in a social network context and the WWW: how do people use others’ tags on photos, others’ notes on images, and others’ annotations on documents (Kindle books)? What activities do they use it for? Does SI and its use help or impact services (search, etc.)? Participatory SI-DL: when personal and institutional DLs come together, how is SI modeled, considering multiple users, institutions, and uses? How can people share information and services in a reusable and interoperable manner in this participatory DL? What are the dynamics of users and uses in such a DL? Comparison of forms of subdocuments and associated information: Marshall’s study of annotations and types, Winget’s study of annotations on structured data (musical scores), and subimage/annotation types.
  • Transcript

    • 1. Digital Libraries with Superimposed Information
      Supporting Scholarly Tasks that Involve Fine-grain Information
      Uma Murthy
      PhD Defense
      28 January 2011
    • 2. Acknowledgments
      My family
      Dr. Edward Fox, Dr. Manuel Pérez-Quiñones, Dr. Ricardo Torres, Dr. Lois Delcambre, Dr. Naren Ramakrishnan, Dr. Eric Hallerman, Lin Tzy Li, Dr. Marcos Goncalves, Yinlin Chen, Nadia Kozievitch, Evandro Ramos, Tiago Falcao, Kapil Ahuja, Dr. John Pitrelli, Dr. Ganesh Ramaswamy, Dr. Andrea Kavanaugh, Dr. Lillian Cassel, Dr. Deborah Tatar, Dr. Donald Orth, Seungwon Yang, Lokeya Venkatachalam, Seonho Kim, Doug Gorton, Ricardo Quintana-Castillo, Monika Akbar, Dave Archer, Susan Price, Rao Shen, Srinivas Vemuri, Xiaoyan Yang, Yonca Haciahmetoglu, Pardha Pyla, Manas Tungare, Sameer Ahuja, Ben Hanrahan, Laurian Vega, Stacy Branham, Tejinder Judge, Rhonda Phillips, Ramya Ravichandar, Hari Pyla, Manjula Iyer, Dr. Noel Greis, Dr. Jack Olin, Venkat Srinivasan, …
      NSF grants (Superimposed information, Digital Government, DL curriculum, CTR, ECDL), Microsoft tablet PC grant, CS department, and Graduate school
    • 3. Motivation: many scholarly tasks involve working with subdocuments
    • 4. Problems
      Information is heterogeneous, voluminous, distributed across locations, and it is challenging to manage, organize, access, retrieve, and use.
      Tools/methods (including paper-based and digital) are not well-integrated.
      Ineffective and inefficient task execution
    • 5. A digital library = repository of collections and metadata + services
    • 6. Scenario
      Find me species that are darters that have a dorsal fin that looks like this, which is connected to another dorsal fin that looks like this, which might have an orange hue on its edge
      Search for subdocuments, in context of other information, incl. other subdocuments
      Use it in another task/context
    • 7. Superimposed information enables working with contextualized subdocuments
      superimposed (new) information
      marks
      base (existing) information
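The three layers named on slide 7 (superimposed information, marks, base information) can be pictured as a small data model. The following is a minimal sketch only: the class names `Mark` and `Annotation`, the rectangular-region representation of a subimage, and the example file name are assumptions made for illustration, not SuperIDR's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Mark:
    """Address of a subdocument inside a base document.

    Assumption: the subdocument is a rectangular region of an image.
    """
    base_uri: str  # the base (existing) information being referenced
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        """Return True if pixel (px, py) lies inside the marked region."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Annotation:
    """Superimposed (new) information attached to a mark."""
    mark: Mark
    text: str
    tags: List[str] = field(default_factory=list)


def annotations_for(base_uri: str, annotations: List[Annotation]) -> List[Annotation]:
    """Collect every annotation whose mark points into the given base document."""
    return [a for a in annotations if a.mark.base_uri == base_uri]


# Hypothetical example: one annotated subimage of a fish photograph.
fin = Mark("images/specimen_001.jpg", x=120, y=40, width=80, height=30)
notes = [Annotation(fin, "orange hue on the edge of the dorsal fin",
                    ["dorsal fin", "color"])]
```

Because the mark stores only an address into the base image, the base document stays unchanged and the annotation keeps its context: following `mark.base_uri` always leads back to the whole image.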
    • 8. Hypothesis
      A digital library with superimposed information (SI-DL) provides enhanced support to scholarly tasks that involve working with subdocuments
      [Diagram: DL + SI provides enhanced support to scholarly tasks with subdocuments]
    • 9. Research questions
    • 10. Research approach
    • 11. Research approach - theory
    • 12. Research approach - practical/user
    • 13. Review of work done and results
    • 14. Review of work done and results
    • 15. Review of work done and results
    • 16. Review of work done and results
    • 17. Review of work done and results
    • 18. Review of work done and results
    • 19. Review of work done and results
    • 20. Review of work done and results
    • 21. Subimage and SuperIDR use – a qualitative study
      How do people use subimages in fish identification and how does SuperIDR support that use?
      SuperIDR support for working with subimages in fish identification
      Contexts and strategies of working with subimages in fish identification
      Characteristics of subimages and related information
    • 22. Rationale: Maximize Use of SuperIDR
      Recruit people with interest in fish ID
      Have a longer duration of use in natural setting and in targeted tasks
      Have them use SuperIDR on their own (data on use in the wild) and in targeted tasks (opportunity to observe use)
      Collect qualitative data, in multiple ways and from multiple sources, on subimage and SuperIDR use in fish ID
    • 23. Study setup
    • 24. Study procedures
      Data collected
      Interview responses
      Diary entries
      Log data of SuperIDR use
      Screen captures of task execution
      Spoken thoughts during task execution
      Species id materials
      Database image
      Species id responses
    • 25. Participants: 3 groups
      Analyzed participants based on fisheries and fish identification experience, current projects and fish identification practices
      P2 (male), P5 (female), P6 (male): Relatively less experienced, undergraduates (UG) or recent UG
      P1 (male), P5 (female): Moderately experienced Master’s students, working on theses and/or teaching/research
      P3 (male), P4 (female): Highly experienced PhD students, working on research projects
    • 26. Subimage/annotation characteristics
      940 subimages and annotations, most focusing on a part of the fish (image)
    • 27. Morphological description, size, color, presence, counts, location
      Color
      Location
    • 28. Co-presence, morphological comparisons, multiple parts description, connections/relationships, comparison with other information-objects
      Comparison with other information objects
      Connections/relationships
    • 29. information object as a whole, combination of types
      Information object as a whole
      Combination of color and count
    • 30. Strategies and contexts that suggest subimage use in fish identification
      In learning methods
      In identification (top-down approach, compare similar species)
      To help identify fishes quickly (identify in field versus the lab or the classroom)
      In fishes of the same species (to deal with variability in appearance)
      To verify species using manual inspection
    • 31. Subimage use in SuperIDR
      Marking and annotating subimages (940 subimages and annotations)
      Browsing through subimages in species description, subimages in comparison, subimages in search results
      Text, image, and combined search, complex objects as queries
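One way to picture the combined search named on slide 31 is as a weighted fusion of a text score and an image-similarity score. The slides do not say how SuperIDR combines the two, so everything below is an assumption made for illustration: the Jaccard word overlap, the cosine similarity over feature vectors, the `alpha` weight, the record layout, and the placeholder species labels.

```python
import math
from typing import Dict, List, Sequence, Tuple


def text_score(query: str, annotation: str) -> float:
    """Jaccard overlap between the words of the query and the annotation."""
    q = set(query.lower().split())
    a = set(annotation.lower().split())
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


def image_score(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two image feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def combined_search(query_text: str,
                    query_features: Sequence[float],
                    subimages: List[Dict],
                    alpha: float = 0.5) -> List[Tuple[float, str]]:
    """Rank subimages by alpha * text score + (1 - alpha) * image score."""
    scored = []
    for s in subimages:
        score = (alpha * text_score(query_text, s["annotation"])
                 + (1 - alpha) * image_score(query_features, s["features"]))
        scored.append((score, s["species"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical records: placeholder labels and toy two-dimensional features.
records = [
    {"species": "species_a", "annotation": "orange edge dorsal fin", "features": [1.0, 0.0]},
    {"species": "species_b", "annotation": "dark spots on body", "features": [0.0, 1.0]},
]
ranking = combined_search("orange dorsal fin", [1.0, 0.0], records)
```

A ranked list rather than a single answer fits the participants' remarks on slide 35: the search narrows the candidates, and the user still inspects the similar species manually.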
    • 32. Subimages in species learning methods
    • 33. Manually inspecting subimages while comparing similar species
    • 34. Complex object as a query
    • 35. “It [SuperIDR] is pulling together different ways of getting to information ... So, not only do I have a taxonomy [and] dichotomous key, but it is also supported by images, many images that I have loaded in myself, that I can compare and contrast right there in the program [SuperIDR]. I can annotate the images, so I know that I kind of looking somewhat into their future [use]. And it kind of just pulls all those tools together, more so than [pulling together] information. It gives me many ways of accessing the same information. The more ways you can come to that information, the better [it is]. Because it is always going to make you more confident about the decision that you are making.” [P1 interview]
      SI-DL
      Context
      “It depends on how distinct that species [is] and how many other species are similar to that species, I guess … I would never trust the result, I guess, 100% … you know, based on just one picture and a little bit of written text. I would always want to pull up other species that are somewhat similar and just do a visual inspection myself to be sure that it just was not some bad [query] image that I used or a bad search term.” [P3 interview]
      “... It would not work if you said that this fish has dark spots. You know you get hundreds of species with dark spots. But, if you got down to a few species and you need to know how many they have ...” [P1 interview]
      Manually working with information
    • 36. Guidelines for design of an SI-DL
    • 37. Conclusions
      Working with subdocuments is important and necessary in many scholarly tasks
      An SI-DL provides enhanced support to such scholarly tasks
      Treating subdocuments as first-class objects facilitates management, access, retrieval, and use of subdocuments and associated information
      Contributions
      Superimposed applications
      SI-DL definition (metamodel) and prototype (SuperIDR)
      Findings from user studies on use of SI in scholarly tasks
      Insights about subimage use in species identification
      Guidelines for SI-DL design
      Datasets (images, subimages, annotations)*
    • 38. Future work
      Improved CBIR of subimages and improved combined search (e.g. transfer learning)
      Leverage existing collections to study applicability in other domains
      Crowdsourcing social media to study SI use in a social network context and the WWW
      Participatory SI-DL, when personal and institutional DLs come together
      Comparison of various forms and functions of subdocuments and associated information
    • 39. Contributions and publications
    • 40. Publications related to this research
      Published
      SuperIDR: A Tablet PC Tool for Image Description and Retrieval (WIPTE, 2010)
      A Teaching Tool for Parasitology: Enhancing Learning with Annotation and Image Retrieval (ECDL, 2010)
      Superimposed image description and retrieval for fish species identification (ECDL, 2009)
      Species identification: fish images with CBIR and annotations (JCDL poster, 2009)
      Superimposed information architecture for digital libraries (ECDL, 2008)
      From concepts to implementation and visualization: tools from a team-based approach to IR (SIGIR demo, 2008)
      Further development of a digital library curriculum: Evaluation approaches and new tools (ICADL, 2007)
      A superimposed information-supported digital library (JCDL doctoral consortium, 2007)
      Extending the 5S digital library (DL) framework: From a minimal DL towards a DL reference model (DLF workshop, JCDL, 2007)
      Enhancing concept mapping tools below and above to facilitate the use of superimposed information (CMC, 2006)
      Sierra - a superimposed application for enhanced image description and retrieval (ECDL demo, 2006)
      Using superimposed and context information to find and re-find sub-documents (PIM, 2006)
      SIMPEL: a superimposed multimedia presentation editor and player (JCDL demo, 2006)
      Planned
      A qualitative study on the use of subimages and of SuperIDR – a prototype digital library with superimposed information – in fish species identification (JCDL, 2011)
      Extending the 5S framework to provide support for CBIR, complex objects, and superimposed information (journal paper)
    • 41. Other published work
      Pedagogical Enhancements to a Course on Information Retrieval (TLIR, 2011)
      Sustainability of Bits, not just Atoms (CHI sustainability workshop, 2010)
      Using an iPhone Application for Diversity Recruitment (ASEE-SE, 2009)
      Building an ontology for crisis, tragedy and recovery (NKOS, 2009)
      Curatorial Work and Learning in Virtual Environments: A Virtual World Project to Support the NDIIPP Community (JCDL Digital Preservation workshop, 2009)
      A Methodology and Tool Suite for Evaluation of Accuracy of Interoperating Statistical Natural Language Processing Engines (Interspeech, 2008)
      VizBlog: a discovery tool for the blogosphere (DigGov, 2007)
      Re-finding from a Human Information Processing Perspective (PIM, 2006)
    • 42. Thank you
    • 43. Back up slides
    • 44. Photo attributions (Flickr)
      A digital library by HacksHaven
      Art History With Chris And Mac 6/9: Manet: Lecture (Mme Manet and Leon) by moonflowerdragon
      Korean music by Homies In Heaven
      Old annotations by Lorianne DiSabato
      Reading Annotation by Rosa Say
    • 45. SuperIDR architecture
    • 46. Species learning methods
      Variability in fishes of same species
    • 47. Summary of findings of qualitative study
      13 types of subimages/annotations from 940 subimages/annotations
      Subimages are important and necessary in fish identification
      Identification is done in a top-down way
      Learning using multiple methods
      Context is important
      Combined search and using a complex object as a query
      SI-DL – bringing together capabilities
    • 48. Morphological comparison
    • 49. Participatory SI-DL [Marchionini, 2010]