Visual Search for Supporting Content Exploration in Large Document Collections

Dasha Herrmannova, Research Scientist at Oak Ridge National Laboratory
1/48
Visual search for supporting content
exploration in large document collections
Drahomira Herrmannova and Petr Knoth
2/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
4/48
What do we do
• Improve search in (large) document collections
• Examples of collections:
– News articles
– Cultural heritage collection
– Collection of scientific papers
• Current search engines:
– Support for lookup
– Much less support for exploration
5/48
Search tasks (Rose and Levinson, 2004)
• Undirected (or exploratory) queries – significant
portion of all searches (Rose and Levinson, 2004)
6/48
Exploratory search (Marchionini, 2006)
7/48
How to support exploratory search
• One possible solution – information
visualisation
• Why?
– Easier to communicate structure, organisation and
relations in content
– Visually appealing
8/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
9/48
Information Visualisation (1/2)
• Division according to granularity of
information
– Collection level
– Document level
– Intra-document level
10/48
Collection level visualisations
• Visualise attributes of the collection
• Typically aim at providing a general overview
of the collection content
• Examples
11/48
Tag clouds (Hassan-Montero and Herrero-Solana, 2006)
12/48
TIARA (Wei et al., 2010)
13/48
GRIDL (Shneiderman et al., 2000)
14/48
Document level visualisations
• Visualise attributes of the collection items
• Mutual links and relations of collection items
• Examples
15/48
Hopara (Milne and Witten, 2011)
16/48
Wivi (Lehmann et al., 2010)
17/48
Apolo (Chau et al., 2011)
18/48
Intra-document level visualisations
• Visualise the internal structure of a document
• Example
19/48
TileBars (Hearst, 1995)
20/48
Information Visualisation (2/2)
• Division according to the “starting point” of
the visualisation
– Browsing focused
– Query focused
21/48
Browsing focused
• Exploration starts at a specific point in the
collection from which the user navigates
through the collection
• Usually the same starting point is used every
time
22/48
InfoSky (Granitzer et al., 2004)
23/48
Query focused
• Starts with a query
• The query determines the entry point from
which the exploration starts
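The difference between the two starting points can be sketched in a few lines. This is only an illustration, not the mechanism of any system named in these slides: the function name `entry_point`, the fixed `root` node, the sample titles and the word-overlap matching are all assumptions made for the example.

```python
def entry_point(titles, query=None, root="root"):
    """Pick the item where exploration starts (hypothetical helper).

    Browsing focused: no query is given, so the same fixed node is returned.
    Query focused: the query selects the item whose title shares the most
    words with it (word overlap is an assumed stand-in for real retrieval).
    """
    if not query:
        return root
    terms = set(query.lower().split())

    def overlap(doc_id):
        # Number of query words that also occur in the item's title.
        return len(terms & set(titles[doc_id].lower().split()))

    return max(titles, key=overlap)


# Illustrative data only.
titles = {
    "d1": "Visual search interfaces",
    "d2": "Tag clouds for retrieval",
}
browsing_start = entry_point(titles)             # always the fixed root
query_start = entry_point(titles, "tag clouds")  # depends on the query
```

In both cases the user then navigates onward from the returned item; only the choice of the first item differs.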
24/48
ThinkPedia (Hirsch et al., 2009)
25/48
Our approach
• Document level information
• Query focused browsing
26/48
Design principles (1/2)
• For visual search interfaces
• Should be considered when designing the
interface
• Related studies:
– Chen and Yu, 2000
– Sebrechts et al., 1999
27/48
Design principles (2/2)
1. Added value
2. Simplicity
3. Visual legibility
4. Use of colours
5. Dimension
6. Fixed spatial location
28/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
29/48
Considered types of collections
• Every document in a collection defined
according to a set of dimensions
• Dimensions typically of different types
• Document = set of properties expressing
values of dimensions
• Dimensions always present
• Examples
30/48
News articles collection
• Dimensions:
– Time
– Themes
– Locations
– Relations to other articles
31/48
Cultural heritage artifacts
• Dimensions:
– Artifact type
– Historical period
– Style
– Material
32/48
Scientific papers
• Dimensions:
– Citations
– Authors
– Concepts
– Similarities with other articles
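The "document = set of properties expressing values of dimensions" view can be written down as a small data structure. The sketch below is an assumption about how such a model might look, not the authors' implementation; the class name `Document`, its fields and all sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A collection item described by named dimensions (hypothetical model)."""
    doc_id: str
    # Maps a dimension name to the set of values the document has for it.
    dimensions: dict[str, set[str]] = field(default_factory=dict)

    def values(self, dimension: str) -> set[str]:
        # A dimension with no recorded values yields an empty set.
        return self.dimensions.get(dimension, set())


# Illustrative items; the dimension names follow the three example collections.
paper = Document("paper-1", {
    "authors": {"A. Author"},
    "concepts": {"visual search", "exploratory search"},
    "citations": {"paper-2"},
})
article = Document("news-1", {
    "time": {"2012-05-01"},
    "themes": {"elections"},
    "locations": {"Prague"},
})
```

Because the model only assumes "named dimension, set of values", the same structure covers news articles, cultural heritage artifacts and scientific papers without a collection-specific schema.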
33/48
The visualisation
34/48
The visualisation
35/48
The visualisation
36/48
The visualisation
37/48
Discovering connections
38/48
Comparing and contrasting documents
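One way to read comparing and contrasting in terms of dimensions is to split the values of each dimension into those two documents share and those only one of them has. The following is a minimal sketch under that assumption; the function `compare` and the sample data are hypothetical and do not describe the presented interface.

```python
def compare(a, b):
    """For each dimension of either document, return a tuple
    (shared values, values only in a, values only in b)."""
    result = {}
    for dim in sorted(set(a) | set(b)):
        va, vb = a.get(dim, set()), b.get(dim, set())
        result[dim] = (va & vb, va - vb, vb - va)
    return result


# Illustrative documents, each a mapping from dimension name to a set of values.
doc_a = {"authors": {"X", "Y"}, "concepts": {"search"}}
doc_b = {"authors": {"Y"}, "concepts": {"search", "visualisation"},
         "citations": {"p9"}}

comparison = compare(doc_a, doc_b)
```

Shared values point to connections between the two documents, while the remaining values show where they differ.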
39/48
Limitations
• In theory not restricted; in practice the limitations might
be:
– the size and resolution of the screen
– the limitations of human perception
40/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
41/48
Conclusion (1/2)
• Motivation:
1. Provide better support for exploratory search
than current textual interfaces
2. Interface that is conceptually applicable in any
document collection regardless of its type
3. Provide an added value by assisting in the
discovery of interesting connections that would
otherwise remain hidden
42/48
Conclusion (2/2)
• Results:
1. Support for comparing and contrasting content.
2. Support for exploration across dimensions.
3. Universal approach to the visualised dimensions.
43/48
Future plans
• Planned release end of June
• Integration with CORE system
• Evaluation
44/48
References (1/4)
• G. Marchionini. Exploratory search: from finding to understanding.
Communications of the ACM - Supporting exploratory search. 2006.
• D. Rose & D. Levinson. Understanding user goals in web search.
Proceedings of the 13th conference on World Wide Web. 2004.
• Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as
visual information retrieval interfaces. In MERIDA, INSCIT2006
CONFERENCE. 2006.
• Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong
Qian, Lei Shi, Li Tan, and Qiang Zhang. Tiara: a visual exploratory text
analytic system. In Proceedings of the 16th ACM SIGKDD international
conference on Knowledge discovery and data mining. 2010.
45/48
References (2/4)
• Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau.
Visualizing digital library search results with categorical and hierarchical
axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000.
• Marti A. Hearst. TileBars: Visualization of Term Distribution Information in
Full Text Information Access. In the Proceedings of the ACM SIGCHI
Conference on Human Factors in Computing Systems. 1995.
• David Milne and Ian Witten. A link-based visual search engine for Wikipedia.
Proceedings of the 11th annual international ACM/IEEE joint conference on
Digital libraries. 2011.
• Simon Lehmann, Ulrich Schwanecke, and Rolf Dorner. Interactive
visualization for opportunistic exploration of large document collections.
Information Systems. 2010.
46/48
References (3/4)
• Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos.
Apolo: making sense of large network data by combining rich user
interaction and machine learning. In Proceedings of the 2011 annual
conference on Human factors in computing systems. 2011.
• Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and
Werner Klieber. Evaluating a system for interactive exploration of large,
hierarchically structured document repositories. In Proceedings of the IEEE
Symposium on Information Visualization. 2004.
• Christian Hirsch, John Hosking, and John Grundy. Interactive visualization
tools for exploring the semantic graph of large knowledge spaces.
Interfaces. 2009.
47/48
References (4/4)
• Chaomei Chen and Yue Yu. Empirical studies of information visualization: a
meta-analysis. Int. J. Hum.-Comput. Stud. 2000.
• Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis,
and Michael S. Miller. Visualization of search results: a comparative
evaluation of text, 2d, and 3d interfaces. In Proceedings of the 22nd
annual international ACM SIGIR conference on Research and development
in information retrieval. 1999.
48/48
Thanks for listening!
Questions?