1/48
Visual search for supporting content
exploration in large document collections
Drahomira Herrmannova and Petr Knoth
2/48
Contents
• What do we do
• Information Visualisations and Visual Search
Interfaces
• Our approach
• Conclusion
4/48
What do we do
• Improve search in (large) document collections
• Examples of collections:
– News articles
– Cultural heritage collection
– Collection of scientific papers
• Current search engines:
– Support for lookup
– Much less support for exploration
5/48
Search tasks (Rose and Levinson, 2004)
• Undirected (or exploratory) queries – significant
portion of all searches (Rose and Levinson, 2004)
6/48
Exploratory search (Marchionini, 2006)
7/48
How to support exploratory search
• One possible solution – information
visualisation
• Why?
– Easier to communicate structure, organisation and
relations in content
– Visually appealing
9/48
Information Visualisation (1/2)
• Division according to granularity of
information
– Collection level
– Document level
– Intra-document level
10/48
Collection level visualisations
• Visualise attributes of the collection
• Typically aim at providing a general overview
of the collection content
• Examples
11/48
Tag clouds (Hassan-Montero and Herrero-Solana, 2006)
12/48
TIARA (Wei et al., 2010)
13/48
GRIDL (Shneiderman et al., 2000)
14/48
Document level visualisations
• Visualise attributes of the collection items
• Mutual links and relations of collection items
• Examples
15/48
Hopara (Milne and Witten, 2011)
16/48
Wivi (Lehmann et al., 2010)
17/48
Apolo (Chau et al., 2011)
18/48
Intra-document level visualisations
• Visualise the internal structure of a document
• Example
19/48
TileBars (Hearst, 1995)
20/48
Information Visualisation (2/2)
• Division according to the “starting point” of
the visualisation
– Browsing focused
– Query focused
21/48
Browsing focused
• Exploration starts at a specific point in the
collection from which the user navigates
through the collection
• Usually the same starting point is used every
time
22/48
InfoSky (Granitzer et al., 2004)
23/48
Query focused
• Starts with a query
• The query determines the entry point from
which the exploration starts
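A minimal sketch of the query-focused idea, assuming a toy in-memory collection and a simple term-overlap score. The function name `entry_point`, the sample documents and the scoring are illustrative assumptions, not part of any of the systems cited here. The query selects the entry document, and browsing would then continue from that document.

```python
def entry_point(query, collection):
    """Return the id of the document sharing the most terms with the query.

    `collection` maps a document id to its text. Returns None when the
    collection is empty or no document shares a term with the query.
    """
    if not collection:
        return None
    query_terms = set(query.lower().split())

    def overlap(doc_id):
        # Number of distinct query terms that also occur in the document
        return len(query_terms & set(collection[doc_id].lower().split()))

    best = max(collection, key=overlap)
    return best if overlap(best) > 0 else None


# Hypothetical toy collection: document id -> text
collection = {
    "d1": "visual search interfaces for document collections",
    "d2": "tag clouds as visual information retrieval interfaces",
    "d3": "citation analysis of scientific papers",
}

start = entry_point("scientific papers citation", collection)
```

In a browsing-focused interface the starting document would instead be fixed, independent of any query.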
24/48
ThinkPedia (Hirsch et al., 2009)
25/48
Our approach
• Document level information
• Query focused browsing
26/48
Design principles (1/2)
• For visual search interfaces
• Should be considered when designing the
interface
• Related studies:
– Chen and Yu, 2000
– Sebrechts et al., 1999
27/48
Design principles (2/2)
1. Added value
2. Simplicity
3. Visual legibility
4. Use of colours
5. Dimension
6. Fixed spatial location
29/48
Considered types of collections
• Every document in a collection defined
according to a set of dimensions
• Dimensions typically of different types
• Document = set of properties expressing
values of dimensions
• Dimensions always present
• Examples
30/48
News articles collection
• Dimensions:
– Time
– Themes
– Locations
– Relations to other articles
31/48
Cultural heritage artifacts
• Dimensions:
– Artifact type
– Historical period
– Style
– Material
32/48
Scientific papers
• Dimensions:
– Citations
– Authors
– Concepts
– Similarities with other articles
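The document model described on the preceding slides can be sketched as follows. This is an illustrative assumption about representation only: the class name `Document`, the field names and the sample values are hypothetical and do not come from the presented system.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Document:
    """A collection item described by values along named dimensions."""
    doc_id: str
    # Dimension name -> set of values, e.g. "authors" -> {"Author A"}
    dimensions: Dict[str, Set[str]] = field(default_factory=dict)

    def values(self, dimension):
        """Return the values for a dimension, or an empty set if it is absent."""
        return self.dimensions.get(dimension, set())


# The same structure covers different collection types (hypothetical values).
paper = Document("p1", {
    "authors": {"Author A", "Author B"},
    "concepts": {"exploratory search"},
    "citations": {"p7", "p9"},
})
article = Document("n1", {
    "time": {"2012-05-01"},
    "themes": {"elections"},
    "locations": {"London"},
})
```

Keeping every dimension as a set of values is one simple way to treat dimensions of different types uniformly.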
33/48
The visualisation
34/48
The visualisation
35/48
The visualisation
36/48
The visualisation
37/48
Discovering connections
38/48
Comparing and contrasting documents
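One way to think about comparing and contrasting two documents is a per-dimension split into shared and unique values. The sketch below assumes each document is a plain mapping from dimension name to a set of values. The function `compare` and the sample data are hypothetical and are not the method used in the visualisation itself.

```python
def compare(doc_a, doc_b):
    """For each dimension, split values into shared and unique to each document."""
    result = {}
    for dim in sorted(set(doc_a) | set(doc_b)):
        a = doc_a.get(dim, set())
        b = doc_b.get(dim, set())
        result[dim] = {"shared": a & b, "only_a": a - b, "only_b": b - a}
    return result


# Hypothetical scientific papers described by two dimensions
paper_a = {"authors": {"Author A", "Author B"}, "concepts": {"visual search"}}
paper_b = {"authors": {"Author B"}, "concepts": {"tag clouds"}}

diff = compare(paper_a, paper_b)
```

A non-empty "shared" set in any dimension indicates a connection between the two documents along that dimension.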
39/48
Limitations
• In theory the approach is not restricted; in practice
the limitations might be:
– the size and resolution of the screen
– the limitations of human perception
41/48
Conclusion (1/2)
• Motivation:
1. Provide better support for exploratory search
than current textual interfaces
2. Interface that is conceptually applicable in any
document collection regardless of its type
3. Provide an added value by assisting in the
discovery of interesting connections that would
otherwise remain hidden
42/48
Conclusion (2/2)
• Results:
1. Support for comparing and contrasting content.
2. Support for exploration across dimensions.
3. Universal approach to the visualised dimensions.
43/48
Future plans
• Planned release at the end of June
• Integration with CORE system
• Evaluation
44/48
References (1/4)
• G. Marchionini. Exploratory search: from finding to understanding.
Communications of the ACM - Supporting exploratory search. 2006.
• D. Rose & D. Levinson. Understanding user goals in web search.
Proceedings of the 13th conference on World Wide Web. 2004.
• Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as
visual information retrieval interfaces. In MERIDA, INSCIT2006
CONFERENCE. 2006.
• Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong
Qian, Lei Shi, Li Tan, and Qiang Zhang. TIARA: a visual exploratory text
analytic system. In Proceedings of the 16th ACM SIGKDD international
conference on Knowledge discovery and data mining. 2010.
45/48
References (2/4)
• Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau.
Visualizing digital library search results with categorical and hierarchical
axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000.
• Marti A. Hearst. TileBars: Visualization of Term Distribution Information in
Full Text Information Access. In the Proceedings of the ACM SIGCHI
Conference on Human Factors in Computing Systems. 1995.
• David Milne, Ian Witten. A link-based visual search engine for Wikipedia.
Proceedings of the 11th annual international ACM/IEEE joint conference on
Digital libraries. 2011.
• Simon Lehmann, Ulrich Schwanecke, and Ralf Dörner. Interactive
visualization for opportunistic exploration of large document collections.
Information Systems. 2010.
46/48
References (3/4)
• Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos.
Apolo: making sense of large network data by combining rich user
interaction and machine learning. In Proceedings of the 2011 annual
conference on Human factors in computing systems. 2011.
• Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and
Werner Klieber. Evaluating a system for interactive exploration of large,
hierarchically structured document repositories. In Proceedings of the IEEE
Symposium on Information Visualization. 2004.
• Christian Hirsch, John Hosking, and John Grundy. Interactive visualization
tools for exploring the semantic graph of large knowledge spaces.
Interfaces. 2009.
47/48
References (4/4)
• Chaomei Chen and Yue Yu. Empirical studies of information visualization: a
meta-analysis. Int. J. Hum.-Comput. Stud. 2000.
• Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis,
and Michael S. Miller. Visualization of search results: a comparative
evaluation of text, 2d, and 3d interfaces. In Proceedings of the 22nd
annual international ACM SIGIR conference on Research and development
in information retrieval. 1999.
48/48
Thanks for listening!
Questions?

GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Data structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdfData structures and Algorithms in Python.pdf
Data structures and Algorithms in Python.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 

Visual Search for Supporting Content Exploration in Large Document Collections

  • 1. Visual search for supporting content exploration in large document collections. Drahomira Herrmannova and Petr Knoth
  • 2. Contents • What do we do • Information Visualisations and Visual Search Interfaces • Our approach • Conclusion
  • 4. What do we do • Improve search in (large) document collections • Examples of collections: – News articles – Cultural heritage collection – Collection of scientific papers • Current search engines: – Support for lookup – Much less support for exploration
  • 5. Search tasks (Rose and Levinson, 2004) • Undirected (or exploratory) queries – significant portion of all searches (Rose and Levinson, 2004)
  • 7. How to support exploratory search • One possible solution – information visualisation • Why? – Easier to communicate structure, organisation and relations in content – Visually appealing
  • 9. Information Visualisation (1/2) • Division according to granularity of information – Collection level – Document level – Intra-document level
  • 10. Collection level visualisations • Visualise attributes of the collection • Typically aim at providing a general overview of the collection content • Examples
  • 11. Tag clouds (Montero and Solana, 2006)
  • 12. TIARA (Wei et al., 2010)
  • 14. Document level visualisations • Visualise attributes of the collection items • Mutual links and relations of collection items • Examples
  • 15. Hopara (Milne and Witten, 2011)
  • 17. Apolo (Chau et al., 2011)
  • 18. Intra-document level visualisations • Visualise the internal structure of a document • Example
  • 20. Information Visualisation (2/2) • Division according to the “starting point” of the visualisation – Browsing focused – Query focused
  • 21. Browsing focused • Exploration starts at a specific point in the collection from which the user navigates through the collection • Usually the same starting point is used every time
  • 23. Query focused • Starts with a query • The query determines the entry point from which the exploration starts
  • 25. Our approach • Document level information • Query focused browsing
  • 26. Design principles (1/2) • For visual search interfaces • Should be considered when designing the interface • Related studies: – Chen and Yu, 2000 – Sebrechts et al., 1999
  • 27. Design principles (2/2) 1. Added value 2. Simplicity 3. Visual legibility 4. Use of colours 5. Dimension 6. Fixed spatial location
  • 29. Considered types of collections • Every document in a collection defined according to a set of dimensions • Dimensions typically of different types • Document = set of properties expressing values of dimensions • Dimensions always present • Examples
  • 30. News articles collection • Dimensions: – Time – Themes – Locations – Relations to other articles
  • 31. Cultural heritage artifacts • Dimensions: – Artifact type – Historical period – Style – Material
  • 32. Scientific papers • Dimensions: – Citations – Authors – Concepts – Similarities with other articles
  • 39. Limitations • Not restricted in theory; in practice, the limitations might be: – the size and resolution of the screen – the limitations of human perception
  • 41. Conclusion (1/2) • Motivation: 1. Provide better support for exploratory search than current textual interfaces 2. Interface that is conceptually applicable in any document collection regardless of its type 3. Provide an added value by assisting in the discovery of interesting connections that would otherwise remain hidden
  • 42. Conclusion (2/2) • Results: 1. Support for comparing and contrasting content. 2. Support for exploration across dimensions. 3. Universal approach to the visualised dimensions.
  • 43. Future plans • Planned release end of June • Integration with CORE system • Evaluation
  • 44. References (1/4) • G. Marchionini. Exploratory search: from finding to understanding. Communications of the ACM - Supporting exploratory search. 2006. • D. Rose & D. Levinson. Understanding user goals in web search. Proceedings of the 13th conference on World Wide Web. 2004. • Yusef Hassan-Montero and Victor Herrero-Solana. Improving tag-clouds as visual information retrieval interfaces. In MERIDA, INSCIT2006 CONFERENCE. 2006. • Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong Qian, Lei Shi, Li Tan, and Qiang Zhang. Tiara: a visual exploratory text analytic system. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. 2010.
  • 45. References (2/4) • Ben Shneiderman, David Feldman, Anne Rose, and Xavier Ferré Grau. Visualizing digital library search results with categorical and hierarchical axes. In Proceedings of the fifth ACM conference on Digital libraries. 2000. • Marti A. Hearst. TileBars: Visualization of Term Distribution Information in Full Text Information Access. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems. 1995. • David Milne, Ian Witten. A link-based visual search engine for Wikipedia. Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries. 2011. • Simon Lehmann, Ulrich Schwanecke, and Rolf Dorner. Interactive visualization for opportunistic exploration of large document collections. Information Systems. 2010.
  • 46. References (3/4) • Duen Horng Chau, Aniket Kittur, Jason I. Hong, and Christos Faloutsos. Apolo: making sense of large network data by combining rich user interaction and machine learning. In Proceedings of the 2011 annual conference on Human factors in computing systems. 2011. • Michael Granitzer, Wolfgang Kienreich, Vedran Sabol, Keith Andrews, and Werner Klieber. Evaluating a system for interactive exploration of large, hierarchically structured document repositories. In Proceedings of the IEEE Symposium on Information Visualization. 2004. • Christian Hirsch, John Hosking, and John Grundy. Interactive visualization tools for exploring the semantic graph of large knowledge spaces. Interfaces. 2009.
  • 47. References (4/4) • Chaomei Chen and Yue Yu. Empirical studies of information visualization: a meta-analysis. Int. J. Hum.-Comput. Stud. 2000. • Marc M. Sebrechts, John V. Cugini, Sharon J. Laskowski, Joanna Vasilakis, and Michael S. Miller. Visualization of search results: a comparative evaluation of text, 2d, and 3d interfaces. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. 1999.
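
The document model on slides 29 to 32 ("Document = set of properties expressing values of dimensions") might be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Document` class, the `shared_values` helper, and the sample papers are hypothetical names and data introduced here, and only the dimension names (`authors`, `concepts`, `citations`) come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    """A collection item described by its values along named dimensions.

    Hypothetical structure: the slides only state that a document is a set
    of properties expressing values of dimensions.
    """
    title: str
    # Maps a dimension name (e.g. "authors") to the set of values it takes.
    dimensions: dict[str, frozenset[str]] = field(default_factory=dict)


def shared_values(a: Document, b: Document) -> dict[str, frozenset[str]]:
    """Return, per dimension, the values that two documents have in common."""
    common: dict[str, frozenset[str]] = {}
    for name, values in a.dimensions.items():
        overlap = values & b.dimensions.get(name, frozenset())
        if overlap:  # keep only dimensions with at least one shared value
            common[name] = overlap
    return common


# Made-up scientific papers using dimensions named on slide 32.
paper_a = Document("Paper A", {
    "authors": frozenset({"Author X", "Author Y"}),
    "concepts": frozenset({"visual search", "exploratory search"}),
})
paper_b = Document("Paper B", {
    "authors": frozenset({"Author Y"}),
    "concepts": frozenset({"tag clouds", "visual search"}),
    "citations": frozenset({"Paper A"}),
})

# Made-up news article using dimensions named on slide 30.
article = Document("Article C", {
    "themes": frozenset({"elections"}),
    "locations": frozenset({"Prague"}),
})

print(shared_values(paper_a, paper_b))
print(shared_values(paper_a, article))
```

Because dimensions are stored by name rather than as fixed fields, the same structure can hold news articles, cultural heritage artifacts, or scientific papers, which matches the slides' aim of an approach that applies to any collection type. Comparing two documents dimension by dimension is one simple way to surface the connections the conclusion slides mention.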