Manually selecting subsets of photos from large collections in order to present them to friends or colleagues, or to print them in photo books, can be a tedious task. Today, fully automatic approaches are available to support users. They make use of pixel information extracted from the images, analyze contextual information such as capture time and focal aperture, or use both to determine a suitable subset of photos. However, these approaches miss the most important factor in the photo selection process: the user. The goal of our approach is to take individual interests into account. By recording and analyzing gaze information from users viewing photo collections, we obtain information on their interests and use it to create personal photo selections. In a controlled experiment with 33 participants, we show that the selections can be significantly improved over a baseline approach, by up to 22%, when individual viewing behavior is taken into account. We also obtained significantly better results for photos taken at an event the participants were involved in, compared with photos from another event.
Formalization and Preliminary Evaluation of a Pipeline for Text Extraction From Infographics (Ansgar Scherp)
We propose a pipeline for text extraction from infographics that makes use of a novel combination of data mining and computer vision techniques. The pipeline defines a sequence of steps to identify characters, cluster them into text lines, determine their rotation angle, and apply state-of-the-art OCR to recognize the text. In this paper, we formally define the pipeline and present its current implementation. In addition, we have conducted preliminary evaluations over a data corpus of 121 manually annotated infographics from a broad range of illustration types such as bar charts, pie charts, line charts, maps, and others. We assess the results of our text extraction pipeline by comparing it with two baselines. Finally, we sketch an outline for future work and possibilities for improving the pipeline. - http://ceur-ws.org/Vol-1458/
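To make the sequence of steps concrete, here is a heavily simplified sketch in Python, using OpenCV and pytesseract as stand-ins for the paper's components; the thresholds and the line-clustering heuristic are illustrative assumptions, not the published implementation:

```python
# Illustrative sketch of the pipeline stages: (1) find character candidates,
# (2) cluster them into text lines, (3) estimate each line's rotation angle,
# (4) run OCR on the deskewed image. NOT the paper's actual code.
import cv2
import numpy as np
import pytesseract

def extract_text_lines(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # (1) Character candidates: small connected components (area bounds
    # are placeholder values).
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    chars = [i for i in range(1, n) if 5 < stats[i, cv2.CC_STAT_AREA] < 5000]
    if not chars:
        return []
    # (2) Cluster candidates into lines: sort by vertical position and
    # start a new line when the gap exceeds the median character height.
    med_h = np.median([stats[i, cv2.CC_STAT_HEIGHT] for i in chars])
    chars.sort(key=lambda i: centroids[i][1])
    lines, current = [], [chars[0]]
    for i in chars[1:]:
        if centroids[i][1] - centroids[current[-1]][1] > med_h:
            lines.append(current)
            current = [i]
        else:
            current.append(i)
    lines.append(current)
    # (3) + (4) Estimate each line's rotation via its minimum-area
    # bounding rectangle, deskew, and recognize the text with Tesseract.
    results = []
    for line in lines:
        pts = np.vstack([np.argwhere(labels == i)[:, ::-1] for i in line])
        (cx, cy), _, angle = cv2.minAreaRect(pts.astype(np.float32))
        m = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)
        rotated = cv2.warpAffine(img, m, img.shape[::-1])
        results.append(pytesseract.image_to_string(rotated).strip())
    return results
```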
Knowledge Discovery in Social Media and Scientific Digital Libraries (Ansgar Scherp)
The talk presents selected results of our research in the area of text and data mining in social media and scientific literature. (1) First, we consider the area of classifying microblogging postings like tweets on Twitter. Typically, the classification results are evaluated against a gold standard, which is either the hashtags of the tweets' authors or manual annotations. We claim that there are fundamental differences between these two kinds of gold standard classifications and conducted an experiment with 163 participants to manually classify tweets from ten topics. Our results show that the human annotators are more likely to classify tweets like other human annotators than like the tweets' authors (i.e., the hashtags). This may influence the evaluation of classification methods like LDA, and we argue that researchers should reflect on the kind of gold standard used when interpreting their results. (2) Second, we present a framework for semantic document annotation that aims to compare different existing as well as new annotation strategies. For entity detection, we compare semantic taxonomies, trigrams, RAKE, and LDA. For concept activation, we cover a set of statistical, hierarchy-based, and graph-based methods. The strategies are evaluated over 100,000 manually labeled scientific documents from economics, politics, and computer science. (3) Finally, we present a processing pipeline for extracting text of varying size, rotation, color, and emphasis from scholarly figures. The pipeline requires no training, nor does it make any assumptions about the characteristics of the scholarly figures. We conducted a preliminary evaluation with 121 figures from a broad range of illustration types.
URL: https://www.ukp.tu-darmstadt.de/ukp-home/news-singleview/artikel/guest-speaker-ansgar-scherp/
A Framework for Iterative Signing of Graph Data on the Web (Ansgar Scherp)
Existing algorithms for signing graph data typically do not cover the whole signing process. In addition, they lack distinctive features such as signing graph data at different levels of granularity, iterative signing of graph data, and signing multiple graphs. In this paper, we introduce a novel framework for signing arbitrary graph data provided, e.g., as RDF(S), Named Graphs, or OWL. We conduct an extensive theoretical and empirical analysis of the runtime and space complexity of different framework configurations. The experiments are performed on synthetic and real-world graph data of different sizes and with different numbers of blank nodes. We investigate security issues, present a trust model, and discuss practical considerations for using our signing framework.
We released a Java-based open source implementation of our software framework for iterative signing of arbitrary graph data provided, e.g., as RDF(S), Named Graphs, or OWL. The software framework is based on a formalization of different graph signing functions and supports different configurations. It is available as source code as well as a pre-compiled .jar file.
The graph signing framework exhibits the following unique features:
- Signing graphs on different levels of granularity
- Signing multiple graphs at once
- Iterative signing of graph data for provenance tracking
- Independence of the language used for encoding the graph (i.e., the signature does not break when the graph representation changes)
The documentation of the software framework and its source code is available from: http://icp.it-risk.iwvi.uni-koblenz.de/wiki/Software_Framework_for_Signing_Graph_Data
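As an illustration of the basic signing idea (not the released Java framework): canonicalize the graph, then sign the canonical form. The naive canonicalization below, which simply sorts N-Triples lines, ignores blank nodes; the real framework uses proper graph signing functions for that.

```python
# Minimal sketch of graph signing: canonicalize, hash-and-sign, and embed
# the signature for iterative signing. Sorting N-Triples lines is a naive
# canonicalization that does not handle blank nodes.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

def canonicalize(triples):
    # Any serialization of the same triples yields the same sorted form,
    # which is what makes the signature encoding-independent.
    return "\n".join(sorted(triples)).encode("utf-8")

def sign_graph(triples, private_key):
    return private_key.sign(canonicalize(triples),
                            padding.PKCS1v15(), hashes.SHA256())

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
graph = {'<ex:s> <ex:p> <ex:o> .', '<ex:s> <ex:q> "v" .'}
sig1 = sign_graph(graph, key)

# Iterative signing: embed the first signature as graph data and sign
# again, so the provenance chain itself becomes part of the signed graph.
graph |= {f'<ex:sig1> <ex:hasSignatureValue> "{sig1.hex()}" .'}
sig2 = sign_graph(graph, key)
```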
About Multimedia Presentation Generation and Multimedia Metadata: From Synthe... (Ansgar Scherp)
ACM SIGMM Rising Stars Symposium
The ACM SIGMM Rising Stars Symposium, inaugurated in 2015, will highlight plenary presentations of six selected rising SIGMM members on their vision and research achievements, and dialogs with senior members about the future of multimedia research.
See: http://www.acmmm.org/2016/?page_id=706
Mining and Managing Large-scale Linked Open Data (Ansgar Scherp)
Linked Open Data (LOD) is about publishing and interlinking data of different origin and purpose on the web. The Resource Description Framework (RDF) is used to describe data on the LOD cloud. In contrast to relational databases, RDF does not provide a fixed, pre-defined schema. Rather, RDF allows for flexibly modeling the data schema by attaching RDF types and properties to the entities. Our schema-level index called SchemEX allows for searching in large-scale RDF graph data. The index can be efficiently computed with reasonable accuracy over large-scale data sets with billions of RDF triples, the smallest information unit on the LOD cloud. SchemEX is highly needed as the size of the LOD cloud quickly increases. Due to the evolution of the LOD cloud, one observes frequent changes of the data. We show that the data schema also changes in terms of combinations of RDF types and properties. As individual changes alone cannot capture the dynamics of the LOD cloud, current work includes temporal clustering and finding periodicities in entity dynamics over large-scale snapshots of the LOD cloud, with about 100 million triples per week over more than three years.
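A toy sketch of a schema-level index in the spirit of SchemEX: entities are grouped by the combination of their RDF types and properties, and the index maps each such schema element to the data sources containing matching entities. This is a simplification for illustration, not the actual SchemEX algorithm.

```python
# Build a schema-level index: key = (set of RDF types, set of properties),
# value = set of data sources that contain entities with that schema.
from collections import defaultdict

def build_schema_index(quads):
    """quads: iterable of (subject, predicate, object, source) tuples."""
    RDF_TYPE = "rdf:type"
    types, props, sources = defaultdict(set), defaultdict(set), defaultdict(set)
    for s, p, o, src in quads:
        sources[s].add(src)
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            props[s].add(p)
    index = defaultdict(set)
    for s in sources:
        key = (frozenset(types[s]), frozenset(props[s]))
        index[key] |= sources[s]
    return index

quads = [
    ("ex:alice", "rdf:type", "foaf:Person", "http://a.example/data"),
    ("ex:alice", "foaf:knows", "ex:bob", "http://a.example/data"),
    ("ex:bob", "rdf:type", "foaf:Person", "http://b.example/data"),
]
index = build_schema_index(quads)
# Query: which sources contain Persons that use foaf:knows?
print(index[(frozenset({"foaf:Person"}), frozenset({"foaf:knows"}))])
```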
Events in Multimedia - Theory, Model, Application (Ansgar Scherp)
Talk by Ansgar Scherp.
Title: Events in Multimedia - Theory, Model, Application
Event: Workshop on Event-based Media Integration and Processing, ACM Multimedia, 2013
Analysis of GraphSum's Attention Weights to Improve the Explainability of Multi-Document Summarization (Ansgar Scherp)
Slides of our presentation @iiWAS2021: The 23rd International Conference on Information Integration and Web Intelligence, Linz, Austria, 29 November 2021 - 1 December 2021. ACM 2021, ISBN 978-1-4503-9556-4
STEREO: A Pipeline for Extracting Experiment Statistics, Conditions, and Topics from Scientific Papers (Ansgar Scherp)
Presentation for our paper @iiWAS2021: The 23rd International Conference on Information Integration and Web Intelligence, Linz, Austria, 29 November 2021 - 1 December 2021. ACM 2021, ISBN 978-1-4503-9556-4
Text Localization in Scientific Figures using Fully Convolutional Neural Networks (Ansgar Scherp)
Text extraction from scientific figures has been addressed in the past by different unsupervised approaches due to the limited amount of training data. Motivated by the recent advances in Deep Learning, we propose a two-step neural-network-based pipeline to localize and extract text using Fully Convolutional Networks. We improve the localization of the text bounding boxes by applying a novel combination of a Residual Network with the Region Proposal Network based on Faster R-CNN. The predicted bounding boxes are further pre-processed and used as input to the off-the-shelf optical character recognition engine Tesseract 4.0. We evaluate our improved text localization method on five different datasets of scientific figures and compare it with the best unsupervised pipeline. Since only limited training data is available, we further experiment with different data augmentation techniques for increasing the size of the training datasets and demonstrate their positive impact. We use Average Precision and F1 measure to assess the text localization results. In addition, we apply Gestalt Pattern Matching and Levenshtein Distance to evaluate the quality of the recognized text. Our extensive experiments show that our new pipeline based on neural networks outperforms the best unsupervised approach by a large margin of 19-20%.
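A sketch of the two-step idea with torchvision's stock Faster R-CNN (ResNet-50 FPN backbone) standing in for the paper's custom ResNet + Region Proposal Network combination: a detector proposes text boxes, then Tesseract recognizes the text in each box. The model below is untrained and the threshold is a placeholder; the code only illustrates the data flow.

```python
# Two-step sketch: detect text regions, then OCR each region.
import torch
import torchvision
import pytesseract
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=2)  # classes: background vs. text
model.eval()

img = Image.open("figure.png").convert("RGB")   # hypothetical input file
tensor = torchvision.transforms.functional.to_tensor(img)
with torch.no_grad():
    pred = model([tensor])[0]                   # boxes, labels, scores

for box, score in zip(pred["boxes"], pred["scores"]):
    if score > 0.5:                             # illustrative threshold
        x0, y0, x1, y1 = [int(v) for v in box]
        print(pytesseract.image_to_string(img.crop((x0, y0, x1, y1))))
```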
A Comparison of Approaches for Automated Text Extraction from Scholarly Figures (Ansgar Scherp)
So far, there has not been a comparative evaluation of different approaches for text extraction from scholarly figures. In order to fill this gap, we have defined a generic pipeline for text extraction that abstracts from the existing approaches as documented in the literature. In this paper, we use this generic pipeline to systematically evaluate and compare 32 configurations for text extraction over four datasets of scholarly figures of different origin and characteristics. In total, our experiments have been run over more than 400 manually labeled figures. The experimental results show that the approach BS-4OS results in the best F-measure of 0.67 for the Text Location Detection and the best average Levenshtein Distance of 4.71 between the recognized text and the gold standard on all four datasets using the Ocropy OCR engine.
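The two text-quality metrics used in these evaluations can be made concrete: Python's difflib implements Gestalt Pattern Matching, and the Levenshtein distance is a short dynamic program. This is a sketch of the metrics themselves, not the papers' evaluation code.

```python
# Gestalt Pattern Matching via difflib, and Levenshtein distance via a
# standard dynamic-programming recurrence over two strings.
from difflib import SequenceMatcher

def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

print(SequenceMatcher(None, "Figure 1", "Fgure 1").ratio())  # ~0.93
print(levenshtein("Figure 1", "Fgure 1"))                    # 1
```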
A Comparison of Different Strategies for Automated Semantic Document Annotation (Ansgar Scherp)
We introduce a framework for automated semantic document annotation that is composed of four processes, namely concept extraction, concept activation, annotation selection, and evaluation. The framework is used to implement and compare different annotation strategies motivated by the literature. For concept extraction, we apply entity detection with semantic hierarchical knowledge bases, Tri-gram, RAKE, and LDA. For concept activation, we compare a set of statistical, hierarchy-based, and graph-based methods. For selecting annotations, we compare top-k as well as kNN. In total, we define 43 different strategies, including novel combinations like using graph-based activation with kNN. We have evaluated the strategies using three different datasets of varying size from three scientific disciplines (economics, politics, and computer science) that contain 100,000 manually labeled documents in total. We obtain the best results on all three datasets with our novel combination of entity detection with graph-based activation (e.g., HITS and Degree) and kNN. For the economics and political science datasets, the best F-measures are .39 and .28, respectively. For the computer science dataset, a maximum F-measure of .33 is reached. These are by far the largest experiments on scholarly content annotation, where datasets typically comprise only up to a few hundred documents.
Gregor Große-Bölting, Chifumi Nishioka, and Ansgar Scherp. 2015. A Comparison of Different Strategies for Automated Semantic Document Annotation. In Proceedings of the 8th International Conference on Knowledge Capture (K-CAP 2015). ACM, New York, NY, USA, Article 8, 8 pages. DOI: http://dx.doi.org/10.1145/2815833.2815838
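A toy illustration of the last two pipeline stages under the simplest possible stand-ins: a statistical activation that scores candidate concepts by frequency, followed by top-k selection. The entity detection and the graph-based variants (HITS, Degree, kNN) from the paper are not reproduced here.

```python
# Statistical concept activation (frequency) plus top-k annotation selection.
from collections import Counter

def annotate_top_k(detected_concepts, k=5):
    """detected_concepts: list of concept IDs found in one document."""
    activation = Counter(detected_concepts)           # statistical activation
    return [c for c, _ in activation.most_common(k)]  # top-k selection

doc_concepts = ["economics", "trade", "trade", "tariff", "trade", "policy"]
print(annotate_top_k(doc_concepts, k=3))  # ['trade', 'economics', 'tariff']
```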
Can you see it? Annotating Image Regions based on Users' Gaze Information (Ansgar Scherp)
Presentation on eye-tracking-based annotation of image regions that I gave in Vienna on Oct 19, 2012. Download the original PowerPoint file to enjoy all animations. For the papers, please refer to: http://www.ansgarscherp.net/publications
Smart photo selection: interpret gaze as personal interest
1. Smart Photo Selection: Interpret Gaze as Personal Interest
Tina Walber (1), Ansgar Scherp (2,3), Steffen Staab (1)
1 Institute WeST, University of Koblenz, Germany
2 Kiel University, Germany
3 Leibniz Information Center for Economics, Kiel, Germany
2. Management of Digital Photos
● It's a mess!
● We take a lot of photos
● Manually selecting photos is cumbersome
● We would like to have photo selections for
– Sharing photos online
– Creating photo products like photo books
– Creating presentations
3. State of the Art: Automatic Creation of Photo Selections
● Content-based approaches
– Analysis of low-level features
● Context-based approaches
– Analysis of context information
● What about individual aesthetics, personal preferences, user interests, ...?
4. Interpret Gaze as Personal Interest
● Gaze delivers information on the user's interests
● Useful for creating individual photo selections?
● Principal approach
– Merely observe what users are doing anyway
– Do not ask them to perform additional tasks
5. [Image slide]
● Starting from $99
6. Research Questions
1. Is there a need for individual photo selections?
2. Does a gaze-based selection outperform selections based on content and context analysis when compared to those created manually?
3. Does the personal interest in a viewed photo set have an impact on the obtained selection results?
7. Experiment Setup
Step 1: Photo viewing. Task: "get an overview". Recording of the eye tracking data.
8. Photo Viewing
32 pages with 9 photos each
9. Experiment Setup
Step 1: Photo viewing. Task: "get an overview". Recording of the eye tracking data.
Step 2: Photo selection. Task: "select photos for your private photo collection". Creation of the ground truth Sm.
10. Manual Selection
11. Two Data Sets and Two User Groups
● Collection CA: 162 photos (Institute A); collection CB: 126 photos (Institute B)
● Taken during social events of the research institutes
● For each group, the own institute's photos form the home collection, the other institute's the foreign collection
● Experiment data set C = CA ∪ CB
12. Participants
● 33 participants (12 of them female)
● 21 associated with Institute A, 12 with Institute B
● Aged between 25 and 62 (Ø 33.5 ± 9.57)
● 20 graduate students, 4 postdocs, 9 other professions
13. Overview: Analysis and Evaluation
[Diagram: From collection C, a gaze-based selection Se, a content- and context-based selection Sb, and a combined selection Sb+e are computed. Each selection is compared against the manually created ground truth Sm by calculating the precision P.]
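The evaluation step in the diagram reduces to a set-overlap computation. A minimal sketch (the function name and example IDs are illustrative):

```python
# Precision of a proposed selection S against the manual ground truth Sm:
# the fraction of proposed photos that were also selected manually.
def precision(selected, ground_truth):
    selected, ground_truth = set(selected), set(ground_truth)
    return len(selected & ground_truth) / len(selected) if selected else 0.0

# e.g. 2 of 4 proposed photos were also chosen manually -> P = 0.5
print(precision({"p1", "p2", "p3", "p4"}, {"p2", "p4", "p9"}))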
14. Baseline Measures
Selection of photos based on the following measures, calculated for each photo:
1 concentrationTime: Photo was taken with other photos in a short period of time
2 sharpness: Sharpness score from related work
3 numberOfFaces: Number of faces
4 faceGaussian: Size and position of faces
5 personsPopularity: Popularity of the depicted persons
6 faceArea: Area in pixels covered by faces
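Measure 2 references a sharpness score from related work; a common stand-in, assumed here rather than taken from the paper, is the variance of the Laplacian:

```python
# Variance-of-Laplacian focus measure: higher values indicate sharper
# images. Whether this matches the paper's exact score is an assumption.
import cv2

def sharpness(image_path):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()
```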
15. Eye Tracking Data
● Fixations and saccades
● Analysed gaze data with eye tracking measures
● Viewing duration per page: M = 12.6 s
● Number of fixations per photo: M = 3.25
16.-19. Eye Tracking Measures
7 fixated: Was the photo fixated?
8 fixationCount: Number of fixations
9 fixationDuration: Sum of the durations of all fixations
10 firstFixationDuration: Duration of the first fixation
11 lastFixationDuration: Duration of the last fixation
12 avgFixationDuration: Average fixation duration
13 maxVisitDuration: Maximum visit length
14 meanVisitDuration: Average visit length
15 visitCount: Number of visits
16 saccLength: Mean length of the saccades
17 pupilMax: Maximum pupil diameter
18 pupilMaxChange: Maximum pupil diameter change
19 pupilAvg: Average pupil diameter
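A sketch of how the fixation-based measures (rows 7-12) could be derived from a per-photo fixation log; the input format is an assumption for illustration:

```python
# Derive fixation-based measures from a list of fixation durations
# (seconds) recorded on one photo.
def fixation_measures(fixations):
    if not fixations:
        return {"fixated": False}
    return {
        "fixated": True,                                       # measure 7
        "fixationCount": len(fixations),                       # measure 8
        "fixationDuration": sum(fixations),                    # measure 9
        "firstFixationDuration": fixations[0],                 # measure 10
        "lastFixationDuration": fixations[-1],                 # measure 11
        "avgFixationDuration": sum(fixations) / len(fixations) # measure 12
    }

print(fixation_measures([0.21, 0.35, 0.18]))
```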
20. Combination of Measures
● Using a model learned from logistic regression
● Assigns each image a probability of being selected
● 30 random splits into training and test data
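A minimal sketch of this step with scikit-learn on synthetic stand-in data. The liblinear solver and the 85% training share follow the editor's notes at the end of this transcript; everything else (feature values, labels, top-9 proposal) is illustrative, not the authors' implementation.

```python
# Combine per-photo measures with logistic regression and rank photos by
# predicted selection probability, averaged over 30 random splits.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((288, 19))        # 19 measures per photo (synthetic data)
y = rng.integers(0, 2, 288)      # 1 = photo was selected manually

precisions = []
for split in range(30):          # 30 random splits, as on the slide
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=0.85, random_state=split)
    model = LogisticRegression(solver="liblinear").fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]  # P(photo is selected)
    top = np.argsort(proba)[::-1][:9]        # propose the top-9 photos
    precisions.append(y_te[top].mean())
print(f"mean precision over 30 splits: {np.mean(precisions):.3f}")
```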
21. Research Question 1: Is there a need for individual photo selections?
● Manually created photo selections are diverse
[Chart: selection frequency (0-30) for each of the photos in data set C, sorted from the photo with the highest number of selections down to photos with no selections]
22. Research Question 2: Evaluation of the Photo Selections
[Bar chart: precision P of the baseline selection Sb (P = 0.365) versus the gaze-based selections Sb+e and Se (P = 0.428 and P = 0.426), both significantly (*) better than the baseline; a random selection is shown for comparison.]
● Improvement of 17% over the baseline
23. Research Question 3: Impact of Personal Interest?
[Bar chart: precision P of Sb+e for the foreign collection (P = 0.404) versus the home collection (P = 0.446); the difference is significant (*).]
24. Conclusion
● Photo selection behavior is individual
● Gaze helps capture personal preferences
● Results are better for photos with personal interest
● Might work even better for real personal photos
● Potential application in photo book authoring
Thank you for your attention!
Editor's Notes
LIBLINEAR
85% as training data
75% of photos were selected five times or less.
Only two photos were selected by half of the subjects.