(Semi-)Automatic Analysis of Online Contents
Steffen Staab
@ststaab
Institute for Web Science and Technologies · University of Koblenz-Landau, Germany &
Web and Internet Science Group · ECS · University of Southampton, UK
(Semi-)Automatic analysis of online content 2/68Steffen Staab
Content analysis
Is it difficult?
„Nach dem Auspacken der LPS-105 präsentiert sich dem
Betrachter ein stabiles Laufwerk, das genauso geringe
Außenmaße besitzt wie die Maxtor.“
Unpacking the LPS 105 reveals a sturdy disk drive which is of
the same small size as the Maxtor.
„Content“ analysis: What is in online content?
....
Entailment
Summaries
Arguments
Discourse
Opinions
Sentiments
Facts – who, what, when?
Syntax
Semantics
Pragmatics
Knowledge
Purpose
Technical objectives
• Search
• Data & knowledge bases:
• facts
• arguments
• ...
Applications
• Google Search
• Watson
• “Watson 2”
Social science and humanities objectives
• Form hypotheses
• Find indications
• Recognize trends
• ...
Objective oriented content analysis
....
Entailment
Summaries
Arguments
Discourse
Opinions
Sentiments
Facts – who, what, when?
Syntax
Semantics
Pragmatics
Knowledge
SEMANTIC WEB ANNOTATION
CREAM – Creating Metadata (Handschuh et al. 2002, 2003)
• Document Viewer / Editor
• Ontology Guidance & Fact Browser
• Concepts
• Instances of Concepts
• Attribute Instances = instance of a property to a datatype instance
• Relationship Instances = instance of a property to a class instance
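The distinction between attribute instances and relationship instances can be sketched as a small data structure. This is only an illustration, not CREAM's actual data model: the class names `Instance` and `PropertyInstance`, the `ex:` identifiers, and the example annotations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Instance:
    """An instance of an ontology concept (class). Hypothetical helper type."""
    identifier: str
    concept: str


@dataclass(frozen=True)
class PropertyInstance:
    """An instance of a property, linking a subject to a value."""
    subject: Instance
    prop: str
    value: Union[Instance, str, int, float]

    def kind(self) -> str:
        # Relationship instance: the value is itself a class instance.
        # Attribute instance: the value is a datatype literal.
        return "relationship" if isinstance(self.value, Instance) else "attribute"


# Made-up annotations, for illustration only.
person = Instance("ex:steffen_staab", "ex:Person")
university = Instance("ex:uni_koblenz_landau", "ex:University")

annotations = [
    PropertyInstance(person, "ex:name", "Steffen Staab"),  # datatype value
    PropertyInstance(person, "ex:worksAt", university),    # class instance value
]
kinds = [a.kind() for a in annotations]
```

Here the first annotation is an attribute instance and the second a relationship instance, because only the second points to another instance of a concept.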
CREAM – Creating Metadata (Handschuh et al. 2002, 2003)
Open world: target ontologies could now be:
• Schema.org (3 trillion facts collected by Google; 10,000 of concepts)
• Wikidata: 1,148,230 concepts (2 weeks ago)
Annotating facts with CREAM
+++
• Open (wrt ontologies)
• Flexible
• Semi-automatic: S-CREAM
---
• Effort for annotation (minimize # of clicks)
• Thick client
• Technology Readiness Level ~5
• A lot of effort to prepare the tool for a task
• Limited accuracy
Technology Readiness Levels
TRL 1: Observation and description of the basic principle (8-15 years to market maturity)
TRL 2: Description of the application of a technology
TRL 3: Proof of functionality of a technology (5-13 years to market maturity)
TRL 4: Experimental setup in the laboratory
TRL 5: Experimental setup in the operational environment
TRL 6: Prototype in the operational environment
TRL 7: Prototype in operation (1-5 years)
TRL 8: Qualified system with proof of functionality in the operational domain
TRL 9: Qualified system with proof of successful operation
CLUSTERING OF TEXT DATA
http://topicmodels.west.uni-koblenz.de
With Christoph Kling
Text Mining Documents
Documents are
• PDFs, emails, tweets, Flickr photo tags, Word companions, …
Documents consist of
• bag of words
• metadata
- author(s)
- timestamp
- geolocation
- publisher
- booktitle
- device
- ...
Example topics and their words:
• Chinese food: dimsum, duck, eggs, ...
• Vegan food: vegan, tofu, ...
• Breakfast: eggs, ham, ...
Objective: Cluster, categorize, & explain
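A document in this sense, a bag of words plus metadata, could be represented roughly as follows. This is a minimal sketch: the `Document` class, the `make_document` helper, whitespace tokenisation, and the example values are assumptions made for illustration and do not come from the talk.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """A document as a bag of words plus arbitrary metadata."""
    words: Counter                      # word -> count; word order is discarded
    metadata: Dict[str, Any] = field(default_factory=dict)


def make_document(text: str, **metadata: Any) -> Document:
    # Naive whitespace tokenisation; a real pipeline would do more
    # (punctuation handling, stop-word removal, stemming, ...).
    tokens = text.lower().split()
    return Document(words=Counter(tokens), metadata=dict(metadata))


# Made-up example: a short message with author and hour-of-day metadata.
doc = make_document("eggs ham eggs toast", author="alice", hour=8)
```

The bag-of-words view keeps only the counts (here `eggs` occurs twice), while the metadata travels alongside and can later be related to topics.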
Latent Dirichlet Allocation (LDA)
Document-topic distributions
Topic-word distributions
K topics
M documents
Each document m of the M documents has length N_m
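The ingredients listed above fit the standard generative story of LDA, which can be sketched as a small simulation. This is the textbook formulation rather than code from the talk; the function name `generate_corpus`, the vocabulary size `V`, and the symmetric Dirichlet hyperparameters `alpha` and `beta` are assumptions introduced for the sketch.

```python
import numpy as np


def generate_corpus(K, M, V, doc_lengths, alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process.

    K: number of topics, M: number of documents, V: vocabulary size,
    doc_lengths[m]: length N_m of document m.
    """
    rng = np.random.default_rng(seed)
    # Topic-word distributions: one distribution over V words per topic.
    phi = rng.dirichlet(np.full(V, beta), size=K)     # shape (K, V)
    # Document-topic distributions: one distribution over K topics per doc.
    theta = rng.dirichlet(np.full(K, alpha), size=M)  # shape (M, K)
    docs = []
    for m in range(M):
        # Draw a topic for every word position, then a word from that topic.
        topics = rng.choice(K, size=doc_lengths[m], p=theta[m])
        words = [int(rng.choice(V, p=phi[z])) for z in topics]
        docs.append(words)
    return phi, theta, docs


phi, theta, docs = generate_corpus(K=3, M=4, V=10, doc_lengths=[5, 7, 2, 9])
```

Inference runs this story in reverse: given only the observed words, it estimates the document-topic and topic-word distributions.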
Use Metadata to Help Topic Prediction
• Improve topic detection
→ Morning times may help to improve the breakfast topic
• Describe dependencies: metadata ↔ topics
→ Breakfast topic happens during morning hours
• Usage
- Autocompletion
→ From words to words
- Prediction of search queries
→ From metadata to words
→ From words to metadata
Example topics and their words:
• Chinese food: dimsum, duck, eggs, ...
• Vegan food: vegan, tofu, ...
• Breakfast: eggs, ham, ...
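The two prediction directions, from metadata to words and from words to metadata, can be illustrated with a toy calculation over the example topics. All probabilities below are made-up numbers, and the uniform prior over metadata values is an assumption; the sketch only shows how topics can act as the link between metadata and words, not the model used in the talk.

```python
import numpy as np

vocab = ["dimsum", "duck", "eggs", "vegan", "tofu", "ham"]

# p(word | topic); rows: Chinese food, Vegan food, Breakfast (made-up values).
p_word_given_topic = np.array([
    [0.4, 0.4, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.0, 0.5],
])

# p(topic | time of day), also made up.
p_topic_given_meta = {
    "morning": np.array([0.1, 0.1, 0.8]),
    "evening": np.array([0.6, 0.3, 0.1]),
}


def words_from_metadata(meta):
    """p(word | metadata) = sum_k p(word | topic k) * p(topic k | metadata)."""
    return p_topic_given_meta[meta] @ p_word_given_topic


def metadata_from_word(word):
    """p(metadata | word) via Bayes' rule, assuming a uniform metadata prior."""
    j = vocab.index(word)
    scores = {m: float(words_from_metadata(m)[j]) for m in p_topic_given_meta}
    total = sum(scores.values())
    return {m: s / total for m, s in scores.items()}


morning_words = words_from_metadata("morning")
best_morning_word = vocab[int(np.argmax(morning_words))]
ham_posterior = metadata_from_word("ham")
```

With these numbers, the morning context favours breakfast words, and observing "ham" points back to the morning.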
Dataset
• Linux Kernel Mailing List: 3,400,000 emails with timestamps and mailing list ID
Structures of Metadata Spaces
• Nominal
• Ordinal
• Cyclic
• Spherical
• Networked
(Figure labels: Kern, Desk, Mail)
Spatial model is not used in this application (but might be)!
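To see why cyclic metadata needs its own treatment, consider the time of day: 23:00 and 01:00 are two hours apart, not twenty-two. The following sketch shows one common way to respect that structure. The helper names are hypothetical, and this is not necessarily how the model in the talk handles cyclic metadata.

```python
import math


def hour_to_circle(hour: float):
    """Map an hour of day onto the unit circle so that 24:00 meets 00:00."""
    angle = 2 * math.pi * (hour % 24) / 24
    return (math.cos(angle), math.sin(angle))


def cyclic_distance(h1: float, h2: float) -> float:
    """Shortest distance in hours between two times on a 24-hour clock."""
    d = abs(h1 - h2) % 24
    return min(d, 24 - d)


late_vs_early = cyclic_distance(23, 1)   # wraps around midnight
opposite = cyclic_distance(6, 18)        # farthest apart on the clock
```

An ordinal treatment would place 23:00 and 01:00 at opposite ends of the scale, whereas the cyclic distance keeps them close.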
Topics
• Professional topics:
• Hobbyist topics:
• Metadata weighting:
126,408 Online Fetish Users: First 8 Topics
Sociodemographics of Fetish dataset
Influence of Sociodemographics on Favorite Fetishes
Other applications of (extended) LDA
• Sentiment and topics (Naveed et al., ICWSM 2013)
• Topics and spatial knowledge (Kling et al., WSDM 2014)
• Modelling of power (Kling et al., ICWSM 2015)
BELIEVABILITY AND TRUST IN ONLINE NEWS
With Christoph Kling, Jérôme Kunegis
Collaboration with Jutta Milde, Karin Stengel, Ines Vogel
Ongoing work in KOMEPOL
Targets
Example article at Spiegel.de
Requirements
Scalability:
• # Documents
• # Annotators
• # Annotations per annotator
Tool:
• Administration
• Crowdsourcing
• Semi-automatic
Separating article management and coding
Text-Upload
Managing projects
Article
Defining a Coding-Job
Highlighting using Keywords and Clustering
Article coding
Preparing a code book (1)
Preparing a code book (2)
CONCLUSION
Lessons Learned
New targets
• Require new modeling of gaps
Challenges
• Technology Readiness Levels
• Many tools – no “good” tool (“done is better than perfect”?)
• Reproducibility
ToDos
• An “Eclipse/Protégé” of annotation: modular, extensible, open
• Optimizing the processes
No tool to rule them all
....
Entailment
Summaries
Arguments
Discourse
Opinions
Sentiments
Facts – who, when, where, what?
Syntax
Semantics
Pragmatics
Knowledge
THANK YOU FOR YOUR ATTENTION!
Bibliography
C. C. Kling, J. Kunegis, S. Sizov, and S. Staab. “Detecting non-gaussian geographical topics in tagged photo collections.” In: Seventh ACM International Conference on Web Search and Data Mining, WSDM 2014, New York, NY, USA, February 24-28, 2014.
I. C. Vogel, J. Milde, K. Stengel, S. Staab, C. C. Kling, and J. Kunegis. “Glaubwürdigkeit und Vertrauen von Online-News.” In: Datenschutz und Datensicherheit 39.5 (2015), pp. 312–316.
S. Handschuh and S. Staab. “CREAM – CREAting Metadata for the Semantic Web.” In: Computer Networks 42(5), Elsevier, 2003, pp. 579-598.
S. Handschuh, S. Staab, and F. Ciravegna. “S-CREAM – Semi-automatic CREAtion of Metadata.” In: Proc. of the European Conference on Knowledge Acquisition and Management, EKAW-2002, Madrid, Spain, October 1-4, 2002. LNCS/LNAI 2473, Springer, 2002, pp. 358-372.
C. Kling. “Probabilistic Models for Context in Social Media. Novel Approaches and Inference Schemes.” Submitted as PhD thesis, Institute for Web Science and Technologies, University of Koblenz-Landau, to be defended Nov/Dec 2016.
N. Naveed, T. Gottron, and S. Staab. “Feature Sentiment Diversification of User Generated Reviews: The FREuD Approach.” In: ICWSM 2013.
C. C. Kling, J. Kunegis, H. Hartmann, M. Strohmaier, and S. Staab. “Voting Behaviour and Power in Online Democracy: A Study of LiquidFeedback in Germany's Pirate Party.” In: ICWSM 2015, pp. 208-217.
URLs
http://topicmodels.west.uni-koblenz.de
http://komepol.west.uni-koblenz.de
http://www.slideshare.net/steffenstaab

More Related Content

What's hot

Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGThamme Gowda
 
Materials Informatics and Python
Materials Informatics and PythonMaterials Informatics and Python
Materials Informatics and PythonShintaro Fukushima
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with KerasQuantUniversity
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...PyData
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Anubhav Jain
 
What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.Andy Petrella
 
Materials Informatics Overview
Materials Informatics OverviewMaterials Informatics Overview
Materials Informatics OverviewTony Fast
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionRrubaa Panchendrarajan
 
Why is Bioinformatics a Good Fit for Spark?
Why is Bioinformatics a Good Fit for Spark?Why is Bioinformatics a Good Fit for Spark?
Why is Bioinformatics a Good Fit for Spark?Timothy Danford
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...Anubhav Jain
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Mariano Rodriguez-Muro
 
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...inside-BigData.com
 
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Databricks
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMfnothaft
 
Open-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitOpen-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitGreg Landrum
 
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Anubhav Jain
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkDESMOND YUEN
 
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Anubhav Jain
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMfnothaft
 

What's hot (20)

Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRGData Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
 
Materials Informatics and Python
Materials Informatics and PythonMaterials Informatics and Python
Materials Informatics and Python
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with Keras
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...Evaluating Machine Learning Algorithms for Materials Science using the Matben...
Evaluating Machine Learning Algorithms for Materials Science using the Matben...
 
What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.What is a distributed data science pipeline. how with apache spark and friends.
What is a distributed data science pipeline. how with apache spark and friends.
 
Materials Informatics Overview
Materials Informatics OverviewMaterials Informatics Overview
Materials Informatics Overview
 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity Recognition
 
Why is Bioinformatics a Good Fit for Spark?
Why is Bioinformatics a Good Fit for Spark?Why is Bioinformatics a Good Fit for Spark?
Why is Bioinformatics a Good Fit for Spark?
 
The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...The Status of ML Algorithms for Structure-property Relationships Using Matb...
The Status of ML Algorithms for Structure-property Relationships Using Matb...
 
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
Stanford'12 Intro to Ontology Based Data Access for RDBMS through query rewri...
 
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...
Machine & Deep Learning: Practical Deployments and Best Practices for the Nex...
 
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthca...
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
 
Open-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKitOpen-source from/in the enterprise: the RDKit
Open-source from/in the enterprise: the RDKit
 
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...Progress Towards Leveraging Natural Language Processing for Collecting Experi...
Progress Towards Leveraging Natural Language Processing for Collecting Experi...
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for Spark
 
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu...
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAM
 

Similar to (Semi-)Automatic analysis of online contents

From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...Alex Pinto
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...Ilkay Altintas, Ph.D.
 
Building a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectBuilding a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectStuart Chalk
 
Open and Automated Machine Learning
Open and Automated Machine LearningOpen and Automated Machine Learning
Open and Automated Machine LearningJoaquin Vanschoren
 
No specimen (software) left behind
No specimen (software) left behindNo specimen (software) left behind
No specimen (software) left behindVince Smith
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationRaffael Marty
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Josef Hardi
 
Who cares about Software Process Modelling? A First Investigation about the P...
Who cares about Software Process Modelling? A First Investigation about the P...Who cares about Software Process Modelling? A First Investigation about the P...
Who cares about Software Process Modelling? A First Investigation about the P...Daniel Mendez
 
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...CODE BLUE
 
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...Luigi Vanfretti
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software DatasetsTao Xie
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerFrancesco Osborne
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging EnvironmentsPaul Groth
 
How to conduct systematic literature review
How to conduct systematic literature reviewHow to conduct systematic literature review
How to conduct systematic literature reviewKashif Hussain
 
Systematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping StudiesSystematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping Studiesalessio_ferrari
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...Raffaele Montella
 
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...Discovering emerging effects in Learning Networks with simulations Hendrik Dr...
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...Hendrik Drachsler
 
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...Zohaib Riaz
 

Similar to (Semi-)Automatic analysis of online contents (20)

OpenML data@Sheffield
OpenML data@SheffieldOpenML data@Sheffield
OpenML data@Sheffield
 
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
From Threat Intelligence to Defense Cleverness: A Data Science Approach (#tid...
 
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
A Maturing Role of Workflows in the Presence of Heterogenous Computing Archit...
 
Building a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectBuilding a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP Project
 
Open and Automated Machine Learning
Open and Automated Machine LearningOpen and Automated Machine Learning
Open and Automated Machine Learning
 
No specimen (software) left behind
No specimen (software) left behindNo specimen (software) left behind
No specimen (software) left behind
 
Delivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and VisualizationDelivering Security Insights with Data Analytics and Visualization
Delivering Security Insights with Data Analytics and Visualization
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!
 
Who cares about Software Process Modelling? A First Investigation about the P...
Who cares about Software Process Modelling? A First Investigation about the P...Who cares about Software Process Modelling? A First Investigation about the P...
Who cares about Software Process Modelling? A First Investigation about the P...
 
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...
Making & Breaking Machine Learning Anomaly Detectors in Real Life by Clarence...
 
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
Modeling and Simulation of Electrical Power Systems using OpenIPSL.org and Gr...
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
 
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic MinerAutomatic Classification of Springer Nature Proceedings with Smart Topic Miner
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 
How to conduct systematic literature review
How to conduct systematic literature reviewHow to conduct systematic literature review
How to conduct systematic literature review
 
Systematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping StudiesSystematic Literature Reviews and Systematic Mapping Studies
Systematic Literature Reviews and Systematic Mapping Studies
 
Data mining weka
Data mining wekaData mining weka
Data mining weka
 
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
How to expand the Galaxy from genes to Earth in six simple steps (and live sm...
 
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...Discovering emerging effects in Learning Networks with simulations Hendrik Dr...
Discovering emerging effects in Learning Networks with simulations Hendrik Dr...
 
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...
Conference talk: Understanding Vulnerabilities of Location Privacy Mechanisms...
 

More from Steffen Staab

Knowledge graphs for knowing more and knowing for sure
Knowledge graphs for knowing more and knowing for sureKnowledge graphs for knowing more and knowing for sure
Knowledge graphs for knowing more and knowing for sureSteffen Staab
 
Symbolic Background Knowledge for Machine Learning
Symbolic Background Knowledge for Machine LearningSymbolic Background Knowledge for Machine Learning
Symbolic Background Knowledge for Machine LearningSteffen Staab
 
Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...
Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...
Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...Steffen Staab
 
(Semi-)Automatic analysis of online contents

  • 1. (Semi-)Automatic Analysis of Online Contents. Steffen Staab (@ststaab). Institute for Web Science and Technologies · University of Koblenz-Landau, Germany & Web and Internet Science Group · ECS · University of Southampton, UK
  • 2. Content analysis
  • 3. Is it difficult? „Nach dem Auspacken der LPS-105 präsentiert sich dem Betrachter ein stabiles Laufwerk, das genauso geringe Außenmaße besitzt wie die Maxtor.“ In English: "Unpacking the LPS 105 reveals a sturdy disk drive which is of the same small size as the Maxtor."
  • 4. "Content" analysis: What is in online content? ..., Entailment, Summaries, Arguments, Discourse, Opinions, Sentiments, Facts (who, what, when?). Syntax, Semantics, Pragmatics, Knowledge.
  • 5. Purpose. Technical objectives: search; data & knowledge bases (facts, arguments, ...). Applications: Google Search, Watson, "Watson 2". Social science and humanities objectives: form hypotheses, find indications, recognize trends, ...
  • 6. Objective-oriented content analysis: ..., Entailment, Summaries, Arguments, Discourse, Opinions, Sentiments, Facts (who, what, when?). Syntax, Semantics, Pragmatics, Knowledge.
  • 7. SEMANTIC WEB ANNOTATION
  • 8. CREAM: Creating Metadata (Handschuh et al. 2002, 2003). Document Viewer / Editor; Ontology Guidance & Fact Browser; Concepts; Instances of Concepts; Attribute Instances = instance of a property to a datatype instance; Relationship Instances = instance of a property to a class instance.
  • 9. CREAM: Creating Metadata (Handschuh et al. 2002, 2003). Open world: target ontologies now could be Schema.org (3 trillion facts collected by Google; 10,000 of concepts) or Wikidata (1,148,230 concepts, 2 weeks ago).
  • 10. Annotating facts with CREAM. +++ Open (wrt. ontologies); flexible; semi-automatic: S-CREAM. --- Effort for annotation (minimize # of clicks); thick client; Technology Readiness Level ~5; a lot of effort to prepare the tool for a task; limited accuracy.
  • 11. Technology Readiness Levels. TRL 1: Observation and description of the basic principle (8-15 years to market readiness). TRL 2: Description of the application of a technology. TRL 3: Proof of functionality of a technology (5-13 years to market readiness). TRL 4: Experimental setup in the laboratory. TRL 5: Experimental setup in the operational environment. TRL 6: Prototype in the operational environment. TRL 7: Prototype in operation (1-5 years). TRL 8: Qualified system with proof of functionality in the operational domain. TRL 9: Qualified system with proof of successful operation.
  • 12. CLUSTERING OF TEXT DATA. http://topicmodels.west.uni-koblenz.de. With Christoph Kling.
  • 13. Text Mining Documents. Documents are: PDFs, emails, tweets, Flickr photo tags, Word companions, ... Documents consist of: bag of words; metadata (author(s), timestamp, geolocation, publisher, booktitle, device, ...). Example topics: Chinese food (dimsum, duck, eggs, ...); Vegan food (vegan, tofu, ...); Breakfast (eggs, ham, ...). Objective: cluster, categorize, & explain.
  • 14. Latent Dirichlet Allocation (LDA)
  • 15. Latent Dirichlet Allocation (LDA): document-topic distributions; topic-word distributions; K topics; M documents; each doc m from M has length N_m.
  • 16. Use Metadata to Help Topic Prediction. Improve topic detection → morning times may help to improve the breakfast topic. Describe dependencies: metadata ↔ topics → breakfast topic happens during morning hours. Example topics: Chinese food (dimsum, duck, eggs, ...); Vegan food (vegan, tofu, ...); Breakfast (eggs, ham, ...).
  • 17. Use Metadata to Help Topic Prediction. Improve topic detection → morning times may help to improve the breakfast topic. Describe dependencies: metadata ↔ topics → breakfast topic happens during morning hours. Usage: autocompletion → from words to words; prediction of search queries → from metadata to words, from words to metadata. Example topics: Chinese food (dimsum, duck, eggs, ...); Vegan food (vegan, tofu, ...); Breakfast (eggs, ham, ...).
  • 18. Dataset: Linux Kernel mailing list, 3,400,000 emails with timestamps and mailing list ID.
  • 19. Structures of Metadata Spaces: nominal, ordinal, cyclic, spherical, networked. Kern, Desk, Mail. Spatial model is not used in this application (but might be)!
  • 20. Topics
  • 21. Topics
  • 22. Topics: professional topics; hobbyist topics.
  • 23. Topics: metadata weighting.
  • 24. 126,408 Online Fetish Users: First 8 Topics
  • 25. Sociodemographics of Fetish dataset
  • 26. Influence of Sociodemographics on Favorite Fetishes
  • 27. Other applications of (extended) LDA: Sentiment and Topics (Naveed et al., ICWSM 2013); Topics and spatial knowledge (Kling et al., WSDM 2014); Modelling of power (Kling et al., ICWSM 2015).
  • 28. BELIEVABILITY AND TRUST IN ONLINE NEWS. With Christoph Kling, Jérôme Kunegis. Collaboration with Jutta Milde, Karin Stengel, Ines Vogel. Ongoing work in KOMEPOL.
  • 29. Targets
  • 30. Example article at Spiegel.de
  • 31. Requirements. Scalability: # documents, # annotators, # annotations per annotator. Tool: administration, crowdsourcing, semi-automatic.
  • 32. Separating article management and coding
  • 33. Text-Upload
  • 34. Managing projects
  • 35. Article
  • 36. Defining a Coding-Job
  • 37. Highlighting using Keywords and Clustering
  • 38. Article coding
  • 39. Preparing a code book (1)
  • 40. Preparing a code book (2)
  • 41. CONCLUSION
  • 42. Lessons Learned. New targets: require new modeling of gaps. Challenges: Technology Readiness Levels; many tools, no "good" tool ("done is better than perfect"?); reproducibility. ToDos: Eclipse/Protege of annotation (modular, extensible, open); optimizing the processes.
  • 43. No tool to rule them all: ..., Entailment, Summaries, Arguments, Discourse, Opinions, Sentiments, Facts (who, when, where, what?). Syntax, Semantics, Pragmatics, Knowledge.
  • 44. THANK YOU FOR YOUR ATTENTION!
  • 45. Bibliography
      C. C. Kling, J. Kunegis, S. Sizov, and S. Staab. "Detecting non-gaussian geographical topics in tagged photo collections." In: Seventh ACM International Conference on Web Search and Data Mining, WSDM 2014, New York, NY, USA, February 24-28, 2014.
      I. C. Vogel, J. Milde, K. Stengel, S. Staab, C. C. Kling, and J. Kunegis. "Glaubwürdigkeit und Vertrauen von Online-News." In: Datenschutz und Datensicherheit 39.5 (2015), pp. 312–316.
      S. Handschuh, S. Staab. "CREAM – CREAting Metadata for the Semantic Web." Computer Networks 42(5): 579-598, Elsevier, 2003.
      S. Handschuh, S. Staab, F. Ciravegna. "S-CREAM – Semi-automatic CREAtion of Metadata." In: Proc. of the European Conference on Knowledge Acquisition and Management, EKAW-2002, Madrid, Spain, October 1-4, 2002. LNCS/LNAI 2473, Springer, 2002, pp. 358-372.
      C. Kling. "Probabilistic Models for Context in Social Media. Novel Approaches and Inference Schemes." Submitted as PhD thesis, Institute for Web Science and Technologies, University of Koblenz-Landau, to be defended Nov/Dec 2016.
      N. Naveed, T. Gottron, S. Staab. "Feature Sentiment Diversification of User Generated Reviews: The FREuD Approach." ICWSM 2013.
      C. C. Kling, J. Kunegis, H. Hartmann, M. Strohmaier, S. Staab. "Voting Behaviour and Power in Online Democracy: A Study of LiquidFeedback in Germany's Pirate Party." ICWSM 2015, pp. 208-217.
  • 46. URLs: http://topicmodels.west.uni-koblenz.de; http://komepol.west.uni-koblenz.de; http://www.slideshare.net/steffenstaab
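
The LDA slides (14 and 15) name only the ingredients: K topics, M documents, document m of length N_m, document-topic distributions and topic-word distributions. A minimal sketch of the standard LDA generative process may make that setup more concrete. It is not taken from the talk: the function names, the symmetric Dirichlet parameters `alpha` and `beta`, and the toy vocabulary (borrowed from the food example on slide 13) are assumptions for illustration, and only the Python standard library is used.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(vocab, num_topics, doc_lengths, alpha=0.5, beta=0.5, seed=0):
    """Generate toy documents with the LDA generative process.

    num_topics is K, len(doc_lengths) is M, doc_lengths[m] is N_m.
    alpha and beta are assumed symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    # Topic-word distributions: one distribution over the vocabulary per topic.
    topic_word = [sample_dirichlet([beta] * len(vocab), rng)
                  for _ in range(num_topics)]
    doc_topic = []
    docs = []
    for n_m in doc_lengths:
        # Document-topic distribution for this document.
        theta = sample_dirichlet([alpha] * num_topics, rng)
        doc_topic.append(theta)
        words = []
        for _ in range(n_m):
            # Pick a topic for this word position, then a word from that topic.
            z = rng.choices(range(num_topics), weights=theta)[0]
            words.append(rng.choices(vocab, weights=topic_word[z])[0])
        docs.append(words)
    return topic_word, doc_topic, docs


if __name__ == "__main__":
    vocab = ["dimsum", "duck", "eggs", "vegan", "tofu", "ham"]
    _, doc_topic, docs = generate_corpus(vocab, num_topics=3, doc_lengths=[6, 4, 5])
    for theta, words in zip(doc_topic, docs):
        print([round(p, 2) for p in theta], words)
```

Topic modelling runs this process in reverse: given only the bags of words, it estimates the two families of distributions. The metadata extensions described on slides 16 to 19 are not modelled in this sketch.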