1. Presented by
Milind B. Gaikwad
(2016MNS006)
“Events Analysis Based on
Internet Information Retrieval and
Process Mining Tools”
SGGS IE & T, Nanded.
2. Contents
Introduction
Ontology Structure
Main Event and Key Attributes
Process Mining
Interaction with System
Experiment Result
Conclusion
Future Work
References
3. Introduction
• Event Analysis
• Example
Event: GST (Goods and Services Tax)
Related entities shown in the diagram: IT industries, GDP, services, firms
4. Introduction (continued)
• Minimal Characteristics of Event for Event Analysis
1) Participant of Event
2) Geographical Location
3) Relation between Events
4) Internal Relation of Event
• Trace:
A sequence of events united by a common use case or, in this
work, by a news message.
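
The trace definition above can be sketched in a few lines of Python. This is only an illustration: the case ids, dates and activity names are hypothetical, and `build_traces` is a helper name introduced here, not part of the presented system.

```python
from collections import defaultdict
from datetime import date

# Hypothetical extracted events: (case_id, event_date, activity).
# All values are invented for illustration.
events = [
    ("policy", date(2017, 7, 1), "Policy enacted"),
    ("spill", date(2017, 3, 2), "Oil spill reported"),
    ("policy", date(2017, 3, 29), "Policy announced"),
    ("spill", date(2017, 3, 5), "Clean-up started"),
]

def build_traces(events):
    """Group events by case id and order each group by date."""
    grouped = defaultdict(list)
    for case_id, when, activity in events:
        grouped[case_id].append((when, activity))
    return {
        case_id: [activity for _, activity in sorted(items)]
        for case_id, items in grouped.items()
    }

traces = build_traces(events)
```

Each value in `traces` is one trace: the activities of a single case in chronological order.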
5. Ontology Structure
• Page Structure Ontology For Information Extraction
Two Level Ontology
1) Website Structure Description
2) Block Description
Site level:
Website
Home page, News page (each IsAPageOf Website)
Page level:
Header (IsAHeaderOf page), Body (IsABodyOf page), Footer (IsAFooterOf page)
Title, Ad div, Info div (each IsAPartOf its enclosing block)
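
A block description of this kind could drive extraction roughly as follows. This is a minimal sketch using only Python's standard `html.parser`. The sample page, the `class` names (`title`, `ad`, `info`) and the `BlockExtractor` class are assumptions made for illustration, not taken from the presented system.

```python
from html.parser import HTMLParser

class BlockExtractor(HTMLParser):
    """Collect text found inside <div> blocks, keyed by the div's class."""

    def __init__(self):
        super().__init__()
        self.stack = []   # class names of the currently open <div> elements
        self.blocks = {}  # class name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.stack.append(dict(attrs).get("class") or "")

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            # Attribute the text to the innermost open <div>.
            self.blocks.setdefault(self.stack[-1], []).append(text)

# Hypothetical news page following the Header / Body / Footer layout.
page = """
<div class="header"><h1>Daily News</h1></div>
<div class="body">
  <div class="title">Oil spill reported</div>
  <div class="ad">Buy now!</div>
  <div class="info">A spill was reported near the coast.</div>
</div>
<div class="footer">Contact us</div>
"""

parser = BlockExtractor()
parser.feed(page)
parser.close()

# Keep only the blocks the ontology marks as informative; skip the ad block.
wanted = ("title", "info")
article = {name: " ".join(parser.blocks.get(name, [])) for name in wanted}
```

Because extraction is keyed on the page structure rather than on the text itself, advertising and navigation blocks can be dropped before any event analysis takes place.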
6. Ontology Structure (continued)
• Mechanism of Ontology
• Advantage
It is a structure-centered information retrieval approach, so it helps
to identify content duplication and filter it out afterwards.
It makes use of the hierarchical structure and interconnections of
information divisions.
• RDF (Resource Description Framework)
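
RDF describes resources as subject-predicate-object statements. The page structure ontology can be approximated with plain Python tuples, as sketched below. The relation names come from the ontology diagram. The choice of `NewsPage` as the parent of the header, body and footer, and of `Body` as the parent of the three inner blocks, is an assumption. A real system would more likely use a dedicated RDF library.

```python
# Each triple is (subject, predicate, object), mirroring RDF statements.
triples = [
    ("HomePage", "IsAPageOf", "Website"),
    ("NewsPage", "IsAPageOf", "Website"),
    ("Header", "IsAHeaderOf", "NewsPage"),
    ("Body", "IsABodyOf", "NewsPage"),
    ("Footer", "IsAFooterOf", "NewsPage"),
    ("Title", "IsAPartOf", "Body"),    # assumed parent block
    ("AdDiv", "IsAPartOf", "Body"),    # assumed parent block
    ("InfoDiv", "IsAPartOf", "Body"),  # assumed parent block
]

def subjects(predicate, obj):
    """Return every subject linked to `obj` by `predicate`."""
    return [s for s, p, o in triples if p == predicate and o == obj]

def ancestors(node):
    """Follow links upward from `node` until the root is reached."""
    parents = {s: o for s, _, o in triples}
    chain = []
    while node in parents:
        node = parents[node]
        chain.append(node)
    return chain
```

For example, `subjects("IsAPageOf", "Website")` lists the pages of the site, and `ancestors("InfoDiv")` walks from a block up to the site level.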
7. Main Event Types and Key Attributes
Data Source
• News Media
• Social Media
Facebook, Twitter, etc.
Example
News field: oil disasters
1) Disaster (date, oil company, place)
2) Industry news (oil company, publication date)
3) Socio-environmental implication (publication date)
4) Socio-political (date, place)
5) Noise
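
The event types and their key attributes can be written down as a small schema, which makes it easy to check whether an extracted record is complete. This is a sketch only: the snake_case identifiers and the `missing_attributes` helper are illustrative names, not part of the presented system.

```python
# Key attributes per event type, taken from the list above.
# The snake_case identifiers are illustrative names.
KEY_ATTRIBUTES = {
    "disaster": {"date", "oil_company", "place"},
    "industry_news": {"oil_company", "publication_date"},
    "socio_environmental": {"publication_date"},
    "socio_political": {"date", "place"},
    "noise": set(),
}

def missing_attributes(event_type, record):
    """Return the key attributes of `event_type` that `record` lacks."""
    return KEY_ATTRIBUTES[event_type] - set(record)

# Hypothetical extracted record with no oil company identified.
record = {"date": "2017-03-02", "place": "coast"}
gaps = missing_attributes("disaster", record)
```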
8. Process Mining
• Definition
• Example
Search system → News base → Data preparation system → Event logs
(Event logs: tabular presentation of data about events)
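
Once events are in tabular form, a common first step in process discovery is to count which activity directly follows which within a case. The sketch below shows that directly-follows count. It is a generic technique, not necessarily the one used in the presented system, and the log rows are hypothetical.

```python
from collections import Counter

# Hypothetical event log in tabular form: (case_id, timestamp, activity).
event_log = [
    ("case1", 1, "Disaster"),
    ("case1", 2, "Industry news"),
    ("case1", 3, "Socio-political"),
    ("case2", 1, "Disaster"),
    ("case2", 2, "Socio-environmental implication"),
    ("case2", 3, "Socio-political"),
    ("case3", 1, "Disaster"),
    ("case3", 2, "Industry news"),
]

def directly_follows(log):
    """Count how often activity a is immediately followed by b in a case."""
    by_case = {}
    # Sorting the rows orders them by case id, then by timestamp.
    for case_id, _, activity in sorted(log):
        by_case.setdefault(case_id, []).append(activity)
    counts = Counter()
    for activities in by_case.values():
        counts.update(zip(activities, activities[1:]))
    return counts

dfg = directly_follows(event_log)
```

The resulting counts form the edges of a directly-follows graph, which process mining tools typically use as a starting point for a process model.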
10. Process Mining (continued)
• Capabilities of XES (eXtensible Event Stream)
Concept Extension
Lifecycle Extension
Organizational Extension
Time Extension
Semantic Extension
Id Extension
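
To make the extensions concrete, the sketch below builds a one-event XES log with Python's standard `xml.etree.ElementTree`. The attribute keys (`concept:name`, `time:timestamp`, `lifecycle:transition`, `org:resource`) are the standard keys of the corresponding extensions. The event values and the `attr` helper are hypothetical, and the output is a simplified fragment, not a validated XES file.

```python
import xml.etree.ElementTree as ET

def attr(parent, kind, key, value):
    """Append a typed XES attribute such as <string key=... value=.../>."""
    return ET.SubElement(parent, kind, {"key": key, "value": value})

log = ET.Element("log", {"xes.version": "1.0"})

# Declare the extensions whose attribute keys are used below.
for name, prefix in [("Concept", "concept"), ("Time", "time"),
                     ("Lifecycle", "lifecycle"), ("Organizational", "org")]:
    ET.SubElement(log, "extension", {
        "name": name,
        "prefix": prefix,
        "uri": f"http://www.xes-standard.org/{prefix}.xesext",
    })

trace = ET.SubElement(log, "trace")
attr(trace, "string", "concept:name", "case1")  # hypothetical case name

event = ET.SubElement(trace, "event")
attr(event, "string", "concept:name", "Disaster")
attr(event, "date", "time:timestamp", "2017-03-02T00:00:00.000+00:00")
attr(event, "string", "lifecycle:transition", "complete")
attr(event, "string", "org:resource", "news-agency")  # hypothetical source

xes_text = ET.tostring(log, encoding="unicode")
```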
11. Interaction With System
Participants: User, Our system, Internet
1. Ontology setting
2. Search query
3. Information search and database provisioning
4. Primary model and possible attributes to group and refine models
5. User response
6. Resulting model
12. Experiment Result
The experiment revealed some shortcomings:
• Inability to identify cause-effect relationships.
• A large number of errors and losses at the event extraction stage.
• Inability to unify synonymous word forms.
• Inability to group and filter model attributes.
13. Conclusion
• Ontologies allow flexible tuning for various domains, for
example to investigate relations between events in fields
such as economics and politics.
14. Scope and Future Work
• Data Extraction Techniques
1) Wget command
2) Web Scraping Techniques
BeautifulSoup in Python
3) API
Facebook API, Twitter API (Tweepy)
• Classification Techniques
1) Naïve Bayes Classifier
2) SVM (Support Vector Machine)
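
As an illustration of the first classification technique, here is a minimal multinomial Naïve Bayes classifier with add-one smoothing, written in plain Python. The training texts and labels are toy data invented for this sketch, and the `NaiveBayes` class is a hypothetical helper. A practical system would more likely rely on an established machine learning library.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for illustration only.
texts = [
    "oil spill near coast",
    "tanker leak pollutes coast",
    "company reports quarterly profit",
    "oil company profit rises",
]
labels = ["disaster", "disaster", "industry", "industry"]
model = NaiveBayes().fit(texts, labels)
```

Calling `model.predict(...)` on a new headline returns the label with the highest log-probability under the word counts learned from the training texts.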
15. Scope and Future Work (continued)
• Scope
1) These techniques can be used to find controversial points
in the news.
2) A fake news detection system can also be implemented.
3) News items can be ranked by how controversial they are.
16. References
• Books
1) Hector Oscar Nigro and Sandra Elizabeth González Císaro, "Data Mining with Ontologies:
Implementations, Findings, and Frameworks".
2) Wil van der Aalst, "Process Mining: Discovery, Conformance and Enhancement of Business Processes".
• IEEE Papers
1) Mykhailo Granik and Volodymyr Mesyura, "Fake News Detection Using Naive Bayes Classifier".
2) Ismini Lourentzou, Graham Dyer, Abhishek Sharma and ChengXiang Zhai (Department of Computer
Science, University of Illinois at Urbana), "Hotspots of News Articles: Joint Mining of News Text &
Social Media to Discover Controversial Points in News".
3) D. Calvanese, M. Montali, A. Syamsiyah and W.M.P. van der Aalst, "Ontology-Driven Extraction of
Event Logs from Relational Databases". In: Business Process Management Workshops 2015.
• Website
1) https://www.tecmint.com/10-wget-command-examples-in-linux/
2) http://www.linuxjournal.com/content
3) https://www.youtube.com/watch?v=kIeLaNzw9hI
4) https://en.wikipedia.org/wiki/Naive_Bayes_classifier