This document presents a methodology for classifying stories as either love stories or horror stories using linguistic feature analysis and topic modeling. The proposed approach segments stories into paragraphs, analyzes linguistic features such as lexical sophistication and part-of-speech tags, models the topics using LDA, and classifies the story based on the sentiment and types of adjectives found. Classification accuracy improves for longer stories, which provide more linguistic context.
Story classification using linguistic feature and topic modeling algorithm
• MAHDI HASAN (152-15-5797)
• HASNAT SHOHEB (152-15-5813)
Supervised By
Ahmed Al Marouf
Lecturer, Department of CSE
Daffodil International University (DIU)
References
1. https://medium.com/swiftworld/topic-modeling-of-new-york-times-articles-11688837d32f
2. http://ieeexplore.ieee.org/document/6473685/
Abstract
This project examines the potential of computational
tools to classify stories. Each story was separated into several
paragraphs. The paragraphs were analyzed with a
computational tool for a variety of linguistic features,
including lexical sophistication, and then modeled using LDA
techniques. The model's accuracy increased for
longer stories, which provided more linguistic coverage. The
findings support the notion that paragraph types contain
specific linguistic features that allow them to be
distinguished from one another.
Introduction
We determine whether a story is a love story or a horror
story. Using linguistic features and a topic modeling
algorithm, we can determine the type of a given story.
Proposed methodology
The role of affect and emotion in human story
comprehension has long been noted. Stories are typically about
characters facing conflict. Sometimes the plot
complications (negative events or situations) have to be
overcome. In many such cases, one expects to encounter
sentiment expressions in the story. Not surprisingly, we
found that a large proportion of the stories, in both the
validation and test sets, contain sentiment-bearing words [2].
Our approach is to look at the polarity of sentiments and
at sentiment congruence in a story. Using a sentiment
dictionary that assigns sentiment values on a continuous
scale, and looking only at adjective types, we have
demonstrated that even a rather simple, lexically based
sentiment analysis can contribute considerably
to determining whether a story is a love story or a horror story.
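The adjective-based polarity check described above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the lexicon `ADJECTIVE_SENTIMENT`, its scores, the function names, and the zero decision threshold are all hypothetical assumptions chosen for the example.

```python
# Hypothetical continuous-scale sentiment lexicon for adjectives.
# The words and scores are illustrative assumptions, not project data.
ADJECTIVE_SENTIMENT = {
    "lovely": 0.9, "sweet": 0.8, "tender": 0.7, "warm": 0.6,
    "cold": -0.4, "dark": -0.5, "creepy": -0.8, "dreadful": -0.9,
}


def mean_adjective_polarity(adjectives, lexicon=ADJECTIVE_SENTIMENT):
    """Average the scores of the adjectives covered by the lexicon.

    Returns None when no adjective is found in the lexicon.
    """
    scores = [lexicon[a.lower()] for a in adjectives if a.lower() in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)


def classify_story(adjectives, lexicon=ADJECTIVE_SENTIMENT):
    """Label a story from its adjectives: positive mean -> love, else horror."""
    polarity = mean_adjective_polarity(adjectives, lexicon)
    if polarity is None:
        return "unknown"  # no lexicon coverage, so no decision is made
    return "love" if polarity > 0 else "horror"
```

For instance, `classify_story(["sweet", "warm", "dark"])` averages two positive scores and one negative score; with the assumed values the mean is positive, so the story is labeled as a love story.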
Discussion
Our results support the notion that sentiment
is an important aspect of narrative comprehension, and
that topic-modeling analysis can be a strong
contributing factor in the NLP analysis of stories. There are
a number of avenues for further exploration, such as
using machine learning methods to combine the different
types of information that go into making a story, using
vector spaces and automated reasoning, and extending the
feature extraction to capture other aspects of language
understanding.
Conclusions
The short story is a form of "literature". "Literature"
generally refers to any written or spoken material in any
language [1], but in this project the term refers to
works of the creative imagination such as the short story.
The story is the most effective mode of expression for
representing this world. It encompasses every sphere of
human life, such as culture, tradition, history, and psychology.
To depict human life in all its richness, it uses diction
that expresses various emotions and feelings. In this project,
we deal with linguistic feature analysis and topic
modeling of a story using segmentation (1:15), which
helps to find the adjective type and provides a classification of
that story.
Results
• SHAHABUDDDIN BHUYIA (152-15-483)
• AWOLAD HOSSAN (152-15-5536)
• ROKYBHUL RAYHAN (152-15-5597)
(DEPARTMENT OF CSE)
First, we convert the story to plain-text format (story.txt). We
then divide the story into 15 paragraphs. Each paragraph has a specific
topic name. Next, we perform topic modeling using LDA.
The following step is probability-based feature extraction, which
yields each topic word together with its probability. POS tagging
then identifies the adjectives among the topic words, and each
adjective is labeled as negative or positive. Finally, the adjective
type determines the story type. This methodology can be summarized
in a flow chart.
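The topic-modeling and probability-based feature-extraction steps could look roughly like the sketch below. It is a small, standard-library-only collapsed Gibbs sampler for LDA, written for illustration; the project does not state which LDA implementation it used, and the function names, hyperparameters (`alpha`, `beta`), and iteration count are assumptions.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    docs is a list of token lists. Returns (topic_word, doc_topic), where
    topic_word[k][w] counts word w in topic k and doc_topic[d][k] counts
    the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                # Resample the topic from its conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][w] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word, doc_topic


def top_words(topic_word, n=3):
    """Top-n words per topic with their unsmoothed within-topic proportion."""
    result = []
    for counts in topic_word:
        total = sum(counts.values())
        result.append(
            [(w, c / total) for w, c in counts.most_common(n) if c > 0]
        )
    return result
```

Under these assumptions, the 15 tokenized paragraphs of a story would be passed as `docs` with `num_topics=15`, and `top_words` would return, for each topic, the words paired with their probability, which is the input to the POS-tagging step.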
Flow chart of the methodology:
Preprocessing
• Story segmentation (1:15)
Feature extraction
• Finding 15 different topics
• POS tagging of topic-related words
• Adjective type finding (pos/neg)
Classification
• Story type (Horror / Love)
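The story segmentation (1:15) step might be sketched as below. The project does not say how paragraph boundaries are chosen, so this example assumes a simple rule: split the text into sentences and group them into 15 contiguous segments of nearly equal size. The function name `segment_story` and the sentence-splitting regular expression are hypothetical.

```python
import re


def segment_story(text, num_segments=15):
    """Split a story into up to num_segments contiguous groups of sentences.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    Stories with fewer sentences than num_segments yield one segment per
    sentence.
    """
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()
    ]
    num_segments = min(num_segments, len(sentences))
    if num_segments == 0:
        return []
    # Spread any remainder over the first segments so sizes differ by at most 1.
    base, extra = divmod(len(sentences), num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        size = base + (1 if i < extra else 0)
        segments.append(" ".join(sentences[start:start + size]))
        start += size
    return segments
```

Each returned segment can then be tokenized and treated as one document for topic modeling.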