This presentation introduces text summarization, approaches to summarizing a text, and how to improve summaries with a fuzzy classifier and a deep learning model (RBM), followed by some results.
Association of deep learning algorithm with fuzzy logic for multi-document text summarization
1. ASSOCIATION OF DEEP
LEARNING ALGORITHM WITH
FUZZY LOGIC FOR
MULTI-DOCUMENT TEXT
SUMMARIZATION
Abd Almughith Alzabibi
Ahmad Ataya
Baraa Salhany
Mohammad Salem Kabbani
2. INTRODUCTION
With the rapid growth in the quantity and
complexity of document sources on the
internet, it has become increasingly
important to provide improved mechanisms
that help users find the exact information
they need in the available documents.
3. AUTOMATIC TEXT
SUMMARIZATION DEFINITION
Automatic text summarization produces a
condensed version of the original text while
keeping its main content, helping the user
quickly understand large volumes of information.
4. TEXT SUMMARIZATION
CAN BE CLASSIFIED IN
TWO WAYS:
• abstractive summarization
• extractive summarization
5. MAIN OBJECTIVE OF
EXTRACTION APPROACH
The main objective of extraction-based text
summarization is to select the most
appropriate sentences according to the
user's requirements.
9. DEFINE SET OF FIVE
FEATURES FOR EACH
SENTENCE
• Title Similarity Feature:
the ratio of the number of words in the
sentence that occur in the title to the total
number of words in the title.
• Positional Feature
• Term Weight Feature
• Concept Feature
• POS Tagger Feature
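The Title Similarity feature above can be sketched in a few lines. This is an illustrative interpretation only: the tokenization (lowercased whitespace split) and the use of unique shared words are assumptions, not details from the original work.

```python
# Illustrative sketch of the Title Similarity feature; tokenization
# (lowercased whitespace split) is an assumption.
def title_similarity(sentence: str, title: str) -> float:
    """Words shared between sentence and title, divided by title length."""
    title_words = title.lower().split()
    if not title_words:
        return 0.0
    shared = set(sentence.lower().split()) & set(title_words)
    return len(shared) / len(title_words)

print(title_similarity(
    "Deep learning improves text summarization quality",
    "Deep learning for text summarization"))  # 4 of 5 title words -> 0.8
```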
17. FUZZY LOGIC SYSTEM
A set of rules is constructed by comparing the
sentences from the set of documents with the
sentences from the text summary.
18. FUZZY LOGIC SYSTEM
The defuzzifier finally modifies the feature
matrix based on the feature values assigned
to a particular rule and derives the fuzzy
score by evaluating the feature values.
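The defuzzification idea can be sketched with a toy example. This is not the rule base used in the work itself: the triangular membership functions, the LOW/MEDIUM/HIGH sets, and the weighted-average defuzzifier are all illustrative assumptions.

```python
# Toy sketch only: triangular memberships plus weighted-average
# defuzzification; the actual rule base is more elaborate.
def tri(x, a, b, c):
    """Triangular membership function rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_score(feature_value):
    # Degree of membership in LOW / MEDIUM / HIGH fuzzy sets
    low = tri(feature_value, -0.5, 0.0, 0.5)
    med = tri(feature_value, 0.0, 0.5, 1.0)
    high = tri(feature_value, 0.5, 1.0, 1.5)
    total = low + med + high
    # Weighted average of the rule outputs (0.0 for LOW, 0.5, 1.0)
    return (med * 0.5 + high * 1.0) / total if total else 0.0
```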
20. RESTRICTED
BOLTZMANN MACHINE
• RBM is a stochastic neural
network (a network of neurons
where each neuron has some
random behavior when activated)
• Consists of one layer of visible
units (neurons) and one layer of
hidden units
• Units in each layer have no
connections between them and
are connected to all units in the
other layer
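The bipartite structure can be sketched as follows. The layer sizes, random initialization, and single Gibbs step are assumptions for illustration, not the configuration used in the work; the key point is that weights exist only between the two layers, never within a layer.

```python
import numpy as np

# Minimal RBM sketch (dimensions and initialization are assumptions).
rng = np.random.default_rng(0)
n_visible, n_hidden = 5, 3   # e.g. five sentence features as visible units

W = rng.normal(0.0, 0.1, (n_visible, n_hidden))  # links only BETWEEN layers
b_v = np.zeros(n_visible)                        # visible-layer biases
b_h = np.zeros(n_hidden)                         # hidden-layer biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_hidden(v):
    """Stochastic units: each hidden neuron fires with sigmoid probability."""
    p = sigmoid(v @ W + b_h)
    return (rng.random(p.shape) < p).astype(float)

def sample_visible(h):
    p = sigmoid(h @ W.T + b_v)
    return (rng.random(p.shape) < p).astype(float)

v0 = rng.random(n_visible)   # one sentence's feature vector
h0 = sample_hidden(v0)       # no hidden-hidden connections are involved
v1 = sample_visible(h0)      # one Gibbs step back to the visible layer
```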
22. OPTIMAL FEATURE
MATRIX
After obtaining the refined sentence matrix from the
RBM, it is further tested against a threshold
value calculated for each feature.
Ex: If for any sentence
f4 < th4,
then that sentence is filtered out.
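The thresholding step above can be sketched as follows; the function name, matrix layout, and threshold values are illustrative assumptions. A sentence survives only if every feature meets its per-feature threshold.

```python
# Hedged sketch of the thresholding step: drop a sentence if any
# feature f_i falls below its threshold th_i (names are illustrative).
def filter_sentences(feature_matrix, thresholds):
    kept = []
    for idx, features in enumerate(feature_matrix):
        if all(f >= th for f, th in zip(features, thresholds)):
            kept.append(idx)
    return kept

matrix = [[0.6, 0.4, 0.7],   # passes all three thresholds
          [0.2, 0.9, 0.8]]   # fails the first threshold, so it is filtered
print(filter_sentences(matrix, [0.3, 0.3, 0.5]))  # -> [0]
```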
23. OPTIMAL FEATURE
MATRIX
To fine-tune the feature vector set optimally, we
use the back-propagation algorithm.
The deep learning algorithm in this phase uses
cross-entropy error to fine-tune the obtained
feature vector set. The cross-entropy error for
adjustment is calculated for every feature of the
sentence.
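The per-feature cross-entropy error can be sketched as below. The formula shown is the standard binary cross-entropy; treating the original feature value as the target and the RBM-refined value as the output is an assumption for illustration, and the feature values are placeholders.

```python
import math

# Illustrative sketch: per-feature cross-entropy error between a sentence's
# original feature value and its RBM-refined value (notation assumed).
def cross_entropy(target, output, eps=1e-12):
    output = min(max(output, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(output)
             + (1.0 - target) * math.log(1.0 - output))

original = [0.8, 0.3, 0.6]    # placeholder feature values
refined = [0.7, 0.35, 0.55]   # placeholder RBM outputs
errors = [cross_entropy(t, o) for t, o in zip(original, refined)]
```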
Natural Language Processing (NLP) techniques are used for parsing, word reduction, and generating the text summary in abstractive summarization.
Extractive summarization is flexible and consumes less time compared to abstractive summarization.
Stop words are removed mainly to eliminate insignificant and noisy words.
The weight of a sentence can be calculated by adding the weights of all the terms in the sentence and dividing the sum by the total number of terms in that sentence.
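The sentence-weight calculation just described can be sketched directly; the term weights below are placeholder values (e.g. TF-based scores), not computed from a real corpus.

```python
# Sketch of the sentence-weight calculation: sum of term weights
# divided by the number of terms (term weights are placeholders).
def sentence_weight(term_weights):
    if not term_weights:
        return 0.0
    return sum(term_weights) / len(term_weights)

print(sentence_weight([0.5, 0.25, 0.75]))  # (0.5 + 0.25 + 0.75) / 3 -> 0.5
```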
In addition to the five features, an additional attribute is associated with the feature matrix: the class label for each sentence.
The fuzzy classifier assigns the class labels to the sentences by processing them according to the fuzzy rules.