International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 03 | Mar 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1146
Review of Topic Modeling and Summarization
Chinmay Patil[1], Parag Wayangankar[2], Pranay Yadav[3], Shweta Sharma[4]
[1], [2], [3] Student, [4] Professor, Department of Computer Engineering, Atharva College of Engineering, Mumbai
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Topic modeling is an unsupervised machine learning
technique used to discover the topics that occur in a
collection of documents. Latent Dirichlet Allocation (LDA) is
one of the most widely used algorithms for topic modeling. It
treats each document as a mixture of topics and each topic as
a mixture of tokens or words. When LDA is applied to many
documents, the extracted topics describe the collection as a
whole. When it is applied to a single text document, the
extracted topics can be read as keywords for that document,
since they summarize its main ideas in a concise form. This
makes them useful for summarization. Text summarization is
the shortening of a text document so that the result
highlights all of its important points. In this paper, we
present an LDA model that identifies the dominant topics in a
text document, then identifies the sentences that reflect
these dominant topics and stitches them together to form a
human-readable summary.
Key Words: Natural Language Processing, Text
Summarization, Latent Dirichlet Allocation, Topic Modeling
1. INTRODUCTION
As the amount of available data keeps increasing, it has
become important to extract only the meaningful information
from it, since not every bit of data is useful. This is where
topic modeling and summarization can be of use. Because the
algorithm we use is unsupervised, the model does not need
structured data in order to work. The motivation for
developing it is to reduce the time required to read or
analyze a text document. Text documents come in a variety of
forms, including news reports, research papers, legal
documents and many more. Reading them can become tedious, and
important information may be missed if the task is not done
carefully. An advantage of having a model do the task is that
one can decide the number of topics or points to discover in
the text. The extraction is then done automatically, which
reduces the time required compared with doing the same task
manually. Text summarization has two approaches, namely
abstractive and extractive. We have chosen the extractive
summarization approach.
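
The pipeline described above (fit LDA, find the dominant topics, pick the sentences that reflect them) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each sentence of the document is treated as one LDA "document", that LDA is fitted with a small collapsed Gibbs sampler, and that one sentence is picked per topic. The function names, the stopword list and the hyperparameters alpha and beta are all hypothetical choices made for illustration.

```python
import random
import re

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "for", "on", "with", "by", "it", "that", "as", "was"}


def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic mixtures."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]                # counts n[d][k]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts n[k][w]
    topic_total = [0] * num_topics                              # counts n[k]
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    # Smoothed topic proportions per document; each row sums to 1.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


def summarize(text, num_topics=2):
    """Pick, for each topic, the sentence that reflects it most strongly."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    docs = [tokenize(s) for s in sentences]
    theta = lda_gibbs(docs, num_topics)
    # Topic mass: expected number of tokens each topic accounts for.
    mass = [sum(theta[d][t] * len(docs[d]) for d in range(len(docs)))
            for t in range(num_topics)]
    dominant = sorted(range(num_topics), key=lambda t: mass[t], reverse=True)
    chosen = []
    for t in dominant:
        best = max(range(len(sentences)), key=lambda d: theta[d][t])
        if best not in chosen:
            chosen.append(best)
    # Stitch the chosen sentences together in their original order.
    return " ".join(sentences[d] for d in sorted(chosen))
```

Calling summarize(text, num_topics=3) on a longer document returns at most three sentences in their original order. The number of topics plays the role described above: it lets the reader decide how many points to extract.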
2. LITERATURE SURVEY
Barde et al. [1] discuss various methods and tools used for
topic modeling, along with their features and limitations.
The methods discussed include the Vector Space Model (VSM),
Latent Semantic Indexing (LSI), Probabilistic Latent Semantic
Analysis (PLSA) and Latent Dirichlet Allocation (LDA). The
tools discussed include Gensim, the Stanford Topic Modeling
Toolbox, MALLET and BigARTM.
Surabhi Adhikari et al. [2] discuss different methods that
have been used for text summarization. The paper mainly
covers two methods, abstractive (ABS) and extractive (EXT)
summarization, and also discusses query-based summarization.
It focuses mostly on structure-based and semantic-based
approaches to summarizing text documents. Various datasets
were used to test the summaries produced by these models,
such as the CNN corpus, DUC2000, and single and multiple text
documents.
Kenli Li et al. [3] use the Latent Dirichlet Allocation (LDA)
algorithm to automatically generate the topics of a text
corpus, and apply it to multi-document summarization
algorithms based on sentence extraction. The approach
combines the traditional summary generation algorithm with an
abstract generation algorithm based on deep learning.
David Alfred Ostrowski [4] uses the Latent Dirichlet
Allocation algorithm, a generative probabilistic model for
collections of discrete data. The technique is evaluated from
the perspective of classification as well as identification
of noteworthy topics, as applied to a filtered collection of
Twitter messages. Experiments show that these methods are
effective for identifying sub-topics and for supporting
classification within large-scale corpora.
Jinqiang Bian et al. [5] propose a new sentence-ranking
method based on the LDA model. The method combines the topic
distribution of each sentence with the topic importance of
the corpus to calculate the posterior probability of the
sentence, and then selects sentences to form a summary based
on that posterior probability. The topic distribution of a
sentence represents the likelihood of the sentence belonging
to each topic, and topic importance represents the degree to
which a topic covers a significant portion of the corpus. The
method highlights the latent topics and optimizes the
summarization. Experimental results on the DUC2006 dataset
show the advantage of the multi-document summarization
algorithm proposed in the paper.
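
To make the idea of combining the two quantities concrete, the sketch below scores each sentence as the sum over topics of (probability that the sentence belongs to the topic) times (importance of the topic). This simple weighted sum is an assumption made for illustration. The exact posterior formula used in [5] is not given in this review, and the function name and the numbers in the example are hypothetical.

```python
def rank_sentences(sentence_topics, topic_importance):
    """Rank sentences by sum_k P(topic k | sentence) * importance(topic k).

    sentence_topics: one topic distribution (list of floats) per sentence.
    topic_importance: one weight per topic, e.g. its share of the corpus.
    Returns (sentence indices ordered from best to worst, raw scores).
    """
    scores = [
        sum(p * w for p, w in zip(dist, topic_importance))
        for dist in sentence_topics
    ]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical numbers: three sentences, two topics.
order, scores = rank_sentences(
    [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    [0.7, 0.3],
)
```

With these inputs the first sentence ranks highest, because it belongs almost entirely to the more important topic. A sentence tied to a minor topic scores low even when its topic assignment is very confident.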
J. N. Madhuri et al. [6] propose a system for summarizing
documents using sentence ranking algorithms. Sentences are
given weights and then ranked based on these weights. The
sentences with the highest rank are selected for the summary.
The sentences are ranked on the basis of the preprocessed
text, and the weights are given by the frequency of terms
divided by the total number of terms in the document.
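
A minimal sketch of this weighting scheme follows. It assumes that preprocessing is limited to lowercasing and keeping alphabetic tokens, and that a sentence's weight is the sum of the weights of its terms. Neither detail is specified in this review, and the function name is hypothetical.

```python
import re
from collections import Counter


def rank_by_term_frequency(text, top_n=2):
    """Select the top_n sentences by summed relative term frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    # Term weight = occurrences of the term / total number of terms.
    weight = {w: c / total for w, c in Counter(words).items()}
    scored = []
    for idx, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        scored.append((sum(weight[t] for t in tokens), idx))
    top = sorted(scored, reverse=True)[:top_n]
    # Return the selected sentences in document order.
    return [sentences[idx] for _, idx in sorted(top, key=lambda pair: pair[1])]


summary = rank_by_term_frequency(
    "Cats chase mice. Dogs bark. Cats and dogs chase cats.", top_n=2
)
# summary == ["Cats chase mice.", "Cats and dogs chase cats."]
```

Summing term weights favours long sentences. Dividing each score by the sentence length is a common variation when that bias is unwanted.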
Shohreh Rad Rahimi et al. [7] explore many methods of text
mining and text summarization. Text summarization can be
classified according to various criteria. The criteria
discussed are output summary, details, contents, limitation,
number of input texts and language acceptance. The paper also
discusses various similarity measures that are used in text
summarization.
Hirohata et al. [8] present automatic speech summarization
techniques and their evaluation metrics. The paper mainly
focuses on sentence-extraction-based summarization methods
for making abstracts from spontaneous presentations. The
metrics discussed include summarization accuracy, sentence
F-measure and ROUGE-N, among others.
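
As an illustration of one of these metrics, the sketch below computes ROUGE-N as clipped n-gram overlap between a candidate summary and a single reference summary. ROUGE-N is usually reported as recall, and precision and F1 are returned here for convenience. Whitespace tokenization and the use of a single reference are simplifying assumptions, and the function names are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) based on clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return recall, precision, f1


r1 = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
r2 = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
# r1[0] is 5/6 (five of six reference unigrams matched);
# r2[0] is 3/5 (three of five reference bigrams matched).
```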
Aditya Jain et al. [9] propose a neural-network-based
approach to text summarization. The approach extracts a good
set of features and then applies a neural network for
supervised extractive summarization. It assigns a predictive
score to each sentence, and the sentences with the highest
predictive scores are added to the summary.
Liu Na et al. [10] present a system that uses the Latent
Dirichlet Allocation topic model for multi-document
summarization. It extracts the title and content of each
document provided and creates a topic model for title and
content. Finally, it calculates sentence weights according to
the topic model and forms a summary based on these sentence
weights.
Mahsa Afsharizadeh et al. [11] propose a query-oriented
summarization technique. The most important sentences are
extracted from the document through a feature extraction
process that uses features such as sentence length,
normalized sentence length, sentence position in the document
and topic frequency. In total, 11 unique features are
extracted. Every sentence is scored based on these 11
features, and the top-ranked sentences are selected to create
the summary.
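
The general shape of feature-based sentence scoring can be sketched as follows. The sketch uses only three features (normalized length, position, and overlap with the query terms) and combines them with equal weights. The 11 features of [11] and their weighting are not listed in this review, so the choice of features, the weights and the function names are all assumptions made for illustration.

```python
import re


def words_of(text):
    """Lowercased alphabetic tokens of a string."""
    return re.findall(r"[a-z]+", text.lower())


def sentence_features(sentence, position, num_sentences, max_len, query_terms):
    """Compute three illustrative features, each scaled to [0, 1]."""
    tokens = words_of(sentence)
    return {
        # Length relative to the longest sentence in the document.
        "normalized_length": len(tokens) / max_len if max_len else 0.0,
        # Earlier sentences score higher.
        "position": 1.0 - position / num_sentences,
        # Fraction of the sentence's tokens that appear in the query.
        "query_overlap": (sum(1 for t in tokens if t in query_terms) / len(tokens))
        if tokens else 0.0,
    }


def rank_for_query(sentences, query, top_n=1):
    """Return the top_n sentences by equal-weight feature sum, in document order."""
    query_terms = set(words_of(query))
    max_len = max((len(words_of(s)) for s in sentences), default=0)
    scored = []
    for i, sentence in enumerate(sentences):
        feats = sentence_features(sentence, i, len(sentences), max_len, query_terms)
        scored.append((sum(feats.values()), i))
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_n]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


doc = [
    "The committee met on Monday.",
    "LDA assigns topics to words in each document.",
    "Lunch was served.",
]
best = rank_for_query(doc, "LDA topics", top_n=1)
# best == ["LDA assigns topics to words in each document."]
```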
Shweta Ganiger and K. M. M. Rajashekharaiah [12] discuss the
implementation of several keyword extraction algorithms and
evaluate how effective they are at extracting important
keywords from a document. The three algorithms discussed are
TF-IDF (Term Frequency - Inverse Document Frequency),
TextRank and RAKE (Rapid Automatic Keyword Extraction).
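
Of the three algorithms, TF-IDF is the simplest to illustrate. The sketch below scores each term of one document by its relative frequency in that document multiplied by the logarithm of (number of documents / number of documents containing the term). This unsmoothed variant is one of several common TF-IDF definitions and is chosen here as an assumption. The function name and example corpus are hypothetical.

```python
import math
import re
from collections import Counter


def tfidf_keywords(documents, doc_index, top_k=3):
    """Return the top_k terms of documents[doc_index] ranked by TF-IDF."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    tf = Counter(tokenized[doc_index])
    total = len(tokenized[doc_index])
    scores = {
        w: (count / total) * math.log(n_docs / df[w])
        for w, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


corpus = [
    "topic models find topics",
    "summaries shorten text",
    "topic models need text",
]
keywords = tfidf_keywords(corpus, 0, top_k=2)
# keywords == ["find", "topics"]
```

Terms that also occur in other documents ("topic", "models") receive a lower score than terms unique to the first document, which is the behaviour that makes TF-IDF useful for keyword extraction.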
3. CONCLUSIONS
Topic modeling and summarization are two important tasks in
natural language processing. Using the LDA algorithm to
extract keywords removes the need for structured data, which
helps reduce the time required to create a summary. The
extracted keywords or dominant topics can also be used for
categorization, which widens the scope of the project:
suggestions could be made based on the similarity of
different topics to the given document.
4. ACKNOWLEDGEMENT
We owe sincere thanks to our college, Atharva College of
Engineering, and especially to our Head of Department, Dr.
Suvarna Pansambal, and our guide, Prof. Shweta Sharma, for
their kind cooperation and guidance, which helped us in the
completion of this project. It would have seemed difficult
without their motivation, constant support and valuable
suggestions. Moreover, the completion of this research would
have been impossible without the cooperation, suggestions and
help of our family and friends.
5. REFERENCES
[1] Bhagyashree Vyankatrao Barde and Anant Madhavrao
Bainwad, “An Overview of Topic Modeling Methods and
Tools”, International Conference on Intelligent
Computing and Control Systems (ICICCS 2017).
[2] Rahul, Surabhi Adhikari, Monika, “NLP based Machine
Learning Approaches for Text Summarization”,
Proceedings of the Fourth International Conference on
Computing Methodologies and Communication (ICCMC 2020).
[3] Ying Zhong, Zhuo Tang, Xiaofei Ding, Li Zhu, Yuquan Le,
Kenli Li, Keqin Li, “An Improved LDA Multi-Document
Summarization Model Based on TensorFlow”, 2017
International Conference on Tools with Artificial
Intelligence.
[4] David Alfred Ostrowski, “Using Latent Dirichlet
Allocation for Topic Modelling in Twitter”, Proceedings
of the 2015 IEEE 9th International Conference on
Semantic Computing (IEEE ICSC 2015).
[5] Jinqiang Bian, Zengru Jiang, Qian Chen, “Research On
Multi-document Summarization Based On LDA Topic Model”,
2014 Sixth International Conference on Intelligent
Human Machine Systems and Cybernetics.
[6] J. N. Madhuri and R. Ganesh Kumar, “Extractive Text
Summarization Using Sentence Ranking”, 2019 Int. Conf.
Data Sci. Commun.
[7] Shohreh Rad Rahimi, Ali Toofanzadeh Mozhdehi, Mohamad
Abdolahi, “An Overview on Extractive Text
Summarization”, 2017 IEEE 4th International Conference
on Knowledge-Based Engineering and Innovation (KBEI).
[8] Hirohata, M., Shinnaka, Y., Iwano, K., & Furui, S.
(n.d.), “Sentence extraction-based presentation
summarization techniques and evaluation metrics”,
Proceedings. (ICASSP ’05). IEEE International
Conference on Acoustics, Speech, and Signal Processing,
2005.
[9] Jain, A., Bhatia, D., & Thakur, M. K. (2017),
“Extractive Text Summarization Using Word Vector
Embedding”, 2017 International Conference on Machine
Learning and Data Science (MLDS).
[10] Na, L., Ming-xia, L., Ying, L., Xiao-jun, T., Haiwen,
W., & Peng, X. (2014), “Mixture of topic model for
multi-document summarization”, The 26th Chinese Control
and Decision Conference (2014 CCDC).
[11] Ebrahimpour-Komleh, H., Afsharizadeh, M., & Bagheri, A.
(2018), “Query-oriented text summarization using
sentence extraction technique”, 2018 4th International
Conference on Web Research (ICWR).
[12] Ganiger, S., and Rajashekharaiah, K. M. M. (2018),
“Comparative Study on Keyword Extraction Algorithms for
Single Extractive Document”, 2018 Second International
Conference on Intelligent Computing and Control Systems
(ICICCS).
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Review of Topic Modeling and Summarization

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 03 | Mar 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1146

Review of Topic Modeling and Summarization

Chinmay Patil[1], Parag Wayangankar[2], Pranay Yadav[3], Shweta Sharma[4]
[1], [2], [3] Student, [4] Professor, Department of Computer Engineering, Atharva College of Engineering, Mumbai

Abstract - Topic modeling is an unsupervised machine learning technique used to discover the topics that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is one of the most widely used algorithms for topic modeling. It treats each document as a mixture of topics and each topic as a mixture of tokens or words. When many documents are considered, the topics extracted by LDA relate to all of the documents together. When only one text document is considered, however, the topics extracted from it by LDA can be regarded as the keywords of that document, since they summarize its main idea in a concise form. This can be useful for summarizing the document. Text summarization is the shortening of a text document such that the result highlights all of its important points. In this paper, we present an LDA model that identifies the dominant topics in a text document, then identifies the sentences that reflect these dominant topics and stitches them together to formulate a human-readable summary.
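The pipeline outlined in the abstract (find dominant topics, pick the sentences that reflect them, stitch those sentences together) can be sketched roughly as follows. This is only an illustrative sketch under several assumptions that do not come from the paper: each sentence is treated as its own small "document", LDA is fitted with a minimal pure-Python collapsed Gibbs sampler instead of a library such as Gensim, and the names `tokenize`, `lda_gibbs`, `summarize`, the tiny `STOPWORDS` set and all hyperparameter values are hypothetical.

```python
import random
import re

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in", "on",
             "for", "it", "that", "with", "as", "by"}


def tokenize(sentence):
    """Lowercase the sentence, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]      # tokens per (doc, topic)
    topic_word = [dict() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                    # tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] = topic_word[z].get(w, 0) + 1
            topic_total[z] += 1
        assign.append(zs)
    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] = topic_word[z].get(w, 0) + 1
                topic_total[z] += 1
    return doc_topic


def summarize(text, num_topics=2, num_sentences=2, seed=0):
    """Pick the sentences that best reflect the dominant topics."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    docs = [tokenize(s) for s in sentences]
    doc_topic = lda_gibbs(docs, num_topics, seed=seed)
    # A topic is "dominant" if many tokens in the text are assigned to it.
    importance = [sum(row[k] for row in doc_topic) for k in range(num_topics)]
    ranked_topics = sorted(range(num_topics), key=lambda k: -importance[k])
    candidates = [i for i, doc in enumerate(docs) if doc]
    chosen = []
    rank = 0
    # Cycle through topics from most to least dominant, taking for each the
    # unused sentence with the largest share of tokens in that topic.
    while candidates and len(chosen) < num_sentences:
        k = ranked_topics[rank % num_topics]
        best = max(candidates, key=lambda i: doc_topic[i][k] / len(docs[i]))
        chosen.append(best)
        candidates.remove(best)
        rank += 1
    # Stitch the selected sentences together in their original order.
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `summarize(text, num_topics=3, num_sentences=3)` on a single document returns up to three of its sentences in their original order. Because Gibbs sampling is stochastic, the selection depends on `seed`, and on very short texts the inferred topics may not be meaningful.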
Key Words: Natural Language Processing, Text Summarization, Latent Dirichlet Allocation, Topic Modeling

1. INTRODUCTION

As the amount of available data keeps increasing, it has become important to extract only the important or meaningful information from this data, since not every bit of data is useful. This is where topic modeling and summarization can be of use. Because the algorithm we used here is unsupervised, it eliminates the need for structured data to be provided to the model for it to work.

The motivation for developing this is to reduce the time required for reading or analyzing a text document. Text documents come in a variety of forms, including news reports, research papers, legal documents and many more. Reading them manually can become tedious, and some important information might be missed if it is not done carefully. The advantage of having a model do the task is that one can decide the number of topics or points one wants to discover in the text. Based on that, the extraction is done automatically, reducing the time required compared with doing the same task manually.

Text summarization has two approaches, namely abstractive and extractive. We have chosen the extractive summarization approach.

2. LITERATURE SURVEY

Barde et al. [1] discuss various methods and tools used for topic modeling, along with their features and limitations. The methods discussed include the Vector Space Model (VSM), Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA). The tools discussed include Gensim, the Stanford Topic Modeling Toolbox, MALLET and BigARTM.

Surabhi Adhikari et al. [2] discuss different methods that have been used for text summarization. The paper mainly covers two methods, abstractive (ABS) and extractive (EXT) summarization, and also discusses query-based summarization. It mostly deals with the structure-based and semantic-based approaches to summarizing text documents.
Various datasets were used to test the summaries produced by these models, such as the CNN corpus, DUC2000, and single and multiple text documents.

Kenli Li et al. [3] use the Latent Dirichlet Allocation (LDA) algorithm to automatically generate the topics of text corpora and apply it to multi-document summarization algorithms based on sentence extraction. The approach combines the traditional summary generation algorithm with an abstract generation algorithm based on deep learning.

David Alfred Ostrowski [4] uses the Latent Dirichlet Allocation algorithm, a generative probabilistic model for collections of discrete data. The technique is evaluated from the perspective of classification as well as identification of noteworthy topics, as applied to a filtered collection of Twitter messages. Experiments show that these methods are effective for identifying sub-topics and for supporting classification within large-scale corpora.

Jinqiang Bian et al. [5] propose a new sentence-ranking method based on the LDA model. The method combines the topic distribution of each sentence with the topic importance of the corpus to calculate the posterior probability of the sentence, and then selects sentences to form a summary based on that posterior probability. The topic distribution of a sentence represents the likelihood of the sentence belonging to each topic, and topic importance represents the degree to which the topics cover the significant portion of the corpus. The method highlights the latent topics and optimizes the summarization. Experimental results on the DUC2006 dataset show the advantage of the multi-document summarization algorithm proposed in that paper.

J. N. Madhuri et al. [6] propose a system for summarizing documents using sentence ranking algorithms. Sentences are given weights and then ranked based on these weights. The sentences with the highest rank are selected for the summary.
The sentences are ranked on the basis of the preprocessed text, and the weights are given by the frequency of terms divided by the total number of terms in the document.

Shohreh Rad Rahimi et al. [7] explore many methods of text mining and text summarization. Text summarization can be performed on the basis of various criteria. The criteria discussed are based on the output summary, details, contents, limitation, number of input texts and language acceptance. The paper also discusses various similarity measures that are used in text summarization.

Hirohata et al. [8] present automatic speech summarization techniques and their evaluation metrics. The paper mainly focuses on sentence extraction based summarization methods for making abstracts from spontaneous presentations. The metrics discussed include summarization accuracy, sentence F-measure and ROUGE-n, among others.

Aditya Jain et al. [9] propose a neural network based approach for text summarization. The approach extracts a good set of features and then applies a neural network for supervised extractive summarization. It assigns a predictive score to each sentence, and the sentences with the highest predictive scores are added to the summary.

Liu Na et al. [10] present a system that uses the Latent Dirichlet Allocation topic model for multi-document summarization. It extracts the title and content of each document provided and creates a topic model for the title and the content. In the end, it calculates sentence weights according to the topic model and forms a summary based on these sentence weights.

Mahsa Afsharizadeh et al. [11] propose a query-oriented summarization technique.
The most important sentences are extracted from the document through a feature extraction process that uses features such as sentence length, normalized sentence length, sentence position in the document and topic frequency. Eleven unique features are extracted. Based on these 11 features, every sentence is scored and the top-ranked sentences are selected for creating the summary.

Shweta Ganiger and K. M. M. Rajashekharaiah [12] discuss the implementation of some keyword extraction algorithms. These algorithms were used to find out how effective they are at extracting important keywords from a document. The three algorithms discussed are TF-IDF (Term Frequency - Inverse Document Frequency), TextRank and RAKE (Rapid Automatic Keyword Extraction).

3. CONCLUSIONS

Topic modeling and summarization are two important tasks in natural language processing. With the LDA algorithm used for extracting keywords, the need for structured data was eliminated, which helped reduce the time required for creating the summary. The extraction of keywords or dominant topics can also help with categorization, which can extend the scope of the project to making suggestions based on the similarity of different topics to the given document.

4. ACKNOWLEDGEMENT

We owe sincere thanks to our college, Atharva College of Engineering, and especially to our Head of Department, Dr. Suvarna Pansambal, and our guide, Prof. Shweta Sharma, for their kind cooperation and guidance, which helped us complete this project. It would have seemed difficult without their motivation, constant support and valuable suggestions. Moreover, the completion of this research would have been impossible without the cooperation, suggestions and help of our family and friends.

5. REFERENCES

[1] Bhagyashree Vyankatrao Barde and Anant Madhavrao Bainwad, “An Overview of Topic Modeling Methods and Tools”, International Conference on Intelligent Computing and Control Systems, ICICCS 2017.
[2] Rahul, Surabhi Adhikari, Monika, “NLP based Machine Learning Approaches for Text Summarization”, Proceedings of the Fourth International Conference on Computing Methodologies and Communication (ICCMC 2020).
[3] Ying Zhong, Zhuo Tang, Xiaofei Ding, Li Zhu, Yuquan Le, Kenli Li, Keqin Li, “An Improved LDA Multi-Document Summarization Model Based on TensorFlow”, 2017 International Conference on Tools with Artificial Intelligence.
[4] David Alfred Ostrowski, “Using Latent Dirichlet Allocation for Topic Modelling in Twitter”, Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015).
[5] Jinqiang Bian, Zengru Jiang, Qian Chen, “Research On Multi-document Summarization Based On LDA Topic Model”, 2014 Sixth International Conference on Intelligent Human Machine Systems and Cybernetics.
[6] J. N. Madhuri and R. Ganesh Kumar, “Extractive Text Summarization Using Sentence Ranking”, 2019 Int. Conf. Data Sci. Commun.
[7] Shohreh Rad Rahimi, Ali Toofanzadeh Mozhdehi, Mohamad Abdolahi, “An Overview on Extractive Text Summarization”, 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI).
[8] Hirohata, M., Shinnaka, Y., Iwano, K., & Furui, S. (n.d.), “Sentence extraction-based presentation summarization techniques and evaluation metrics”, Proceedings. (ICASSP ’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[9] Jain, A., Bhatia, D., & Thakur, M. K. (2017), “Extractive Text Summarization Using Word Vector Embedding”, 2017 International Conference on Machine Learning and Data Science (MLDS).
[10] Na, L., Ming-xia, L., Ying, L., Xiao-jun, T., Haiwen, W., Peng, X. (2014), “Mixture of topic model for multi-document summarization”, The 26th Chinese Control and Decision Conference (2014 CCDC).
[11] Ebrahimpour-Komleh, H., Afsharizadeh, M., Bagheri, A. (2018), “Query-oriented text summarization using sentence extraction technique”, 2018 4th International Conference on Web Research (ICWR).
[12] Ganiger, S., and Rajashekharaiah, K. M. M. (2018), “Comparative Study on Keyword Extraction Algorithms for Single Extractive Document”, 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS).