PROJECT PRESENTATION
ON
“AUTOMATIC KEYWORD EXTRACTION FOR
TEXT SUMMARIZATION”
Presented by
Biswarup Das
Roll-102217, No.-02220630
10th semester
Under the guidance of Dr. Rakesh Kumar, Assistant Professor.
Department of Computer Science,
Assam University, Silchar.
CONTENTS: -
 Introduction
 Objective
 Problem Statement
 Types of Summarization
 Literature Review
 Methodology and Implementation
 System Configuration
 Results
 Conclusion
 Future Work
 References
INTRODUCTION
 Summarization is a process where the most salient
features of a text are extracted and compiled into a
short abstract of the original document.
 In order to achieve this, we need to first mine the text
from the document.
TEXT MINING
 Text mining is a method of extracting information by
identifying patterns and keywords in unstructured
data.
 It includes text sorting, sentiment analysis, and
various other tasks.
NATURAL LANGUAGE PROCESSING (NLP)
 NLP deals with the interaction between computers and human
language.
 Data that comes from conversations and statements is
unstructured: it is disorganized and difficult to manage.
 For a computer to understand text, the text must first be translated
into a numerical form that the computer can process. We use word
embedding to achieve this.
WORD EMBEDDING
 A word embedding is a numerical representation of words. A common
representation is the one-hot vector [1].
 This method encodes each word with a unique vector.
All values in the vector are zero except for a single 1,
whose position identifies the word.
 Popular word embeddings include Word2Vec and
GloVe.
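The one-hot representation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the project; the sample words and the helper names build_vocabulary and one_hot are assumptions made for the example.

```python
# Minimal one-hot encoding sketch (sample words and helper names are illustrative).

def build_vocabulary(words):
    """Map each distinct word to a fixed index, in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a vector of zeros with a single 1 at the word's index."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


words = ["text", "summary", "keyword", "text"]
vocab = build_vocabulary(words)
print(one_hot("summary", vocab))  # [0, 1, 0]
```

Each vector is as long as the vocabulary, which is one reason dense embeddings such as Word2Vec and GloVe are preferred for large vocabularies.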
OBJECTIVE
 The main objective of automatic text summarization is
to present the source text in a shorter version that
preserves its meaning.
 The main advantage of a summary is that it reduces
reading time.
PROBLEM STATEMENT
 Generating a summary from a text document:
 This helps readers understand a large amount of text.
 As the internet becomes available everywhere, the
amount of information is growing rapidly.
 This makes it a challenge to summarize all types
of data.
TYPES OF SUMMARIZATION
 There are two main types of summarization:
1) Extractive text summarization: This method selects
information from the document exactly as it appears in the
source and uses it to form the summary.
2) Abstractive text summarization: In this method, the
machine must grasp the ideas of all the input documents
and then generate the summary from that understanding.
LITERATURE REVIEW
 García-Hernández and Ledeneva [2] proposed a method of term selection based on
TF-IDF, using an unsupervised approach to generate the
summary.
 Krishnaveni et al. (2017) [3] proposed text summarization on the basis of
headings, since the relevant context can be identified from the heading.
 Shirwandkar et al. (2018) [4] proposed a method that uses both a
Restricted Boltzmann Machine (RBM) and fuzzy logic to identify the key
sentences. The approach aims to produce short and concise summaries
of long text documents.
 Madhuri et al. (2019) [5] presented a technique for creating an extractive
summary by ranking sentences with the help of term frequency, after
removing stop words. It works for any type of text but cannot differentiate
sentences.
METHODOLOGY AND IMPLEMENTATION
 Architecture:-
Figure 1- The Architecture of extractive text summarizer
DETAILS OF ARCHITECTURE
 Source File:-
- To create the summary, the input document is taken first. The input
document must be in English.
- The file is uploaded with the following command:
Figure 2- Uploading the file
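The original upload command is only available as a screenshot. As a hedged sketch of this step, a plain-text document can be read into a string with the Python standard library; the function name load_document, the file name input.txt, and the sample sentence are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def load_document(path):
    """Read the whole input document as one UTF-8 string."""
    return Path(path).read_text(encoding="utf-8")


# A temporary file stands in for the uploaded document (illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "input.txt"
    sample.write_text(
        "Text mining extracts keywords. Summaries save time.",
        encoding="utf-8",
    )
    text = load_document(sample)

print(len(text))
```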
PRE-PROCESSING
- The input text is divided into sentences at each sentence terminator. These
sentences are preprocessed individually using the techniques below.
1) Lower Casing:-
- In this step, the entire input is converted to lowercase letters.
- This is done with the following command:
- Figure 3- Lowercasing the text
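Sentence splitting and lowercasing can be sketched as follows. This is a simplified stand-in for the screenshot, assuming that a sentence ends at '.', '!' or '?' followed by whitespace; the function names are illustrative, and a library tokenizer would handle abbreviations and similar cases better.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when the terminator is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def lower_case(sentences):
    """Convert every sentence to lowercase."""
    return [sentence.lower() for sentence in sentences]


text = "Text mining finds keywords. It helps summarization! Does it save time?"
sentences = lower_case(split_sentences(text))
print(sentences)
```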
2) Stop word Removal:-
- In this step, all stop words, which occur frequently but carry little
meaning, are removed from the input data.
- The stop words are removed with the following command:
- Figure 4- Removing stop words
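A minimal sketch of stop-word removal is given below. The short STOP_WORDS set is a hypothetical sample chosen for the example; a real run would normally take a fuller list from an NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "in", "from"}


def tokenize(sentence):
    """Return the lowercase word tokens of a sentence."""
    return re.findall(r"[a-z']+", sentence.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not stop words."""
    return [token for token in tokens if token not in stop_words]


tokens = tokenize("The summary is extracted from the text.")
print(remove_stop_words(tokens))  # ['summary', 'extracted', 'text']
```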
FEATURE EXTRACTION
1) Word Frequency:- All the words in the document are
counted to build a frequency list of the words.
- The word frequencies are determined with the following command:
- Figure 5- Word Frequency
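The frequency list can be sketched with collections.Counter. Dividing each count by the largest count is an assumption, suggested by the "Maximum Word Frequency" result figure; the token list and the function name are illustrative.

```python
from collections import Counter


def word_frequencies(tokens):
    """Count each token and scale the counts so the most frequent word is 1.0."""
    counts = Counter(tokens)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {word: count / max_count for word, count in counts.items()}


tokens = ["summary", "text", "summary", "keyword", "text", "summary"]
freqs = word_frequencies(tokens)
print(freqs["summary"], freqs["keyword"])
```

Scaling by the maximum keeps every frequency between 0 and 1, so long documents do not produce much larger sentence scores than short ones.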
2) Sentence Tokenization:-
- Sentence tokenization works on the lowercased words
from the earlier step. Each word is compared with the
sentences to determine a score for every sentence, and each
sentence is stored together with its score.
- This is done with the following command:
Figure 6- Text Tokenization
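One common way to score sentences, assumed here because the original command is not available, is to add up the frequencies of the words each sentence contains. The frequency values, sample sentences, and function name below are illustrative.

```python
import re


def score_sentences(sentences, freqs):
    """Score each sentence as the sum of the frequencies of its known words."""
    scores = {}
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        score = sum(freqs.get(word, 0.0) for word in words)
        if score > 0:
            # Sentences without any scored word are left out.
            scores[sentence] = score
    return scores


freqs = {"summary": 1.0, "text": 0.5, "keyword": 0.25}
sentences = ["The summary shortens the text.", "A keyword helps.", "Thank you."]
scores = score_sentences(sentences, freqs)
print(scores)
```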
 Extraction of high score sentences:-
- After determining the sentence scores, we arrange them in
descending order and store them in a list for summary generation.
- The command is given below:
- Figure 7- Sentence Scores
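Arranging the scored sentences in descending order can be sketched with the built-in sorted function. The scores dictionary and the function name rank_sentences are made up for the example.

```python
def rank_sentences(scores):
    """Return (sentence, score) pairs ordered from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


scores = {
    "first sentence.": 0.4,
    "second sentence.": 1.5,
    "third sentence.": 0.9,
}
ranked = rank_sentences(scores)
for sentence, score in ranked:
    print(score, sentence)
```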
 Summary Generation:-
- It depends on identifying the theme of the file, using popular
techniques such as term frequency and TF-IDF.
- The steps in the processing of this summarizer are as follows:
- 1) Conversion of the input text into an intermediate representation.
- 2) Assignment of a priority score to each sentence.
- The summary is generated with the following command:
- Figure 8- Summary Generation
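A possible sketch of summary generation is shown below: it keeps a fixed share of the highest-scoring sentences and returns them in their original order. The ratio parameter, the sample sentences and scores, and the function name generate_summary are assumptions for illustration, not the project's actual settings.

```python
import heapq
import math


def generate_summary(sentences, scores, ratio=0.3):
    """Join the top `ratio` share of sentences by score, kept in document order."""
    count = max(1, math.ceil(len(sentences) * ratio))
    best = set(
        heapq.nlargest(count, range(len(sentences)), key=lambda i: scores[i])
    )
    return " ".join(sentences[i] for i in range(len(sentences)) if i in best)


sentences = [
    "Text mining extracts information.",
    "Summaries present the key points.",
    "Thank you.",
    "Keywords guide the summary.",
]
scores = [0.5, 1.5, 0.2, 1.1]  # one score per sentence, same order
summary = generate_summary(sentences, scores, ratio=0.5)
print(summary)
print(len(summary), "characters out of", len(" ".join(sentences)))
```

Restoring document order is a design choice: the sentences are selected by score, but the summary reads more naturally when they appear in the order of the source text.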
POST-PROCESSING
 We convert the sentences from spaCy Span objects to strings so that the
sentences can be joined.
 After that, we apply a list comprehension to the sentences from the previous step.
- These steps use the following commands:
Figure 9- Conversion from spaCy Span to strings
Figure 10- List Comprehension
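The conversion and joining steps can be sketched as follows. spaCy Span objects expose their text through the .text attribute; so that the example runs without spaCy installed, a small hypothetical FakeSpan class stands in for a real Span here.

```python
from dataclasses import dataclass


@dataclass
class FakeSpan:
    """Hypothetical stand-in for a spaCy Span; only `.text` is modelled."""
    text: str


selected = [
    FakeSpan("Summaries present the key points."),
    FakeSpan("Keywords guide the summary."),
]

# List comprehension: turn each span into a plain string.
final_sentences = [span.text for span in selected]

# Join the strings into the final summary text.
summary = " ".join(final_sentences)
print(summary)
```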
SYSTEM CONFIGURATION
 Software Configuration:-
1) Python.
2) Natural Language Toolkit (NLTK).
3) Jupyter Notebook.
4) Various other packages.
 Hardware Configuration:-
1. Processors (min. Intel i3 processor).
2. RAM (min. 2GB).
3. Hard disk (512GB is enough)
4. Power supply (input of 100V-240V)
RESULTS
 Screenshots-
1) Figure 11- Original text
2) Figure 12- List of Stop Words
3) Figure 13- Word Frequencies
4) Figure 14- Maximum Word Frequency
5) Figure 15- Sentence Scores
6) Figure 16- Summary generation with length
7) Figure 17- Length Comparisons
8) Figure 18- Keywords with their corresponding scores
CONCLUSION AND FUTURE WORK
 Conclusion-
- The whole project uses an extractive text summarization technique.
The summarization method should produce a useful summary in a short time,
with minimal redundancy and grammatically correct sentences.
- Other summarization techniques, such as the abstractive method, can
generate more relevant and precise summaries, but the main catch is that they
require more complicated heuristic algorithms.
- The summarization method still needs to produce more accurate summaries in less
time, with as little redundancy as possible.
 Future Scope-
- Several problems remain to be solved, such as the parsing accuracy,
which reduced the sequential coherence of the summary and has to be improved.
- The main focus is to improve the parsing accuracy and to
minimize redundancy.
- This work could also be done in the deep learning domain, using a
layered structure; after training on datasets, it may produce
more accurate summaries.
REFERENCES
1. J. Valverde Tohalino and D. Amancio, "Extractive Multi-document
Summarization Using Multilayer Networks," Physica A: Statistical Mechanics and its
Applications, vol. 503, 2017. doi:10.1016/j.physa.2018.03.013.
2. R. A. García-Hernández and Y. Ledeneva, "Word sequence models for single text
summarization," in Proceedings of the 2nd International Conference on Advances in
Computer-Human Interactions, ACHI 2009, pp. 44–48, IEEE, 2009.
3. P. Krishnaveni and S. R. Balasundaram, "Automatic text summarization by local scoring
and ranking for improving coherence," 2017 International Conference on
Computing Methodologies and Communication, July 2017. doi:10.1109/ICCMC.2017.8282539.
4. N. S. Shirwandkar and S. Kulkarni, "Extractive Text Summarization Using Deep
Learning," 2018 Fourth International Conference on Computing Communication
Control and Automation (ICCUBEA), Pune, India, 2018, pp. 1-5.
doi:10.1109/ICCUBEA.2018.8697465.
5. J. N. Madhuri and R. Ganesh Kumar, "Extractive Text Summarization Using Sentence
Ranking," 2019 International Conference on Data Science and Communication (Icon
DSC), Bangalore, India, 2019, pp. 1-3. doi:10.1109/IconDSC.2019.8817040.
THANK YOU

More Related Content

What's hot

Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
Huawei Technologies
 
Function Oriented Design
Function Oriented DesignFunction Oriented Design
Function Oriented Design
Sharath g
 
Issues in the design of Code Generator
Issues in the design of Code GeneratorIssues in the design of Code Generator
Issues in the design of Code Generator
Darshan sai Reddy
 
Text summarization
Text summarizationText summarization
Text summarization
kareemhashem
 
Sdlc
SdlcSdlc
Project report
Project reportProject report
Project report
Utkarsh Soni
 
LSTM Based Sentiment Analysis
LSTM Based Sentiment AnalysisLSTM Based Sentiment Analysis
LSTM Based Sentiment Analysis
ijtsrd
 
Practical sentiment analysis
Practical sentiment analysisPractical sentiment analysis
Practical sentiment analysis
Diana Maynard
 
System Analyst
System Analyst System Analyst
System Analyst
Mohammed Ali
 
Staffing level estimation
Staffing level estimation Staffing level estimation
Staffing level estimation
kavitha muneeshwaran
 
Oose lab notes
Oose lab notesOose lab notes
Oose lab notes
Aravindharamanan S
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
theyaseen51
 
Android Toast.pdf
Android Toast.pdfAndroid Toast.pdf
Android Toast.pdf
John Benetic
 
Reengineering including reverse & forward Engineering
Reengineering including reverse & forward EngineeringReengineering including reverse & forward Engineering
Reengineering including reverse & forward Engineering
Muhammad Chaudhry
 
Chapter 5 Syntax Directed Translation
Chapter 5   Syntax Directed TranslationChapter 5   Syntax Directed Translation
Chapter 5 Syntax Directed Translation
Radhakrishnan Chinnusamy
 
A PRESENTATION ON STRUTS & HIBERNATE
A PRESENTATION ON STRUTS & HIBERNATEA PRESENTATION ON STRUTS & HIBERNATE
A PRESENTATION ON STRUTS & HIBERNATE
Tushar Choudhary
 
USER INTERFACE DESIGN PPT
USER INTERFACE DESIGN PPTUSER INTERFACE DESIGN PPT
USER INTERFACE DESIGN PPT
vicci4041
 
software metrics(process,project,product)
software metrics(process,project,product)software metrics(process,project,product)
software metrics(process,project,product)
Amisha Narsingani
 
Ml ppt
Ml pptMl ppt
Ml ppt
Alpna Patel
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment Analysis
Rebecca Williams
 

What's hot (20)

Compiler Chapter 1
Compiler Chapter 1Compiler Chapter 1
Compiler Chapter 1
 
Function Oriented Design
Function Oriented DesignFunction Oriented Design
Function Oriented Design
 
Issues in the design of Code Generator
Issues in the design of Code GeneratorIssues in the design of Code Generator
Issues in the design of Code Generator
 
Text summarization
Text summarizationText summarization
Text summarization
 
Sdlc
SdlcSdlc
Sdlc
 
Project report
Project reportProject report
Project report
 
LSTM Based Sentiment Analysis
LSTM Based Sentiment AnalysisLSTM Based Sentiment Analysis
LSTM Based Sentiment Analysis
 
Practical sentiment analysis
Practical sentiment analysisPractical sentiment analysis
Practical sentiment analysis
 
System Analyst
System Analyst System Analyst
System Analyst
 
Staffing level estimation
Staffing level estimation Staffing level estimation
Staffing level estimation
 
Oose lab notes
Oose lab notesOose lab notes
Oose lab notes
 
Parts of Speect Tagging
Parts of Speect TaggingParts of Speect Tagging
Parts of Speect Tagging
 
Android Toast.pdf
Android Toast.pdfAndroid Toast.pdf
Android Toast.pdf
 
Reengineering including reverse & forward Engineering
Reengineering including reverse & forward EngineeringReengineering including reverse & forward Engineering
Reengineering including reverse & forward Engineering
 
Chapter 5 Syntax Directed Translation
Chapter 5   Syntax Directed TranslationChapter 5   Syntax Directed Translation
Chapter 5 Syntax Directed Translation
 
A PRESENTATION ON STRUTS & HIBERNATE
A PRESENTATION ON STRUTS & HIBERNATEA PRESENTATION ON STRUTS & HIBERNATE
A PRESENTATION ON STRUTS & HIBERNATE
 
USER INTERFACE DESIGN PPT
USER INTERFACE DESIGN PPTUSER INTERFACE DESIGN PPT
USER INTERFACE DESIGN PPT
 
software metrics(process,project,product)
software metrics(process,project,product)software metrics(process,project,product)
software metrics(process,project,product)
 
Ml ppt
Ml pptMl ppt
Ml ppt
 
Presentation on Sentiment Analysis
Presentation on Sentiment AnalysisPresentation on Sentiment Analysis
Presentation on Sentiment Analysis
 

Similar to Automatic keyword extraction.pptx

Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
IRJET Journal
 
Article Summarizer
Article SummarizerArticle Summarizer
Article Summarizer
Jose Katab
 
Summarization of Software Artifacts : A Review
Summarization of Software Artifacts : A ReviewSummarization of Software Artifacts : A Review
Summarization of Software Artifacts : A Review
AIRCC Publishing Corporation
 
Summarization of Software Artifacts : A Review
Summarization of Software Artifacts : A ReviewSummarization of Software Artifacts : A Review
Summarization of Software Artifacts : A Review
AIRCC Publishing Corporation
 
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
IRJET Journal
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ijnlc
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
kevig
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
kevig
 
Comparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization TechniquesComparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization Techniques
IRJET Journal
 
IRJET - Voice based Natural Language Query Processing
IRJET -  	  Voice based Natural Language Query ProcessingIRJET -  	  Voice based Natural Language Query Processing
IRJET - Voice based Natural Language Query Processing
IRJET Journal
 
Conceptual framework for abstractive text summarization
Conceptual framework for abstractive text summarizationConceptual framework for abstractive text summarization
Conceptual framework for abstractive text summarization
ijnlc
 
Automatic document clustering
Automatic document clusteringAutomatic document clustering
Automatic document clustering
IAEME Publication
 
Automation tool for evaluation of the quality of nlp based
Automation tool for evaluation of the quality of nlp basedAutomation tool for evaluation of the quality of nlp based
Automation tool for evaluation of the quality of nlp based
IAEME Publication
 
A template based algorithm for automatic summarization and dialogue managemen...
A template based algorithm for automatic summarization and dialogue managemen...A template based algorithm for automatic summarization and dialogue managemen...
A template based algorithm for automatic summarization and dialogue managemen...
eSAT Journals
 
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
mlaij
 
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISHA NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
IRJET Journal
 
team10.ppt.pptx
team10.ppt.pptxteam10.ppt.pptx
team10.ppt.pptx
REMEGIUSPRAVEENSAHAY
 
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
IJET - International Journal of Engineering and Techniques
 
Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)
Don Dooley
 
IRJET - Text Summarizer.
IRJET -  	  Text Summarizer.IRJET -  	  Text Summarizer.
IRJET - Text Summarizer.
IRJET Journal
 

Similar to Automatic keyword extraction.pptx (20)

Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
Text Summarization of Food Reviews using AbstractiveSummarization and Recurre...
 
Article Summarizer
Article SummarizerArticle Summarizer
Article Summarizer
 
Summarization of Software Artifacts : A Review
Summarization of Software Artifacts : A ReviewSummarization of Software Artifacts : A Review
Summarization of Software Artifacts : A Review
 
Summarization of Software Artifacts : A Review
Summarization of Software Artifacts : A ReviewSummarization of Software Artifacts : A Review
Summarization of Software Artifacts : A Review
 
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
Advancements in Hindi-English Neural Machine Translation: Leveraging LSTM wit...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...ANALYZING ARCHITECTURES FOR NEURAL  MACHINE TRANSLATION USING LOW  COMPUTATIO...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIO...
 
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
ANALYZING ARCHITECTURES FOR NEURAL MACHINE TRANSLATION USING LOW COMPUTATIONA...
 
Comparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization TechniquesComparative Study of Abstractive Text Summarization Techniques
Comparative Study of Abstractive Text Summarization Techniques
 
IRJET - Voice based Natural Language Query Processing
IRJET -  	  Voice based Natural Language Query ProcessingIRJET -  	  Voice based Natural Language Query Processing
IRJET - Voice based Natural Language Query Processing
 
Conceptual framework for abstractive text summarization
Conceptual framework for abstractive text summarizationConceptual framework for abstractive text summarization
Conceptual framework for abstractive text summarization
 
Automatic document clustering
Automatic document clusteringAutomatic document clustering
Automatic document clustering
 
Automation tool for evaluation of the quality of nlp based
Automation tool for evaluation of the quality of nlp basedAutomation tool for evaluation of the quality of nlp based
Automation tool for evaluation of the quality of nlp based
 
A template based algorithm for automatic summarization and dialogue managemen...
A template based algorithm for automatic summarization and dialogue managemen...A template based algorithm for automatic summarization and dialogue managemen...
A template based algorithm for automatic summarization and dialogue managemen...
 
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
A Newly Proposed Technique for Summarizing the Abstractive Newspapers’ Articl...
 
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISHA NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
A NEURAL MACHINE LANGUAGE TRANSLATION SYSTEM FROM GERMAN TO ENGLISH
 
team10.ppt.pptx
team10.ppt.pptxteam10.ppt.pptx
team10.ppt.pptx
 
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
[IJET V2I3P7] Authors: Muthe Sandhya, Shitole Sarika, Sinha Anukriti, Aghav S...
 
Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)Automatic Text Summarization Using Natural Language Processing (1)
Automatic Text Summarization Using Natural Language Processing (1)
 
IRJET - Text Summarizer.
IRJET -  	  Text Summarizer.IRJET -  	  Text Summarizer.
IRJET - Text Summarizer.
 

Recently uploaded

Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
saastr
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 

Recently uploaded (20)

Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 

Automatic keyword extraction.pptx

  • 1. PROJECT PRESENTATION ON “AUTOMATIC KEYWORD EXTRACTION FOR TEXT SUMMARIZATION”
      - Presented by Biswarup Das, Roll-102217, No.-02220630, 10th semester.
      - Under the guidance of Dr. Rakesh Kumar, Assistant Professor, Department of Computer Science, Assam University, Silchar.
  • 2. CONTENTS
      - Introduction
      - Objective
      - Problem Statement
      - Types of Summarization
      - Literature Review
      - Methodology and Implementation
      - System Configuration
      - Results
      - Conclusion
      - Future Work
      - References
  • 3. INTRODUCTION
      - Summarization is the process of extracting the most salient features of a text and compiling them into a short abstract of the original document.
      - To achieve this, we first need to mine the text from the document.
  • 4. TEXT MINING
      - Text mining is a method of extracting information by identifying patterns and keywords in unstructured data.
      - It includes text sorting, sentiment analysis, and various other tasks.
  • 5. NATURAL LANGUAGE PROCESSING (NLP)
      - NLP deals with the interaction between computers and human language.
      - Data that comes from conversations and statements is unstructured: it is disorganized and difficult to manage.
      - For a computer to understand text, the text has to be translated into a form the computer can process. To achieve this, we use word embedding.
  • 6. WORD EMBEDDING
      - A word embedding is a numerical representation of words. A common baseline representation is the one-hot vector [1].
      - This method encodes each word with a unique vector. All values in that vector are zero except for a single 1, whose position identifies the word.
      - The most popular word embeddings are Word2Vec and GloVe.
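The one-hot encoding described on this slide can be sketched in a few lines of plain Python. This is only an illustration, not code from the presentation: the function name `one_hot_vectors` and the sample words are assumptions, and the vocabulary is sorted alphabetically purely to make the output reproducible.

```python
def one_hot_vectors(words):
    """Map each distinct word to a one-hot vector (a list of 0s with a single 1)."""
    vocab = sorted(set(words))  # fixed ordering so each word gets a stable position
    index = {word: i for i, word in enumerate(vocab)}
    vectors = {}
    for word in vocab:
        vec = [0] * len(vocab)   # all zeros ...
        vec[index[word]] = 1     # ... except a 1 at the word's own position
        vectors[word] = vec
    return vectors


# Illustrative input; the repeated word "text" still gets only one vector.
vectors = one_hot_vectors(["text", "summary", "keyword", "text"])
print(vectors["summary"])
```

Each vector is as long as the vocabulary, which is one reason dense embeddings such as Word2Vec and GloVe are usually preferred for larger corpora.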
  • 7. OBJECTIVE
      - The main objective of automatic text summarization is to present the source text in a shorter version that preserves its meaning.
      - The main advantage of a summary is that it reduces reading time.
  • 8. PROBLEM STATEMENT
      - Generating a summary from a text document:
      - A summary helps the reader understand a large amount of text.
      - As the internet is now available everywhere, the volume of information keeps growing.
      - This ultimately makes it a challenge to summarize all types of data.
  • 9. TYPES OF SUMMARIZATION
      - There are two basic types of summarization:
      - 1) Extractive text summarization: this method selects information from the document exactly as it appears in the source and uses it to form the summary.
      - 2) Abstractive text summarization: in this approach, the machine must grasp the ideas of all the documents given as input and then generate the summary in new sentences.
  • 10. LITERATURE REVIEW
      - García-Hernández and Ledeneva [2] proposed a term-selection method based on TF-IDF, using an unsupervised approach to generate the summary.
      - Krishnaveni et al. (2017) [3] proposed heading-based text summarization, since the relevant context can be identified from the headings.
      - Shirwandkar et al. (2018) [4] proposed a method that uses both a Restricted Boltzmann Machine (RBM) and fuzzy logic to identify key sentences, producing short and concise summaries of long text documents.
      - Madhuri et al. (2019) [5] presented a technique that creates an extractive summary by ranking sentences using term frequency after removing stop words. It works for any type of text but cannot differentiate sentences.
  • 11. METHODOLOGY AND IMPLEMENTATION
      - Architecture:
      - Figure 1- The architecture of the extractive text summarizer
  • 12. DETAILS OF ARCHITECTURE
      - Source file: to create the summary, an input document must be provided. The input document should be in English only.
      - The file is uploaded with the command shown in Figure 2.
      - Figure 2- Uploading of file type
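The command in Figure 2 is not part of this transcript. As a stand-in, the sketch below reads a plain-text document with the standard library. The helper name `load_document`, the file name `sample.txt`, and the sample sentence are all assumptions made for illustration; the presentation may have used a notebook upload widget instead.

```python
import tempfile
from pathlib import Path


def load_document(path):
    """Read a plain-text (English) document and return its contents as one string."""
    return Path(path).read_text(encoding="utf-8")


# Demo: write a small sample file to a temporary directory, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"  # hypothetical input file
    sample.write_text(
        "Text mining extracts information. Summaries save time.",
        encoding="utf-8",
    )
    document = load_document(sample)

print(document)
```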
  • 13. PRE-PROCESSING
      - The input text is divided into sentences based on the sentence terminator. These sentences are then preprocessed individually using the techniques below.
      - 1) Lowercasing: the entire input is converted to lowercase letters.
      - The command for this step is shown in Figure 3.
      - Figure 3- Lowering of texts
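A minimal sketch of these two steps, assuming a simple regular-expression splitter rather than whatever tokenizer the presentation used (Figure 3 is not reproduced here). The function names and sample text are illustrative only.

```python
import re


def split_sentences(text):
    """Split text after a sentence terminator ('.', '!' or '?') followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def lowercase(sentences):
    """Convert every sentence to lowercase letters."""
    return [sentence.lower() for sentence in sentences]


text = "Text Mining extracts keywords. NLP helps computers read text! Is it useful?"
sentences = split_sentences(text)
lowered = lowercase(sentences)
print(lowered)
```

A regex splitter like this one mis-handles abbreviations such as "et al."; a trained tokenizer from NLTK or spaCy would be more robust on real documents.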
  • 14. 2) Stop word removal:
      - In this step, all stop words, which occur very frequently, are removed from the input data.
      - The stop words are removed with the command shown in Figure 4.
      - Figure 4- Removing of stop words
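Stop word removal can be sketched as a simple filter. The tiny `STOP_WORDS` set below is an assumption for illustration; the presentation shows its own list in Figure 12, and libraries such as NLTK and spaCy ship much larger ones.

```python
import string

# Small illustrative stop word set (an assumption, not the list used in the slides).
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "in", "to", "it"}


def remove_stop_words(sentence, stop_words=STOP_WORDS):
    """Lowercase a sentence, strip punctuation from each word, and drop stop words."""
    words = [word.strip(string.punctuation) for word in sentence.lower().split()]
    return [word for word in words if word and word not in stop_words]


tokens = remove_stop_words("The summary is a short version of the document.")
print(tokens)
```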
  • 15. FEATURE EXTRACTION
      - 1) Word frequency: every word in the document is counted to build a frequency list of the words.
      - The word frequencies are determined with the command shown in Figure 5.
      - Figure 5- Word Frequency
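One way to build such a frequency list is with `collections.Counter`. Dividing each count by the maximum count is an assumption here, suggested by the "Maximum Word Frequency" result in Figure 14 but not confirmed by the transcript; the token list is illustrative.

```python
from collections import Counter


def word_frequencies(tokens):
    """Count each token, then scale counts so the most frequent word scores 1.0."""
    counts = Counter(tokens)
    if not counts:
        return {}
    max_count = max(counts.values())  # assumed normalization by the maximum frequency
    return {word: count / max_count for word, count in counts.items()}


tokens = ["summary", "text", "summary", "keyword", "text", "summary"]
freqs = word_frequencies(tokens)
print(freqs)
```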
  • 16. 2) Sentence tokenization:
      - The words are first lowercased, as done earlier. Each word of a sentence is then compared against the word-frequency list to determine the sentence's score, and each sentence is stored together with its score.
      - This task is done with the command shown in Figure 6.
      - Figure 6- Text Tokenization
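The scoring step can be sketched as follows, assuming a sentence's score is the sum of the frequencies of the words it contains. The transcript does not spell out the exact formula, so this additive rule, the function name, and the sample values are assumptions.

```python
import string


def score_sentences(sentences, word_freqs):
    """Score each sentence as the sum of the frequencies of its (lowercased) words."""
    scores = {}
    for sentence in sentences:
        words = [word.strip(string.punctuation) for word in sentence.lower().split()]
        score = sum(word_freqs.get(word, 0.0) for word in words)
        if score > 0:  # sentences with no scored words are left out
            scores[sentence] = score
    return scores


word_freqs = {"summary": 1.0, "text": 0.5, "keyword": 0.25}  # illustrative values
sentences = [
    "The summary shortens the text.",
    "A keyword helps.",
    "Nothing relevant here.",
]
scores = score_sentences(sentences, word_freqs)
print(scores)
```

Summing raw frequencies favours long sentences; dividing by sentence length is a common variation if that bias matters.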
  • 17. Extraction of high-score sentences:
      - After determining the sentence scores, we arrange them in descending order and store them in a list for summary generation.
      - The command is shown in Figure 7.
      - Figure 7- Sentence Scores
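Arranging the scores in descending order is a one-line sort in Python. The sketch below assumes the scores are held in a dictionary keyed by sentence; the sample data is illustrative.

```python
def rank_sentences(scores):
    """Return (sentence, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


scores = {
    "A keyword helps.": 0.25,
    "The summary shortens the text.": 1.5,
    "Text mining finds patterns.": 0.75,
}
ranked = rank_sentences(scores)
for sentence, score in ranked:
    print(f"{score:.2f}  {sentence}")
```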
  • 18. Summary generation:
      - Summary generation depends on identifying the document's theme, for which popular techniques include term frequency and TF-IDF.
      - The summarizer processes the text in the following steps:
      - 1) Conversion of the input text into an intermediate representation.
      - 2) Assignment of a priority score to each sentence.
      - The summary is generated with the command shown in Figure 8.
      - Figure 8- Summary Generation
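A possible sketch of the final selection step: keep the highest-scoring sentences and emit them in their original order. The `ratio` parameter (the fraction of sentences to keep) and the sample scores are assumptions; the transcript only indicates that the summary length is controlled in some way (Figure 16).

```python
from heapq import nlargest


def generate_summary(sentences, scores, ratio=0.3):
    """Keep the top-scoring fraction of sentences, preserving document order."""
    n = max(1, int(len(sentences) * ratio))         # always keep at least one sentence
    top = set(nlargest(n, scores, key=scores.get))  # the n highest-scoring sentences
    return [sentence for sentence in sentences if sentence in top]


sentences = [
    "The summary shortens the text.",
    "A keyword helps.",
    "Text mining finds patterns.",
    "Scores rank every sentence.",
]
scores = {
    "The summary shortens the text.": 1.5,
    "A keyword helps.": 0.25,
    "Text mining finds patterns.": 0.75,
    "Scores rank every sentence.": 1.0,
}
summary_sentences = generate_summary(sentences, scores, ratio=0.5)
print(summary_sentences)
```

Restoring document order after selection keeps the summary readable, since the top sentences are otherwise returned in score order.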
  • 19. POST-PROCESSING
      - We convert the sentences from spaCy spans to strings so that they can be joined.
      - We then apply a list comprehension to the sentences from the previous step.
      - These steps are done with the commands shown in Figures 9 and 10.
      - Figure 9- Conversion from spaCy span to strings
      - Figure 10- List Comprehension
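The conversion can be sketched without installing spaCy by using a small stand-in class. `SpanStandIn` is hypothetical and models only the `.text` attribute, which a real spaCy `Span` also exposes; the function name and sample sentences are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class SpanStandIn:
    """Hypothetical stand-in for a spaCy Span; only the .text attribute is modeled."""
    text: str


def spans_to_summary(spans):
    """Convert span-like objects to strings with a list comprehension, then join them."""
    sentences = [span.text.strip() for span in spans]  # span -> plain string
    return " ".join(sentences)


selected = [
    SpanStandIn("The summary shortens the text. "),
    SpanStandIn("Keywords guide the ranking."),
]
summary = spans_to_summary(selected)
print(summary)
```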
  • 20. SYSTEM CONFIGURATION
      - Software configuration:
      - 1) Python.
      - 2) Natural Language Toolkit (NLTK).
      - 3) Jupyter Notebook.
      - 4) Various other packages.
      - Hardware configuration:
      - 1) Processor (min. Intel i3).
      - 2) RAM (min. 2 GB).
      - 3) Hard disk (512 GB is enough).
      - 4) Power supply (input of 100 V-240 V).
  • 22. 2) Figure 12- List of Stop Words
  • 23. 3) Figure 13- Word Frequencies
  • 24. 3) Figure 14- Maximum Word Frequency
  • 25. 4) Figure 15- Sentence Scores 5) Figure 16- Summary generation with length
  • 26. 6) Figure 17- Length Comparisons
  • 27. 7) Figure 18- Keywords with their equivalent scores
  • 28. CONCLUSION AND FUTURE WORK
      - Conclusion:
      - The entire project uses an extractive text summarization technique. A summarization method should create a useful summary in a short time, with minimal redundancy and grammatically correct sentences.
      - Other techniques, such as the abstractive method, can generate more relevant and precise summaries, but the catch is that they require more complicated heuristic algorithms.
      - The summarization method still needs to produce more accurate summaries in less time with the least redundancy.
  • 29. Future scope:
      - Several problems remain to be solved, such as the parsing accuracy, which reduces the sequential coherence of the summary and has to be improved.
      - The main focus is to improve the parsing accuracy and to minimize redundancy.
      - This work can also be done in the deep learning domain, where layered architectures, once trained on suitable datasets, may produce more accurate summaries.
  • 30. REFERENCES
      - 1. J. Valverde Tohalino and D. Amancio, "Extractive multi-document summarization using multilayer networks," Physica A: Statistical Mechanics and its Applications, vol. 503, 2017. doi:10.1016/j.physa.2018.03.013.
      - 2. R. A. García-Hernández and Y. Ledeneva, "Word sequence models for single text summarization," in Proceedings of the 2nd International Conferences on Advances in Computer-Human Interactions, ACHI 2009, pp. 44–48, IEEE, 2009.
      - 3. P. Krishnaveni and S. R. Balasundaram, "Automatic text summarization by local scoring and ranking for improving coherence," 2017 International Conference on Computing Methodologies and Communication (ICCMC), July 2017. doi:10.1109/ICCMC.2017.8282539.
      - 4. N. S. Shirwandkar and S. Kulkarni, "Extractive text summarization using deep learning," 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 2018, pp. 1-5. doi:10.1109/ICCUBEA.2018.8697465.
      - 5. J. N. Madhuri and R. Ganesh Kumar, "Extractive text summarization using sentence ranking," 2019 International Conference on Data Science and Communication (IconDSC), Bangalore, India, 2019, pp. 1-3. doi:10.1109/IconDSC.2019.8817040.