Department of Artificial Intelligence
Project Phase-I Title Finalization Seminar
Winter 2022 (Session: 2022-2023)
G H Raisoni College of Engineering, Nagpur
Presented By:
1. Prajwal Kolhe - 45(A)
2. Faizan Khan - 68(A)
3. Apoorva Dhimole - 65(A)
4. Lakshya Chauraghade - 39(A)
Guide:
Pranali Dhawas
Assistant Professor
GHRCE, Nagpur
Date: 5th Aug 2022
Title of the Project:
Document Analyzer using Deep Learning
Introduction
• Every organization, such as colleges, schools, and companies,
holds data in many different forms.
• In this research, our objective is to
build a prediction model for analyzing
and classifying these documents.
• Sorting documents by hand is tedious work that often falls to
low-wage workers. It is time-consuming
but necessary.
• Similar to other methods of analysis
in qualitative research, document
analysis requires repeated review,
examination, and interpretation of
the data in order to gain meaning and
empirical knowledge of the construct
being studied.
Abstract
• Many companies and large organizations hold
documents in bulk and are required to
keep them in separate clusters.
• In recent years, this job has become
time-consuming as the number of documents and articles has
increased.
• The objective of this study is to identify
documents and classify them accordingly.
• Documentary analysis is a type of qualitative
research in which documents are reviewed by
the analyst to assess an appraisal theme.
Dissecting documents involves coding content
into subjects, much as focus group or interview
transcripts are investigated. A rubric can
likewise be used to review or score a
document.
Objectives
• To analyze and classify documents using a CNN.
• To extract features from the documents using feature extraction algorithms.
• To create a working model that classifies documents on the basis of
the extracted features.
• The model will use image segmentation and a CNN to identify the
articles.
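The objective of classifying documents with a CNN can be illustrated with a minimal forward pass: convolution, ReLU, max pooling, and a softmax layer over document classes. This is only a sketch written in plain NumPy, not the project's Keras model. The filters and weights are random and untrained, the 16x16 array stands in for a scanned page, and the class names are hypothetical.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(image, kernels, weights, bias):
    """Conv -> ReLU -> max pool per filter, then one dense softmax layer."""
    maps = [max_pool(np.maximum(conv2d(image, k), 0.0)).ravel() for k in kernels]
    features = np.concatenate(maps)
    return softmax(weights @ features + bias)

# Hypothetical setup: random page, random (untrained) parameters.
rng = np.random.default_rng(0)
class_names = ["certificate", "invoice", "letter"]  # assumed labels
page = rng.random((16, 16))                 # stand-in for a document image
kernels = rng.normal(size=(4, 3, 3))        # four 3x3 filters
n_features = 4 * 7 * 7                      # 16-3+1 = 14, pooled to 7x7, 4 maps
weights = rng.normal(size=(len(class_names), n_features)) * 0.01
bias = np.zeros(len(class_names))

probs = classify(page, kernels, weights, bias)
print(class_names[int(np.argmax(probs))], probs)
```

In a real pipeline the filters and dense weights would be learned from labelled document images, for example with Keras, rather than drawn at random.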
Literature Survey (Survey of existing products)
• N. Chen and D. Blostein. A survey of document image classification: Problem
statement, classifier architecture and performance evaluation. IJDAR, 10(1):1–16, 2007.
• K. Collins-Thompson and R. Nickolov. A clustering-based algorithm for
automatic document separation. In SIGIR, pages 1–8, 2002.
• CNNs are trained to perform a classification task, but a CNN trained on
classification can also be exploited for retrieval by using its internal
activations as feature vectors. These feature vectors
are high-dimensional, but their dimensionality can be reduced significantly via
principal component analysis without significantly affecting their discriminative
power. Ranking the training images by how close their feature vectors are to a
query's returns a sorted list of documents.
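The retrieval idea from the literature survey (reduce CNN feature vectors with principal component analysis, then rank documents against a query) can be sketched as follows. This is an illustration under assumptions: the 256-dimensional random vectors stand in for real CNN features, PCA is computed through an SVD, and cosine similarity is chosen here as the ranking measure because the slides do not name one.

```python
import numpy as np

def pca_fit(features, n_components):
    """Return the feature mean and the top principal directions (as rows)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(features, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (features - mean) @ components.T

def rank_by_similarity(query, database):
    """Indices of database rows, most similar to the query first (cosine)."""
    q = query / (np.linalg.norm(query) + 1e-12)
    d = database / (np.linalg.norm(database, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(d @ q))

# Hypothetical data: 50 training documents with 256-dimensional features.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(50, 256))

mean, components = pca_fit(train_features, n_components=16)
train_reduced = pca_transform(train_features, mean, components)

# Use document 7 as the query; it should rank itself first.
query_reduced = pca_transform(train_features[7:8], mean, components)[0]
ranking = rank_by_similarity(query_reduced, train_reduced)
print(ranking[:5])
```

The number of retained components (16 here) is an arbitrary choice for the sketch; in practice it would be tuned against retrieval quality.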
Proposed Methodology/System Architecture
• The type of a document is determined by many characteristics, such as
the design of the document, its header and footer, its body,
and how the text is formatted. All of these factors help
identify the type of document.
• Some types of documents also share common features. For example,
government certificates carry a government seal and/or logo, which can help
classify the documents.
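One simple way to turn the layout cues named in the methodology (header, body, footer) into numbers is to measure how much ink each region of the page contains. The sketch below is a hand-crafted illustration, not the proposed CNN. The region fractions, the ink threshold, and the synthetic page are all assumptions.

```python
import numpy as np

def layout_features(page, header_frac=0.15, footer_frac=0.15, threshold=0.5):
    """Ink density per region of a grayscale page.

    page: 2-D array with values in [0, 1], where 0 is black ink and 1 is
    white paper. Returns the fraction of dark pixels in each region.
    """
    height = page.shape[0]
    header_end = round(height * header_frac)
    footer_start = height - round(height * footer_frac)
    ink = page < threshold
    regions = {
        "header": ink[:header_end],
        "body": ink[header_end:footer_start],
        "footer": ink[footer_start:],
    }
    return {name: float(region.mean()) for name, region in regions.items()}

# Hypothetical page: a dark letterhead band on top and a small mark
# (for example a seal) in the bottom-left corner.
page = np.ones((100, 80))
page[:10, :] = 0.0      # letterhead band
page[95:, :40] = 0.0    # seal-like mark in the footer

features = layout_features(page)
print(features)
```

A CNN learns cues like these from data instead of relying on fixed region boundaries, which is why the project proposes it over hand-crafted features.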
Hardware / Software Specification
• Category:
Machine Learning, Deep Learning
• Programming Language:
Python
• Tools & Libraries:
Plotly Dash, CNN, ImageNet, Keras
• IDE:
Jupyter
• Prerequisites:
Python, Machine Learning, Deep Learning, Neural Networks
• Dataset:
Kaggle link:
https://www.kaggle.com/code/kaledhoshme/documents-classification-using-cnn/data
Conclusion
Our proposed solution is a model intended to classify documents and articles accurately.
The model is built using a CNN and image feature extraction; fine-tuning
the extracted features on document images pushed results even
higher.
The CNN approach to document image representation exceeds the power of
hand-crafted alternatives.
Thank You
