POWERED BY : GROUP 1 
NIA/DESI/POPON/JIMMY/RASYID
-Information Extraction (IE) is important for archiving data from printed text documents so that the data can be easily maintained and reprocessed
-Gathering the information manually consumes too much time and energy
-To create a system that can gather information and provide it to users in a simple way
-To classify documents in a database
-To provide a good algorithm for information extraction
-To provide an application that helps the Public Opinion Sub-Directorate of BPS archive data from newspapers
-Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured documents
-OCR is a system used to identify printed text on paper, whether typed or produced by a printer; the text is then processed by a particular algorithm into characters that can be recognized and turned into information
-Document classification is the last step
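The definition of information extraction above can be illustrated with a minimal sketch that turns one unstructured sentence into a structured record. The slides do not say which fields or methods the group used, so the `ExtractedRecord` type, the three regular expressions, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ExtractedRecord:
    """Structured fields pulled out of one unstructured article (hypothetical schema)."""
    dates: list = field(default_factory=list)
    percentages: list = field(default_factory=list)
    organizations: list = field(default_factory=list)


# Hypothetical patterns: a real system would need far broader coverage.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DATE_RE = re.compile(rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b")
PERCENT_RE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent)")
ORG_RE = re.compile(r"\b(?:BPS|Bank Indonesia)\b")


def extract(text: str) -> ExtractedRecord:
    """Scan plain text (e.g. OCR output) and collect the matching fields."""
    return ExtractedRecord(
        dates=DATE_RE.findall(text),
        percentages=PERCENT_RE.findall(text),
        organizations=sorted(set(ORG_RE.findall(text))),
    )


if __name__ == "__main__":
    sample = "BPS reported on 5 May 2014 that inflation reached 7.3 percent, up from 6.9%."
    print(extract(sample))
```

Pattern matching like this only works for fields with a predictable surface form; names, topics, and events usually need a trained model instead.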
1. Input(Newspaper/epaper) 
2. Cropping + Image Processing
3. OCR 
4. Summarizing 
5. Classification 
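The five steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the group's implementation: the OCR engine is passed in as a plain callable because the slides do not name one, `crop_and_clean` is a placeholder for real image processing, the summarizer is a simple word-frequency heuristic, and the classifier is a keyword-overlap stand-in with made-up categories.

```python
import re
from collections import Counter
from typing import Callable

# Hypothetical stopword list and categories, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "on", "for", "by"}
CATEGORY_KEYWORDS = {
    "economy": {"inflation", "prices", "market", "export"},
    "sports": {"match", "goal", "team", "league"},
}


def tokenize(text: str) -> list:
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def crop_and_clean(page: str) -> str:
    """Step 2 placeholder: real code would crop and clean a scanned image."""
    return page.strip()


def summarize(text: str, max_sentences: int = 1) -> str:
    """Step 4: keep the sentences whose words are most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for w in tokenize(text) if w not in STOPWORDS)

    def score(sentence: str) -> int:
        # Stopwords are absent from freq, so they contribute zero.
        return sum(freq[w] for w in tokenize(sentence))

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)


def classify(text: str) -> str:
    """Step 5: pick the category sharing the most keywords with the text."""
    words = set(tokenize(text))
    scores = {label: len(words & kws) for label, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def run_pipeline(page: str, ocr: Callable[[str], str]) -> dict:
    """Steps 1 to 5: input, cleaning, OCR, summarizing, classification."""
    text = ocr(crop_and_clean(page))
    return {"text": text, "summary": summarize(text), "category": classify(text)}


if __name__ == "__main__":
    page = ("Inflation rose again. Food prices pushed inflation higher "
            "in the market. Officials commented.")
    # Identity function stands in for a real OCR engine.
    print(run_pipeline(page, ocr=lambda image: image))
```

Keeping the OCR step as a parameter means the later stages can be tested on plain text before any scanning or image handling exists.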
[Diagram: Input (newspaper, e-paper, etc.) → OCR → plain text → Process (get information) → Output]
THANK YOU 
FOR YOUR ATTENTION

Automated data extraction from newspaper documents