The New Library of Alexandria 
Overview 
Bibliotheca Alexandrina (BA)
Ø Center of excellence in the production 
and dissemination of knowledge 
Ø Place of dialogue, learning and 
understanding between cultures and 
peoples
Ø The World’s Window on Egypt 
Ø Egypt’s Window on the World 
Ø Instrument for Rising to the Challenges of 
the Digital Age 
Ø Center for Dialogue Between Peoples and 
Civilizations
Not just a Library of Books but rather a vast cultural and 
scientific complex
A library that can accommodate millions of books
http://archive.bibalex.org
http://descegy.bibalex.org
http://lartarab.bibalex.org
More than 230,000 Arabic books are freely available online for Arabic readers worldwide
http://suezcanal.bibalex.org
http://naguib.bibalex.org/
http://nasser.bibalex.org
http://sadat.bibalex.org
Ø Project Overview 
Ø Collection Overview 
Ø Data Representation 
Ø System Workflow 
— DAF (Digital Assets Factory) 
— Cataloguing 
— Website 
§ Solr search Engine 
§ Article Viewer 
24
25
Ø Centre for Economic, Judicial, and Social 
Study and Documentation (CEDEJ) 
collaborated with Bibliotheca Alexandrina 
(BA) for the digitization of its archive of 
massive press articles collection 
Ø The project consists of multiple modules to: 
— Index the Press Archive Collection 
— Control data entry workflow 
— Digitize and process data 
— Catalogue and review Articles 
— Archive Web Publishing 
26
27
Ø Package of press archive 
— 800,000+ press clips varying between 
§ Press 
§ Reports 
— 500+ publishers 
— 60,000+ writers and reporters 
— 200 Different subjects 
§ Economic, politics, social life, etc… 
— Archive Languages: 
§ Arabic, English and French 
— Date range from 1966 to 2009 
Ø Finished so far 
— 115,000 press clips varying between 
§ Press 
§ Reports 
— 200 publishers 
— 14,000 writers and reporters 
— 100 Different subjects 
§ Economic, politics, social life, etc… 
— Archive Languages: 
§ Arabic, English and French 
— Date range from 1966 to 2009 
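Setting the finished figures against the full package gives a rough sense of progress. The sketch below only does that arithmetic. Because the package totals are stated as lower bounds (800,000+, 500+, 60,000+), the resulting shares should be read as upper estimates, and the dictionary keys are illustrative names, not identifiers from the project.

```python
# Totals for the full package and for the part finished so far, as listed above.
# Totals marked "+" in the source are lower bounds, so the shares are upper estimates.
PACKAGE = {"press_clips": 800_000, "publishers": 500, "writers": 60_000, "subjects": 200}
FINISHED = {"press_clips": 115_000, "publishers": 200, "writers": 14_000, "subjects": 100}


def completion_shares(finished, package):
    """Return the finished share of each category as a percentage."""
    return {key: 100.0 * finished[key] / package[key] for key in package}


shares = completion_shares(FINISHED, PACKAGE)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

On these numbers, roughly one press clip in seven had been processed at the time of the talk.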
Ø A list of packaged press archive is submitted to 
Bibliotheca Alexandrina to be scanned and 
catalogued 
Ø Source of data is a collection of boxes 
Ø The box is organized on the following 
hierarchy 
— Folder 
— File 
— Sub-File 
— Document 
Ø Document represents a single page of press 
Article Creation
Article Metadata
Lookups Management
Reports
Ø Based on Apache Lucene project v4.1 
Ø SolrNet API is used to connect to Solr 
server 
Ø Features 
— Simple/Advanced search 
— Results Highlighting 
— Fields AutoComplete 
— Text search (Article Viewer) 
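To make the search and highlighting features more concrete, here is a hedged sketch of a Solr select request with highlighting switched on. The project itself connects through SolrNet (.NET); this example instead builds the equivalent HTTP query string in Python. The parameters `q`, `hl`, `hl.fl`, `rows` and `wt` are standard Solr request parameters, whereas the host, the core name `press_archive` and the field name `content` are placeholders assumed for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the host and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/press_archive/select"


def build_search_url(text, field="content", rows=10):
    """Build a Solr select URL that searches one field and asks for highlighting."""
    # Escape backslashes and double quotes so the text stays inside the phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "q": f'{field}:"{escaped}"',  # phrase query on a single field
        "hl": "true",                 # enable results highlighting
        "hl.fl": field,               # field to take highlighted snippets from
        "rows": rows,                 # number of results to return
        "wt": "json",                 # response format
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_search_url("Suez Canal")
print(url)
```

An advanced search would typically combine several such field clauses in `q`; the same highlighting parameters apply.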
Ø Article viewer is used for previewing articles 
— It is one of multiple viewers developed at BA 
Ø Architecture 
— Server Side: RESTful services 
— Client Side: JavaScript using JSONP 
Ø Features 
— Image preview 
— Metadata preview 
— Text selection 
— Searching/highlighting 
— Zooming options: fit width/height 
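With JSONP, the server wraps its JSON payload in a call to a function named by the client, so the browser can load it through a script tag. The sketch below illustrates that wrapping step only. The function name `jsonp_response`, the callback name `onMetadata` and the payload keys are assumptions for illustration, not the viewer's actual interface.

```python
import json
import re

# Callback names are restricted to simple identifiers so that a request cannot
# inject arbitrary script into the response.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_response(callback, payload):
    """Wrap a JSON-serializable payload in a JSONP callback invocation."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError(f"invalid callback name: {callback!r}")
    return f"{callback}({json.dumps(payload)});"


# Hypothetical payload with the kind of technical information a viewer needs.
body = jsonp_response("onMetadata", {"width": 2480, "height": 3508, "pageCount": 3})
print(body)
```

Validating the callback name is a common precaution with JSONP, since the value comes straight from the request.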
Ø Viewer Web Services 
— Metadata Web Service: 
§ Retrieve article catalogue metadata 
§ Return technical information (width, height, page 
count..) 
— Content Web Service: 
§ Retrieve the image of each single page in the article 
applying scaling to custom width and height 
responsively 
§ Return the selected text based on the user highlighted 
area 
— Search Web Service: 
§ Perform the search using Solr engine APIs in the 
content of the articles 
§ Highlight the matching phrases in the article image 
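Two of the Content Web Service tasks lend themselves to a short sketch: scaling a page to a requested size and returning the text under a highlighted area. The code below is one plausible way to do both, under stated assumptions: aspect-ratio-preserving scaling is assumed rather than confirmed by the slides, and the `WordBox` records stand in for whatever word-position data the real service holds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordBox:
    """One word and its bounding box in page pixels (hypothetical OCR output)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def fit_size(page_w: int, page_h: int, max_w: int, max_h: int) -> Tuple[int, int]:
    """Scale a page to fit inside max_w x max_h while keeping its aspect ratio."""
    scale = min(max_w / page_w, max_h / page_h)
    return round(page_w * scale), round(page_h * scale)


def selected_text(words: List[WordBox], area: Tuple[int, int, int, int]) -> str:
    """Join, in input order, the words whose boxes overlap the highlighted area."""
    left, top, right, bottom = area
    hits = [
        w.text for w in words
        if w.left < right and w.right > left and w.top < bottom and w.bottom > top
    ]
    return " ".join(hits)


# Hypothetical page content: two words on the first line, one on the second.
words = [
    WordBox("Suez", 100, 50, 180, 80),
    WordBox("Canal", 190, 50, 280, 80),
    WordBox("Authority", 100, 90, 260, 120),
]
print(fit_size(2480, 3508, 800, 600))           # (424, 600)
print(selected_text(words, (90, 40, 300, 85)))  # Suez Canal
```

The same overlap test could serve the Search Web Service in reverse, mapping matched words back to rectangles to draw over the page image.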

Managing the Digitization of Large Press Archives
