Digital Content Creation
Rupesh Kumar A
Email: a.rupeshkumar@gmail.com
Digitization
• Digitization refers to the process of translating a piece of
information such as a book, journal articles, sound recordings,
pictures, audio tapes or video recordings, etc. into bits.
• Bits are the fundamental units of information in a computer
system.
• Converting information into these binary digits (bits) is called
digitization.
• The first step in digitization is scanning.
• When an object is scanned, it is converted into a digital image.
• A digital image is composed of a set of pixels (picture elements),
arranged according to a pre-defined ratio of columns and rows.
• An image file can be managed as a regular computer file and can be
retrieved, printed and modified using appropriate software.
• Images containing text can be converted into text files using a
process called Optical Character Recognition (OCR).
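To make the idea of pixels and bits concrete, the following minimal Python sketch stores a tiny black-and-white image as rows and columns of pixels and packs it into bytes. The 3 x 8 sample image and the function name `pack_bits` are assumptions made for illustration; they do not come from the slides.

```python
# A tiny black-and-white "scan": 3 rows x 8 columns, 1 = black pixel, 0 = white.
# The sample data is hypothetical and chosen only for illustration.
image = [
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
]


def pack_bits(rows):
    """Pack each row of 0/1 pixels into bytes, 8 pixels per byte, highest bit first."""
    packed = bytearray()
    for row in rows:
        for start in range(0, len(row), 8):
            chunk = row[start:start + 8]
            chunk = chunk + [0] * (8 - len(chunk))  # pad a short final chunk with white
            value = 0
            for bit in chunk:
                value = (value << 1) | bit
            packed.append(value)
    return bytes(packed)


row_count, column_count = len(image), len(image[0])
data = pack_bits(image)
print(f"{column_count} columns x {row_count} rows = "
      f"{row_count * column_count} pixels stored in {len(data)} bytes")
```

Each pixel here needs only one bit because the image is purely black and white; grayscale and colour images need more bits per pixel.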
OCR
• Optical Character Recognition, or OCR, is a technology that
enables a user to convert different types of documents, such as
scanned paper documents, PDF files or images captured by a
digital camera, into editable and searchable data.
• It is the mechanical or electronic conversion of images of typed,
handwritten or printed text into machine-encoded text, whether
from a scanned document, a photo of a document, or a scene photo.
Techniques in OCR
• Pre-processing
• Character recognition
• Post-processing
Pre-processing
• Pre-processing involves tasks that improve the accuracy of character
recognition.
• Pre-processing includes:
– De-skewing: rotating the image so that slanted characters become
perfectly horizontal or vertical
– Despeckling: removing positive and negative spots, smoothing edges
– Binarization: converting images to black and white
– Line removal: clearing non-character lines and boxes
– Line and word detection
– Script recognition: recognizing the script of the text
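Two of the pre-processing steps listed above, binarization and despeckling, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the fixed threshold of 128, the sample grayscale values and the function names are hypothetical, and the despeckle step removes only isolated black (positive) spots.

```python
def binarize(gray, threshold=128):
    """Map grayscale values (0-255) to 1 (black ink) or 0 (white paper)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def despeckle(binary):
    """Clear black pixels that have no black neighbour among the 8 surrounding pixels."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0  # an isolated spot is treated as noise
    return cleaned


# Hypothetical 4 x 5 grayscale scan: a small dark stroke plus one dark speck.
gray = [
    [250, 250, 250, 250, 250],
    [250,  30,  20, 250, 250],
    [250,  25, 250, 250, 250],
    [250, 250, 250, 250,  40],
]
binary = binarize(gray)
clean = despeckle(binary)
for row in clean:
    print("".join("#" if pixel else "." for pixel in row))
```

Production OCR systems usually choose the threshold adaptively rather than using a fixed value; the fixed value is used here only to keep the example short.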
Character Recognition
• Character recognition may involve:
– Matrix matching: comparing an image to a stored glyph on a
pixel-by-pixel basis. It is also known as “pattern matching” or
“image correlation”.
– Feature extraction: decomposing (dividing) glyphs into
features like lines, closed loops, line direction and line
intersections.
Matrix Matching
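Matrix matching, as described in the character recognition list, compares an unknown glyph with stored glyphs pixel by pixel. The sketch below is a minimal illustration, assuming all bitmaps are already the same size; the 3 x 3 templates and the names `TEMPLATES`, `match_score` and `recognize` are hypothetical.

```python
# Hypothetical 3 x 3 glyph templates; real systems store larger bitmaps per font.
TEMPLATES = {
    "I": [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]],
    "L": [[1, 0, 0],
          [1, 0, 0],
          [1, 1, 1]],
    "T": [[1, 1, 1],
          [0, 1, 0],
          [0, 1, 0]],
}


def match_score(image, template):
    """Return the fraction of pixels that agree between two same-sized bitmaps."""
    total = 0
    agree = 0
    for image_row, template_row in zip(image, template):
        for a, b in zip(image_row, template_row):
            total += 1
            if a == b:
                agree += 1
    return agree / total


def recognize(image):
    """Return (character, score) for the template with the highest pixel agreement."""
    scored = [(char, match_score(image, tpl)) for char, tpl in TEMPLATES.items()]
    return max(scored, key=lambda pair: pair[1])


# A "T" with one stray pixel in the bottom-right corner.
noisy_t = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 1]]
best_char, best_score = recognize(noisy_t)
print(best_char, round(best_score, 2))
```

Because the comparison is pixel by pixel, this approach works best when the input glyph has the same font and scale as the stored glyphs.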
Feature Extraction
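Feature extraction describes a glyph by properties such as closed loops rather than by raw pixels. The following sketch computes just one such feature, the number of closed loops, by flood-filling the background of a glyph bitmap. The sample glyphs and the function name `count_closed_loops` are assumptions made for illustration; a real recognizer would combine several features.

```python
def count_closed_loops(glyph):
    """Count background regions fully enclosed by ink (1 = ink, 0 = background)."""
    height, width = len(glyph), len(glyph[0])
    seen = [[False] * width for _ in range(height)]

    def flood(start_y, start_x):
        # Mark every background pixel connected (up, down, left, right) to the start.
        stack = [(start_y, start_x)]
        while stack:
            y, x = stack.pop()
            if not (0 <= y < height and 0 <= x < width):
                continue
            if seen[y][x] or glyph[y][x] == 1:
                continue
            seen[y][x] = True
            stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])

    # Background reachable from the border lies outside the glyph, so it is not a loop.
    for y in range(height):
        flood(y, 0)
        flood(y, width - 1)
    for x in range(width):
        flood(0, x)
        flood(height - 1, x)

    loops = 0
    for y in range(height):
        for x in range(width):
            if glyph[y][x] == 0 and not seen[y][x]:
                loops += 1      # a new enclosed background region
                flood(y, x)
    return loops


# Hypothetical sample glyphs.
glyph_o = [[0, 1, 1, 1, 0],
           [1, 0, 0, 0, 1],
           [1, 0, 0, 0, 1],
           [1, 0, 0, 0, 1],
           [0, 1, 1, 1, 0]]
glyph_8 = [[0, 1, 1, 1, 0],
           [1, 0, 0, 0, 1],
           [0, 1, 1, 1, 0],
           [1, 0, 0, 0, 1],
           [0, 1, 1, 1, 0]]
glyph_l = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]

for name, glyph in [("O", glyph_o), ("8", glyph_8), ("L", glyph_l)]:
    print(name, count_closed_loops(glyph))
```

Unlike matrix matching, a feature such as the loop count stays the same when the glyph is scaled or drawn in a different font, which is why feature-based recognition tends to generalize better.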
Post-processing
• The output stream may be a plain text stream or file of
characters.
• More sophisticated OCR systems can preserve the original
layout of the page.
OCR Software
• Tesseract
• Screenworm
• ABBYY FineReader
• FreeOCR
• SimpleOCR
• OmniPage
• GOCR
• Microsoft Office OneNote
Electronic Document
• An electronic document (e-document) is any electronic media content
that is intended to be used in either electronic form or as printed output.
• E-documents do not include computer programs or system files.
• E-documents come in a variety of file formats.
• Today, most e-document file formats have at least one file
viewer (e.g. Adobe Reader for PDF files).
• File format incompatibility poses a challenge for e-documents.
• Development of non-proprietary, standardized file formats is a solution
to tackle incompatibility (e.g. HTML, OpenDocument).
File Formats (in digitization)
• Several file formats are used for documents to be included in
digital libraries.
• The most common format is PDF.
• Other formats include:
– TIFF: Tagged Image File Format
– JPG (JPEG): Joint Photographic Experts Group
– PNG: Portable Network Graphics
– GIF: Graphics Interchange Format
– PS or EPS: PostScript or Encapsulated PostScript
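Each of the formats listed above can usually be recognized from the first few bytes of a file, often called its signature or magic number. The sketch below shows one way a digitization workflow might check incoming files; the function name `sniff_format` and the `SIGNATURES` table are illustrative assumptions, and the table covers only the common signatures of these formats.

```python
# Leading byte signatures for the formats listed above.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),        # little-endian byte order
    (b"MM\x00*", "TIFF"),        # big-endian byte order
    (b"%!PS", "PostScript/EPS"),
]


def sniff_format(header: bytes) -> str:
    """Guess a file format from the first bytes of the file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 16 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))


print(sniff_format(b"%PDF-1.7\n"))
print(sniff_format(b"GIF89a\x01\x00\x01\x00"))
print(sniff_format(b"plain text"))
```

Checking the signature is more reliable than trusting the file extension, since an extension can be missing or wrong.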
Portable Document Format (PDF)
• A file format used to present documents in a manner
independent of software, hardware, and operating systems.
• A PDF file encapsulates a complete description of a fixed-layout
flat document, including the text, fonts, graphics, and other
information needed to display it.
• A PDF file looks the same on a variety of computers,
irrespective of the operating system.
History
• PDF was developed by Adobe Systems in the early 1990s.
• Before the rise of the World Wide Web and HTML documents, PDF
was popular mainly in Desktop Publishing (DTP).
• PDF was a proprietary format controlled by Adobe until 2008.
• On July 1, 2008, it was released as an open standard and
published by ISO as ISO 32000-1:2008.
Technical Aspects of PDF
• PDF uses the following technologies:
– A subset of the PostScript page description programming language,
for generating the layout and graphics.
– A font-embedding/replacement system to allow fonts to travel
with the documents.
– A structured storage system to bundle these elements and any
associated content into a single file, with data compression where
appropriate.
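The data compression mentioned in the last item can be illustrated with Python's standard `zlib` module, which implements the deflate method that PDF uses for its Flate-compressed streams. This is only a sketch of the compression step, not a PDF writer: the page content string and the variable names are hypothetical, and no PDF file is produced.

```python
import zlib

# A hypothetical page description: draw the same line of text 50 times.
# The operators (BT, Tf, Td, Tj, ET) are PDF text operators.
content = b"BT /F1 12 Tf 72 720 Td (Digital Content Creation) Tj ET\n" * 50

# Compress the stream with deflate, as a Flate-compressed PDF stream would be.
compressed = zlib.compress(content)

# A PDF viewer reverses the step before interpreting the page description.
restored = zlib.decompress(compressed)

print(f"original: {len(content)} bytes, compressed: {len(compressed)} bytes")
print("round trip ok:", restored == content)
```

Repetitive content such as page descriptions compresses well, which is one reason the format applies compression to streams rather than storing them as plain text.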
Special Features
• PDF files may contain interactive elements such as
annotations, form fields, video and Flash animation. Such
files are called “Rich Media PDF”.
• A PDF file may be encrypted for security, or digitally signed
for authentication.
• PDF documents can contain display settings, including the
page display layout and zoom level.
Born Digital and Legacy Documents
• Born digital documents are resources or items created and
managed in digital form.
• They may be: digital photographs, digital documents,
harvested Web content, digital manuscripts, electronic
records, static data sets, digital art, digital media publications.
• Born digital documents can be easily processed for inclusion
in the digital library as they are natively in digital format.
Legacy documents
• Legacy documents are resources or items which are originally in ‘non-digital’
form and have to be converted into ‘digital’ form for inclusion in a digital
library.
• Photographs, documents, manuscripts, print records, art and media publications
are examples of legacy documents.
• The process of converting legacy documents into digital form to make them
compatible for digital libraries is known as ‘digitization’.
• Legacy documents pose a greater challenge for digital libraries as their
conversion to digital form is very tedious.
Scholarly Communication
• Scholarly communication is the process by which academics,
scholars and researchers share and publish their research findings
so that they are available to the wider academic community and
beyond.
• Scholarly communication is “the system through which research
and other scholarly writings are created, evaluated for quality,
disseminated to the scholarly community, and preserved for
future use.”
Scholarly Literature
• Writings in scholarly journals and books, and e-journals
• Reviews, preprints and working papers
• Writings in encyclopaedias and dictionaries, annotated
content and data
• Blogs, discussion forums, professional and scholarly hubs, and
conference papers
• Sound and video recordings
Terminology in Scholarly Communication
• Manuscript: a scholarly document which has not yet been
submitted for publication.
• Preprint: a scholarly document accepted for publication in a
journal or book; material accepted to be used in a presentation at
a conference.
• Article: a scholarly document which has been published.
• Paper: a scholarly document or material which has been
presented at a conference.
• E-Script: an electronic manuscript.
Electronic Publishing
• E-publishing includes the digital publication of e-books, digital
magazines, and the development of digital libraries and
catalogues.
• The electronic publishing process follows some aspects of the
traditional paper-based publishing process but differs from
traditional publishing in two ways:
– 1) it does not include using an offset printing press to print the final
product, and
– 2) it avoids the distribution of a physical product (e.g., paper books,
paper magazines, or paper newspapers).
• Because the content is electronic, it may be distributed over
the Internet and through electronic bookstores.
• Users can read the material on a range of electronic and
digital devices, including desktop computers, laptops, tablet
computers, smartphones or e-reader tablets.
E-Journal
• Electronic journals, also known as ejournals, e-journals, and electronic
serials, are scholarly journals or intellectual magazines that can be
accessed via electronic transmission.
• An e-journal closely resembles a print journal in structure, but will be in
electronic format.
• Often a journal article will be available for download in two formats - as a
PDF and in HTML format.
• E-journals allow new types of content to be included in journals, for
example video material, or the data sets on which research has been
based.
E-book
• An electronic book (or e-book) is a book publication made
available in digital form, consisting of text, images, or both,
readable on the flat-panel display of computers or other
electronic devices.
• An e-book may be an e-only book or an electronic version of a
printed book.
E-book File Formats
• PDF (.pdf)
• Open eBook (.opf)
• EPUB (.epub)
• Compiled HTML (.chm)
• DjVu (.djvu)
• Mobipocket (.mobi)

More Related Content

What's hot

Web scale discovery service
Web scale discovery serviceWeb scale discovery service
Web scale discovery serviceKankana Baishya
 
Collection development in digital libraries
Collection development in digital librariesCollection development in digital libraries
Collection development in digital librarieskawaagneK
 
Chain indexing
Chain indexingChain indexing
Chain indexingsilambu111
 
Institutional repositories
Institutional repositoriesInstitutional repositories
Institutional repositoriesSmita Chandra
 
Z39.50: Information Retrieval protocol ppt
Z39.50: Information Retrieval protocol pptZ39.50: Information Retrieval protocol ppt
Z39.50: Information Retrieval protocol pptSUNILKUMARSINGH
 
Digital library software
Digital library softwareDigital library software
Digital library softwareavid
 
Post coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information sciencePost coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information scienceharshaec
 
Modes of formation of subject
Modes of formation of subjectModes of formation of subject
Modes of formation of subjectaditi bhandarkar
 
User education in Libraries
User education in Libraries User education in Libraries
User education in Libraries Humayun Khan
 

What's hot (20)

Web scale discovery service
Web scale discovery serviceWeb scale discovery service
Web scale discovery service
 
Collection development in digital libraries
Collection development in digital librariesCollection development in digital libraries
Collection development in digital libraries
 
Digital Library Initiatives in India
Digital Library Initiatives in IndiaDigital Library Initiatives in India
Digital Library Initiatives in India
 
Chain indexing
Chain indexingChain indexing
Chain indexing
 
Use and user study
Use and user study Use and user study
Use and user study
 
Institutional repositories
Institutional repositoriesInstitutional repositories
Institutional repositories
 
Marc format
Marc formatMarc format
Marc format
 
Z39.50: Information Retrieval protocol ppt
Z39.50: Information Retrieval protocol pptZ39.50: Information Retrieval protocol ppt
Z39.50: Information Retrieval protocol ppt
 
Soul
Soul Soul
Soul
 
Koha presentation
Koha presentationKoha presentation
Koha presentation
 
Spiral of Scientific Method Arun Joseph MPhil
Spiral of Scientific Method   Arun Joseph MPhilSpiral of Scientific Method   Arun Joseph MPhil
Spiral of Scientific Method Arun Joseph MPhil
 
Digital library software
Digital library softwareDigital library software
Digital library software
 
Post coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information sciencePost coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information science
 
Modes of formation of subject
Modes of formation of subjectModes of formation of subject
Modes of formation of subject
 
Information Analysis Consolidation and Repackaging (IACR): an overview
Information Analysis Consolidation and Repackaging (IACR): an overviewInformation Analysis Consolidation and Repackaging (IACR): an overview
Information Analysis Consolidation and Repackaging (IACR): an overview
 
ISBD
ISBDISBD
ISBD
 
DELNET.pptx
DELNET.pptxDELNET.pptx
DELNET.pptx
 
Digital Library
Digital LibraryDigital Library
Digital Library
 
Dspace software
Dspace softwareDspace software
Dspace software
 
User education in Libraries
User education in Libraries User education in Libraries
User education in Libraries
 

Similar to Digital Content Creation

Cs8092 computer graphics and multimedia unit 4
Cs8092 computer graphics and multimedia unit 4Cs8092 computer graphics and multimedia unit 4
Cs8092 computer graphics and multimedia unit 4SIMONTHOMAS S
 
e-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currente-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currentpbajcsy
 
Chapter 7 : MAKING MULTIMEDIA
Chapter 7 : MAKING MULTIMEDIAChapter 7 : MAKING MULTIMEDIA
Chapter 7 : MAKING MULTIMEDIAazira96
 
chapter7-151010022348-lva1-app6892 (1).pptx
chapter7-151010022348-lva1-app6892 (1).pptxchapter7-151010022348-lva1-app6892 (1).pptx
chapter7-151010022348-lva1-app6892 (1).pptxJayasheelanP
 
Digitalization manual (2).pptx
Digitalization manual (2).pptxDigitalization manual (2).pptx
Digitalization manual (2).pptxawokeyirdaw1
 
Multimedia Presentation and Authoring
Multimedia Presentation and AuthoringMultimedia Presentation and Authoring
Multimedia Presentation and AuthoringTamanna Sehgal
 
Multimedia tech.sec a & b
Multimedia tech.sec a & bMultimedia tech.sec a & b
Multimedia tech.sec a & bSonu Sharma
 
Chapter 7
Chapter 7 Chapter 7
Chapter 7 carnillr
 
computer literacy chapter2.pptx
computer literacy chapter2.pptxcomputer literacy chapter2.pptx
computer literacy chapter2.pptxToobaFarooq10
 
E resources
E resourcesE resources
E resourcesavid
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practiceHelen Nneka Okpala
 
Daunted by digitization by Bruce Covington
Daunted by digitization by Bruce CovingtonDaunted by digitization by Bruce Covington
Daunted by digitization by Bruce CovingtonBruce Covington
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia WorkflowsDwight Kelly
 

Similar to Digital Content Creation (20)

Cs8092 computer graphics and multimedia unit 4
Cs8092 computer graphics and multimedia unit 4Cs8092 computer graphics and multimedia unit 4
Cs8092 computer graphics and multimedia unit 4
 
e-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Currente-Services to Keep Your Digital Files Current
e-Services to Keep Your Digital Files Current
 
Digitization
DigitizationDigitization
Digitization
 
DIGITAL LIBRARY
DIGITAL LIBRARYDIGITAL LIBRARY
DIGITAL LIBRARY
 
Chapter 7 : MAKING MULTIMEDIA
Chapter 7 : MAKING MULTIMEDIAChapter 7 : MAKING MULTIMEDIA
Chapter 7 : MAKING MULTIMEDIA
 
Chapter 7
Chapter 7Chapter 7
Chapter 7
 
chapter7-151010022348-lva1-app6892 (1).pptx
chapter7-151010022348-lva1-app6892 (1).pptxchapter7-151010022348-lva1-app6892 (1).pptx
chapter7-151010022348-lva1-app6892 (1).pptx
 
Digitalization manual (2).pptx
Digitalization manual (2).pptxDigitalization manual (2).pptx
Digitalization manual (2).pptx
 
Multimedia Presentation and Authoring
Multimedia Presentation and AuthoringMultimedia Presentation and Authoring
Multimedia Presentation and Authoring
 
Digital library software
Digital library softwareDigital library software
Digital library software
 
MULTMEDIA DATABASE.ppt
MULTMEDIA DATABASE.pptMULTMEDIA DATABASE.ppt
MULTMEDIA DATABASE.ppt
 
Multimedia tech.sec a & b
Multimedia tech.sec a & bMultimedia tech.sec a & b
Multimedia tech.sec a & b
 
Chapter 7
Chapter 7 Chapter 7
Chapter 7
 
computer literacy chapter2.pptx
computer literacy chapter2.pptxcomputer literacy chapter2.pptx
computer literacy chapter2.pptx
 
E resources
E resourcesE resources
E resources
 
Digitization in theory and practice
Digitization in theory and practiceDigitization in theory and practice
Digitization in theory and practice
 
new one
new onenew one
new one
 
Daunted by digitization by Bruce Covington
Daunted by digitization by Bruce CovingtonDaunted by digitization by Bruce Covington
Daunted by digitization by Bruce Covington
 
Unit 78 technical file
Unit 78 technical fileUnit 78 technical file
Unit 78 technical file
 
Crossmedia Workflows
Crossmedia WorkflowsCrossmedia Workflows
Crossmedia Workflows
 

More from Dept of Library and Information Science Tumkur University

More from Dept of Library and Information Science Tumkur University (20)

Institutional Repositories and Open Access Movement
Institutional Repositories and Open Access MovementInstitutional Repositories and Open Access Movement
Institutional Repositories and Open Access Movement
 
Digital Library Software
Digital Library SoftwareDigital Library Software
Digital Library Software
 
Digital Content Management
Digital Content ManagementDigital Content Management
Digital Content Management
 
Digital Preservation
Digital PreservationDigital Preservation
Digital Preservation
 
Digital Library Architecture
Digital Library ArchitectureDigital Library Architecture
Digital Library Architecture
 
Interoperability in Digital Libraries
Interoperability in Digital LibrariesInteroperability in Digital Libraries
Interoperability in Digital Libraries
 
International Digital Library Initiatives
International Digital Library InitiativesInternational Digital Library Initiatives
International Digital Library Initiatives
 
Evolution of Digital Libraries
Evolution of Digital LibrariesEvolution of Digital Libraries
Evolution of Digital Libraries
 
Digital Library Conferences
Digital Library ConferencesDigital Library Conferences
Digital Library Conferences
 
Basic Concepts of Digital Library
Basic Concepts of Digital LibraryBasic Concepts of Digital Library
Basic Concepts of Digital Library
 
Types of Libraries
Types of LibrariesTypes of Libraries
Types of Libraries
 
Resource Sharing and Networking
Resource Sharing and NetworkingResource Sharing and Networking
Resource Sharing and Networking
 
Basics of Research
Basics of ResearchBasics of Research
Basics of Research
 
Historical Method of Research
Historical Method of ResearchHistorical Method of Research
Historical Method of Research
 
Five Laws of Library Science
Five Laws of Library ScienceFive Laws of Library Science
Five Laws of Library Science
 
Library Classification
Library ClassificationLibrary Classification
Library Classification
 
How to create a filter for mails in GMail
How to create a filter for mails in GMailHow to create a filter for mails in GMail
How to create a filter for mails in GMail
 
How to add custom signature in GMail
How to add custom signature in GMailHow to add custom signature in GMail
How to add custom signature in GMail
 
How to attach a file with a mail in GMail
How to attach a file with a mail in GMailHow to attach a file with a mail in GMail
How to attach a file with a mail in GMail
 
How to create a new email account using GMail
How to create a new email account using GMailHow to create a new email account using GMail
How to create a new email account using GMail
 

Recently uploaded

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 

Recently uploaded (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 

Digital Content Creation

  • 2. Digitization • Digitization refers to the process of translating a piece of information such as a book, journal articles, sound recordings, pictures, audio tapes or video recordings, etc. into bits. • Bits are the fundamental units of information in a computer system. • Converting information into these binary digits (bits) is called digitisation. • Thefirst step in digitizationis scanning.
  • 3. • Whenanobjectisscanned,itisconverted intoadigitalimage. • A digital image is composed of a set of pixels (picture elements), arrangedaccording toapre-definedratioofcolumnsandrows. • An image file can be managed as regular computer file and can be retrieved,printedandmodifiedusing appropriatesoftware. • Images containing text can be converted into text files using a process calledOpticalCharacterRecognition(OCR).
  • 4. OCR • Optical Character Recognition, or OCR, is a technology that enables a user to convert different types of documents, such as scanned paper documents, PDF files or images captured by a digitalcameraintoeditableand searchable data. • The mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether fromascanned document,aphotoofadocument,ascene-photo.
  • 5. Techniquesin OCR • Pre-processing • Character recognition • Post-processing
  • 6. Pre-processing • Pre-processing involves certain tasks to improve character recognition and its accuracy. • Pre-processing includes • de-skewing: setting the characters perfectly horizontal or vertical if they are slant • Despeckle: removing positive and negative spots, smoothing edges • Binarization:converting images to b&w • Line removal: clearing non-character lines andboxes • Line and word detection • Script recognition: recognizing the script of the text
  • 7. CharacterRecognition • Character recognition may involve: • Matrix matching:comparing an image to a stored glyph on a pixel-by-pixelbasis. • It is also knownas “patternmatching” or“image correlation”. • Featureextraction: decomposing (dividing) glyphs into featureslikelines, closed loops, linedirection and line intersections.
  • 10. Post-processing • The output stream may be a plain text stream or fileof characters. • More sophisticated OCR systems can preserve the original layoutof thepage.
  • 11. OCR Software • Tesseract • Screenworm • ABBYY FineReader • FreeOCR • SimpleOCR • OmniPage • GOCR • Microsoft OfficeOnenote
  • 12. ElectronicDocument • Any electronic media content which is intended to be used in either electronic form or as printed output. • E-documents donot include computer programs or system files. • E-documents come in a varietyof file formats. • Today, most e-docs in different file formats will have at least one file viewer (e.g. Adobe Reader for PDFfiles). • File format incompatibility poses achallenge for e-docs. • Development of non-proprietary, standardized file formats is a solution to tackle incompatibility (e.g. HTML, OpenDocument).
  • 13. FileFormats (in digitization) • Several fileformats are used for documentsto be included in digital libraries. • Most common formatis PDF. • Other formats include: – TIFF: Tagged Image File Format – JPG (JPEG): Joint Photographic Experts Group – PNG: Portable Network Graphics – GIF: Graphics Interchange Format – PS or EPS: PostScript or Encapsulated PostScript
  • 14. PortableDocumentFormat • A file format used to present documents in a manner independentof software, hardware, and operating systems. • PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other informationneededtodisplay it. • A PDF file will look the same way on a variety of computers irrespective of operating systems.
  • 15. History • PDFwas developedby AdobeCorporation in early 1990s. • Before the emergence of World Wide Web and HTML format, PDF waspopularin DesktopPublishing(DTP). • PDFwasaproprietary formatcontrolledby Adobetill2008. • On July 1, 2008, it was released as an open standard and published by ISO as ISO 32000-1:2008.
  • 16. TechnicalAspectsof PDF • PDFuses the followingtechnologies: – PostScript page description programming language, for generating the layout and graphics. – A font-embedding/replacement system to allow fonts to travel with the documents – A structured storage system to bundle these elements and any associated content into a single file, with data compression where appropriate.
  • 17. SpecialFeatures • PDF files may contain interactive elements such as annotations, form fields, video and Flash animation. Such filesare called “RichMediaPDF”. • A PDF file may be encrypted for security, or digitally signed for authentication. • PDF documents can contain display settings, including the pagedisplay layout and zoom level.
  • 18. Born digital and legacy documents • Born digital documents are resources or items created and managed in digital form. • They may be: digital photographs, digital documents, harvested Web content, digital manuscripts, electronic records, static data sets, digital art, digital media publications. • Born digital documents can be easily processed for inclusion in the digital library as they are natively in digital format.
  • 19. Legacy documents • Legacy documents are resources or items which are originally in ‘non-digital’ form and have to be converted into ‘digital’ form for inclusion in a digital library. • Photographs, documents, manuscripts, print records, art and media publications are examples of legacy documents. • The process of converting legacy documents into digital form to make them compatible for digital libraries is known as ‘digitization’. • Legacy documents pose a greater challenge for digital libraries as their conversion to digital form is very tedious.
  • 20. Scholarly Communication • Scholarly communication is the process by which academics, scholars and researchers share and publish their research findings so that they are available to the wider academic community and beyond. • Scholarly communication is “the system through which research and other scholarly writings are created, evaluated for quality, disseminated to the scholarly community, and preserved for future use.”
  • 21. Scholarly Literature • Writings in scholarly journals and books, e-journals • Reviews, preprints and working papers • Writings in encyclopaedias and dictionaries, annotated content, data • Blogs, discussion forums, professional and scholarly hubs, and conference papers • Sound and video recordings
  • 22. Terminology in Scholarly Communication • Manuscript: a scholarly document which has not yet been submitted for publication. • Preprint: a scholarly document accepted for publication in a journal or book; material accepted to be used in a presentation at a conference. • Article: a scholarly document which has been published. • Paper: a scholarly document or material which has been presented at a conference. • E-Script: an electronic manuscript.
  • 23. Electronic Publishing • E-publishing includes the digital publication of e-books, digital magazines, and the development of digital libraries and catalogues. • The electronic publishing process follows some aspects of the traditional paper-based publishing process but differs from traditional publishing in two ways: – 1) it does not include using an offset printing press to print the final product and – 2) it avoids the distribution of a physical product (e.g., paper books, paper magazines, or paper newspapers).
  • 24. • Because the content is electronic, it may be distributed over the Internet and through electronic bookstores. • Users can read the material on a range of electronic and digital devices, including desktop computers, laptops, tablet computers, smartphones or e-reader tablets.
  • 25. E-Journal • Electronic journals, also known as ejournals, e-journals, and electronic serials, are scholarly journals or intellectual magazines that can be accessed via electronic transmission. • An e-journal closely resembles a print journal in structure, but will be in electronic format. • Often a journal article will be available for download in two formats: as a PDF and in HTML format. • E-journals allow new types of content to be included in journals, for example video material, or the data sets on which research has been based.
  • 26. E-book • An electronic book (or e-book) is a book publication made available in digital form, consisting of text, images, or both, readable on the flat-panel display of computers or other electronic devices. • An e-book may be an e-only book or an electronic version of a printed book.
  • 27. E-book file formats • PDF (.pdf) • Open eBook (.opf) • EPUB (.epub) • Compiled HTML (.chm) • DjVu (.djvu) • Mobipocket (.mobi)
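The extension list on this slide can be turned into a simple lookup, for example when sorting incoming files for a digital library. This is a minimal sketch under the assumption that the file extension is trustworthy; the names `EBOOK_FORMATS` and `ebook_format` are hypothetical.

```python
from pathlib import PurePath

# Extensions and format names exactly as listed on the slide.
EBOOK_FORMATS = {
    ".pdf": "PDF",
    ".opf": "Open eBook",
    ".epub": "EPUB",
    ".chm": "Compiled HTML",
    ".djvu": "DjVu",
    ".mobi": "Mobipocket",
}


def ebook_format(filename: str) -> str:
    """Map a file name to an e-book format name by its extension."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive match
    return EBOOK_FORMATS.get(suffix, "unknown")
```

An extension only hints at the format, so a library workflow would normally confirm it by inspecting the file contents as well.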

Editor's Notes

  1. http://www.oclc.org/content/dam/research/activities/hiddencollections/borndigital.pdf