Government Engineering College, Hassan 573 201
Visvesvaraya Technological University, Belgaum




               Project Report on

              Digital Reader

            4GH08CS010       Chethan.H.A

            4GH09CS402      Gowtham.A.M

            4GH08CS034       Pavan.P.Naik

            4GH08CS058        Yogesh.K.S


                Under the Guidance of

              Mr. Annaiah H, B.E., M.Tech.,
                    Asst. Professor
        Dept. of Computer Science & Engineering
                     GEC, Hassan




Department of Computer Science & Engineering
   Government Engineering College Hassan
                June, 2012
Government Engineering College, Hassan 573 201
   Visvesvaraya Technological University, Belgaum




                                   Certificate
   This is to certify that the project work entitled “Digital Reader” is a bona fide work
carried out by Chethan H.A (4GH08CS010), Gowtham A.M (4GH09CS402), Pavan P
Naik (4GH08CS034) and Yogesh K.S (4GH08CS058) in partial fulfillment of the requirements
for the award of the degree of Bachelor of Engineering in Computer Science & Engineering
of Visvesvaraya Technological University, Belgaum, during the year 2011 - 2012. It is
certified that all corrections / suggestions indicated during internal evaluation have been
incorporated in the report. The project report has been approved as it satisfies the academic
requirements in respect of the project work prescribed for the Bachelor of Engineering Degree.

   Guide                                          Head of the Department
   Mr. Annaiah H, B.E., M.Tech.                   Dr. K.C. Ravishankar, B.E., M.Tech., Ph.D.
   Assistant Professor                            Professor and Head
   Dept of CS & E                                 Dept of CS & E
   GEC, Hassan 573 201                            GEC, Hassan 573 201



                                     Principal
                          Dr. Karisiddappa, B.E., M.Tech., Ph.D.
                                     Principal
                               GEC, Hassan 573 201

                                          Examiners : 1.

                                                          2.
    Date :
    Place : Hassan
Acknowledgement
   At the outset we express our most sincere and grateful thanks to our guide,
Mr. Annaiah H, Assistant Professor, Department of CS & E, for his continuous
support and advice, not only during the course of our project but also
throughout our stay at GECH.

We express our gratitude to Dr.K.C. Ravishankar, Professor and Head,
Department of CS & E for his encouragement and support throughout our
work.

We wish to express our thanks to our beloved Principal, Dr.Karisiddappa,
for encouragement throughout our studies.

Finally we express our gratitude to all teaching and non-teaching staff of
Dept. of CS & E, fellow classmates and our parents for their timely support
and suggestions.




Chethan H.A

Gowtham A.M

Pavan P Naik

Yogesh K.S




Table of Contents

Table of Contents

List of Figures

Abstract

1 Introduction
  1.1 Problem Statement
  1.2 Objectives
  1.3 Motivation
  1.4 Applications
  1.5 Organization of the report

2 Document Readers
  2.1 Overview

3 Requirement Analysis
  3.1 Literature Survey
  3.2 Technologies Used
  3.3 JPedal Library
  3.4 Eclipse
  3.5 Java Runtime Environment (JRE)
  3.6 Hardware Requirements
  3.7 Software Requirements

4 Design and Implementation
  4.1 Model-View-Controller (MVC)
  4.2 Developing an HTML Viewer
  4.3 Developing an XML Viewer
  4.4 Developing a Text Viewer
  4.5 Developing an RTF Viewer
  4.6 PDF Viewer
  4.7 Text Search
  4.8 File Search
  4.9 Image Viewer

5 Results and analysis
  5.1 Testing
  5.2 Types of Testing
  5.3 Testing with JUnit using Eclipse

6 Conclusions and Future Enhancements

References




List of Figures

3.1    Workspace launcher
3.2    Creating New Project
3.3    Creating Package
3.4    Create Java class
3.5    Run Java Project
3.6    Creating the JAR file
3.7    Adding a library (.jar) to project

4.1    MVC Architecture
4.2    HTML Viewer
4.3    XML Viewer
4.4    Text Viewer and editor
4.5    RTF Viewer
4.6    PDF Viewer
4.7    Text search
4.8    File search
4.9    Image Viewer




Abstract
    Digital Reader is an application for viewing, editing and managing files
in Text (.txt), Portable Document Format (.pdf), Rich Text Format (.rtf),
HyperText Markup Language (.html) and Extensible Markup Language (.xml)
formats, as well as image file formats such as JPG. Digital Reader can be used
on any operating system and lets users view and edit .txt, .pdf, .rtf, .html
and .xml files and a few image file formats. The graphical user interface of
Digital Reader can be changed to suit the user's wishes. The user can select a
document format to view or edit within this single platform, which simplifies
the viewing and editing process.

Digital Reader works with discrete representations: the information represented
is discrete, such as numbers, letters, computer icons or images. Digital Reader
is a multi-format digital document reader built in the platform-independent
Java language, with a user interface designed using the Swing packages. Several
readers are available that each support one or another readable document
format, but we found none that supports multiple formats such as PDF, RTF,
HTML, XML and text. This software enables the user to read multiple document
formats with a single application. Java was chosen as the development language
because it is platform independent and supports rich user interface design
through its Swing package. The application saves the system resources otherwise
spent launching different applications for different documents, since one
application can handle several document formats.




Chapter 1

Introduction

Digital documents are now used more widely than paper documents, since they save both
time and money and are also environmentally friendly: making paper involves cutting
down trees. Once digital documents are created, a digital reader is needed to read them
and to use them for other purposes such as printing and editing. Several reader
applications are available for this purpose, but the readers differ from format to format,
so many separate applications exist for viewing documents. This application saves the
system resources otherwise spent launching different applications for different documents,
since a single application can handle several document formats.

A filename extension is a suffix (separated from the base filename by a dot) to the name
of a computer file, applied to indicate the encoding (file format) of its contents or usage.
Examples of filename extensions are .htm, .pdf, .xml and .txt. Filename extensions can be
considered a type of metadata. They are commonly used to imply information about the
way data might be stored in the file.

Formatted text, styled text or rich text, as opposed to plain text, has styling information
beyond the minimum of semantic elements: colours, styles (boldface, italic), sizes and
special features (such as hyperlinks).
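
As a minimal sketch of how a multi-format reader might use the extension as metadata, the following Java fragment extracts the extension from a bare file name and maps it to a viewer label. The class name `ExtensionDispatcher`, its method names and the viewer labels are hypothetical and chosen only for illustration; they are not taken from the project's source.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a filename extension to a viewer label.
final class ExtensionDispatcher {

    // Illustrative mapping only; the labels are assumptions, not project class names.
    private static final Map<String, String> VIEWERS = Map.of(
            "txt", "Text Viewer",
            "pdf", "PDF Viewer",
            "rtf", "RTF Viewer",
            "html", "HTML Viewer",
            "htm", "HTML Viewer",
            "xml", "XML Viewer",
            "jpg", "Image Viewer");

    // Returns the lower-cased suffix after the last dot, or "" if there is none.
    // Expects a bare file name without directory components.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0 || dot == fileName.length() - 1) {
            return ""; // no dot, a leading-dot name such as ".profile", or a trailing dot
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Unknown or missing extensions fall back to the plain text viewer.
    static String viewerFor(String fileName) {
        return VIEWERS.getOrDefault(extensionOf(fileName), "Text Viewer");
    }

    public static void main(String[] args) {
        System.out.println(viewerFor("report.PDF"));
        System.out.println(viewerFor("notes"));
    }
}
```

Because an extension only implies the format, a more careful reader would also inspect the file contents before choosing a viewer.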

Formatted text cannot simply be equated with binary files, nor treated as distinct from
ASCII text. Formatted text is not necessarily binary: it may be text-only, such as HTML,
RTF or enriched text files, and it may be ASCII-only. Conversely, a plain text file may be
non-ASCII (in an encoding such as Unicode UTF-8). Text-only formatted text is achieved by
markup, which is itself textual, while some editors of formatted text, such as Microsoft
Word, save in a binary format. For viewing Portable Document Format (PDF) files there are
readers such as Adobe Acrobat, Foxit Reader and Nitro PDF Reader. For viewing Rich Text
Format (RTF) documents there is software such as WordPad, AbiWord and MS Word. Similarly,
various other software exists for viewing other document formats.

Adobe Acrobat is a family of application software developed by Adobe Systems to view,
create, manipulate, print and manage files in Portable Document Format (PDF). All
members of the family, except Adobe Reader (formerly Acrobat Reader), are proprietary
commercial software, while the latter is available as freeware and can be downloaded
from Adobe’s web site. Adobe Reader enables users to view and print PDF files but has
negligible PDF creation capabilities. Acrobat and Reader are widely used as a way to
present information with a fixed layout similar to a paper publication.

Microsoft Office Word is a word processor designed by Microsoft. It was first released in
1983 under the name Multi-Tool Word for Xenix systems. Subsequent versions were later
written for several other platforms including IBM PCs running DOS (1983), the Apple
Macintosh (1984), the AT&T Unix PC (1985), Atari ST (1986), SCO UNIX, OS/2, and
Microsoft Windows (1989). It is a component of the Microsoft Office software system; it
is also sold as a standalone product and included in Microsoft Works Suite.




1.1     Problem Statement
There is always a need for an application that can open virtually any document format,
including PDF, TXT, RTF, HTML, XML and image files, and that can be used on any operating
system, provided a Java Runtime Environment (JRE) exists on the host operating system.




1.2     Objectives
This project presents a simple application called Digital Reader (DR), which works on
different operating systems and platforms and supports the view modes and corresponding
file formats listed below. The goal of the project is to create a single application for
reading multiple document formats.

   • Text/Binary: all files (unlimited size)

   • RTF: RTF/UTF-8 encoded files

   • Images: JPG

   • Internet: all file types supported by web browsers (HTML, XML, etc.)

   • Portable Document Format (PDF) with a suitable view

   • Support for a Microsoft Excel viewer

Digital Reader is Unicode-compatible and can be opened with a single mouse click.




1.3     Motivation
We found no single application that supports most of the common document formats. This
was our motivation for developing such an application.




1.4     Applications
Some of the applications can be listed as follows:

   • Support for multiple document formats such as text, PDF, RTF, HTML and XML.

   • Additional utilities such as File Search and Text Search.

   • Comparators for editable and non-editable documents.

   • Editors with Enable/Disable option.


1.5     Organization of the report
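The remainder of this report is organized as follows. Chapter 2 gives an overview of
document readers. Chapter 3 covers the requirement analysis, including the literature
survey, the technologies used and the hardware and software requirements. Chapter 4
describes the design and implementation of the viewers and search utilities. Chapter 5
presents the results and testing. Chapter 6 gives conclusions and future enhancements.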




Chapter 2

Document Readers

2.1     Overview
A Document Reader is application software that presents the data stored in a computer
file in a human-friendly form. The file contents are generally displayed on the screen,
or they may be printed. They may also be read aloud using speech synthesis.

Document Readers typically do not edit files, yet it is common for them to be able to
save data in a different file format, or to copy information from the viewed file to the
system-wide clipboard. They must have sufficient knowledge about the format of the file
to be viewed. Even plain text files are not so simple: file viewers may have to handle
different code pages and newline styles.

A fundamental type of Document Reader is a filter that translates binary files into plain
text (one example is antiword). However, depending on the competence of the translating
routines, some information may be lost. Disassemblers also fall into this category.

Another common type of Document Reader is a picture viewer that can display picture files
of various formats. Common features here are thumbnail preview and creation, and image
zooming.
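
The point about code pages and newline styles can be illustrated with a short Java sketch. It decodes raw bytes with a caller-supplied charset and normalizes Windows (CRLF) and old Macintosh (CR) line endings to LF. The class and method names are hypothetical, and how the charset is chosen (user setting, byte-order mark, heuristics) is left open as an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for a plain text viewer.
final class PlainTextDecoder {

    // Decodes raw bytes with the given charset and converts CRLF and lone CR to LF.
    static String decode(byte[] raw, Charset charset) {
        String text = new String(raw, charset);
        // Replace CRLF first so that a CRLF pair does not become two line breaks.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        byte[] windowsStyle = "line one\r\nline two\r\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(decode(windowsStyle, StandardCharsets.UTF_8));
    }
}
```

Decoding the same bytes with the wrong charset still succeeds but yields the wrong characters, which is why a viewer needs to know, or let the user choose, the code page.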

For more complex or proprietary file formats, file viewers are usually provided by the
same companies that make the editing software using those formats (viewers may be
distributed free of charge, while editors have to be bought).

A conventional Document Reader is limited-functionality software in the sense that it
does not have the capability to create a file or modify the content of an existing one.
Instead, it is used only to display or print the content. The primary reason behind this
missing functionality is often marketing and control. For example, a popular software
program, Adobe Acrobat, can be used to create content for most computer platforms, under
various operating systems. To ensure that people can access the documents created with
Adobe Acrobat, the software publisher created a viewer program, the Acrobat Reader, and
made it available for free. This viewer application allows the content created by the
proprietary authoring software to be readable on all supported operating-system platforms,
free of charge, thus making it a more attractive solution.

There are many products that can qualify as a file viewer: Microsoft Word viewer or
Microsoft PowerPoint viewer, and the OpenOffice equivalents are examples. In a sense,
a web browser is a type of Document Reader, which translates, or renders, the HTML
markups into a human-friendly presentation. Although HTML is plain text, viewing an
HTML file in a browser and in a text editor produces significantly different results.

Although web browsers are arguably the best Document Readers, since they support many
graphic, multimedia, and document formats, they are likely always to fall somewhat short
of the output quality and feature performance of more specialized software packages.
Creating and using alternative publishing systems and their accompanying Document Readers
still makes a lot of business sense, given the content and presentation control they
provide.




Chapter 3

Requirement Analysis

3.1     Literature Survey
To study and analyze the various document formats, the following literature survey was
carried out and is discussed in this chapter.




3.1.1    Portable Document Format(PDF)
Portable Document Format (PDF) is an open standard for document exchange. This
file format, created by Adobe Systems in 1993, is used for representing documents in
a manner independent of application software, hardware, and operating systems. Each
PDF file encapsulates a complete description of a fixed-layout flat document, including
the text, fonts, graphics, and other information needed to display it. While the PDF
specification has been available free of charge since at least 2001, PDF was originally a
proprietary format controlled by Adobe. It was officially released as an open standard
on July 1, 2008, and published by the International Organization for Standardization as
ISO 32000-1:2008. In 2008, Adobe published a Public Patent License to ISO 32000-1
granting royalty-free rights for all patents owned by Adobe that are necessary to make,
use, sell and distribute PDF-compliant implementations.




3.1.2    Rich Text Format(RTF)
The Rich Text Format (often abbreviated RTF) is a proprietary document file format
with published specification developed by Microsoft Corporation since 1987 for Microsoft
products and for cross-platform document interchange.



Most word processors are able to read and write some versions of RTF. There are sev-
eral different revisions of RTF specification and portability of files will depend on what
version of RTF is being used. RTF specifications are changed and published with major
Microsoft Word and Office versions.

It should not be confused with enriched text (MIME type “text/enriched” of RFC 1896)
or its predecessor Rich Text (MIME type “text/richtext” of RFC 1341 and 1521), nor with
IBM’s RFT-DCA (Revisable Format Text-Document Content Architecture), which are
completely different specifications.



3.1.3     Hypertext Markup Language (HTML)
HTML is the markup language used for most web pages. E-books using HTML can be
read using a Web browser. The specifications for the format are available without charge
from the W3C.

HTML adds specially marked meta-elements to otherwise plain text encoded using char-
acter sets like ASCII or UTF-8. As such, suitably formatted files can be, and sometimes
are, generated by hand using a plain text editor or programmer’s editor. Many HTML
generator applications exist to ease this process and often require less intricate knowledge
of the format details involved.

HTML on its own is not a particularly efficient format to store information in, requiring
more storage space for a given work than many other formats. However, several e-Book
formats including the Amazon Kindle, Open eBook, Compressed HM, Mobipocket and
EPUB store each book chapter in HTML format, then use ZIP compression to compress
the HTML data, images, metadata and style sheets into a single, significantly smaller
file.

HTML files encompass a wide range of standards, and displaying HTML files correctly
can be complicated. Additionally, many of the features supported, such as forms, are not
relevant to e-books.



3.1.4     Text Document(.txt)
A text file (sometimes spelled “textfile”; an old alternate name is “flatfile”) is a kind
of computer file that is structured as a sequence of lines of electronic text. A text file
exists within a computer file system. The end of a text file is often denoted by placing one
or more special characters, known as an end-of-file marker, after the last line in a text file.

“Text file” refers to a type of container, while plain text refers to a type of content.
Text files can contain plain text, but they are not limited to such. At a generic level of
description, there are two kinds of computer files: text files and binary files.




3.1.5    Extensible Markup Language (.xml)
Extensible Markup Language (XML) is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable and machine-readable. It
is defined in the XML 1.0 Specification produced by the W3C, and several other related
specifications, all gratis open standards.

The design goals of XML emphasize simplicity, generality, and usability over the In-
ternet. It is a textual data format with strong support via Unicode for the languages of
the world. Although the design of XML focuses on documents, it is widely used for the
representation of arbitrary data structures, for example in web services.

Many application programming interfaces (APIs) have been developed for software de-
velopers to use to process XML data, and several schema systems exist to aid in the
definition of XML-based languages.
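
As one illustration of such an API, the sketch below uses the DOM parser that ships with the Java platform (`javax.xml.parsers`) to turn an XML string into an indented outline of element names, roughly the kind of tree an XML viewer could display. The class name `XmlOutline` and the outline format are assumptions made for this example, not the project's actual viewer code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: renders the element tree of an XML document as indented text.
final class XmlOutline {

    // Parses the XML text and returns one line per element, indented by nesting depth.
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first traversal that skips text, comment and other non-element nodes.
    private static void walk(Element element, int depth, StringBuilder out) {
        out.append("  ".repeat(depth)).append(element.getTagName()).append('\n');
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) {
        System.out.print(outline("<book><title>Digital Reader</title><chapter/></book>"));
    }
}
```

DOM loads the whole document into memory, which is convenient for a tree view; a streaming API such as SAX or StAX would be the usual alternative for very large files.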

As of 2009, hundreds of XML-based languages have been developed, including RSS,
Atom, SOAP, and XHTML. XML-based formats have become the default for many office-
productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org and
LibreOffice (OpenDocument), and Apple’s iWork. XML has also been employed as the
base language for communication protocols, such as XMPP.




3.1.6    Image viewer
An image viewer or image browser is a computer program that can display stored graphical
images; it can often handle various graphics file formats. Such software usually renders
the image according to properties of the display such as color depth, display resolution,
and color profile.

Although you may use a full-featured bitmap graphics editor (such as Photoshop or
the GIMP or the StylePix) as an image viewer, these have many editing functionalities
which are not needed for just viewing images, and therefore usually start rather slowly.
Also, most viewers have functionalities that editors usually lack, such as stepping through
all the images in a directory (possibly as a slideshow).

Image viewers give maximal flexibility to the user by providing a direct view of the
directory structure available on a hard disk. Most image viewers do not provide any kind
of automatic organization of pictures and therefore the burden remains on the user to
create and maintain their folder structure (using tag- or folder-based methods). How-
ever, some image viewers also have features for organizing images, especially an image
database, and hence can also be used as image organizers.

Some image viewers, such as Windows Photo Viewer that comes with Windows oper-
ating systems, change a JPEG image if it is rotated, resulting in loss of image quality;
others offer lossless rotation.



3.1.7    File Search
Desktop search is the name for the field of search tools which search the contents of a
user’s own computer files, rather than searching the Internet. These tools are designed to
find information on the user’s PC, including web browser histories, e-mail archives, text
documents, sound files, images and video.

One of the main advantages of desktop search programs is that search results arrive
in a few seconds; Windows search companion “can be some help, but it searches through
Windows files and folders only, not e-mail or contact databases, and unless you enable
the Indexing Service (in Windows 2000 or XP), the Windows search tool is extremely
slow.” Windows Vista enables the indexing service by default.

Desktop search is emerging as a concern for large firms for two main reasons: untapped
productivity and security. A commonly cited statistic states that 80 per cent of a
company’s data is locked up inside unstructured data: the information stored on an end
user’s PC, the files and directories they’ve created on a network, documents stored in
repositories such as corporate intranets and a multitude of other locations. Moreover,
many companies have structured or unstructured information stored in older file formats
to which they don’t have ready access.

Desktop search engines build and maintain an index database to achieve reasonable
performance when searching several gigabytes of data. Indexing usually takes place when
the computer is idle and most search applications can be set to suspend it if a portable
computer is running on batteries, in order to save power. When indexing the files, desktop
search tools collect three types of information about files:

   • file and directory names.

   • metadata, such as titles, authors, comments in file types such as MP3, PDF and
     JPEG.

   • content of supported documents.

Besides programs that use indexing, there are many programs that open and search files
directly, without an index. Their disadvantage is that they can search only a given
directory, not the entire computer, but their great advantage is that they do not burden
the computer's resources with indexing. Furthermore, they always see the current state of
the documents.
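
A non-indexed search of this kind can be sketched in Java with `java.nio.file.Files.walk`, which visits every entry under a starting directory at the time of the query. The class name `DirectoryFileSearch`, the case-insensitive substring match on the file name and the default arguments in `main` are illustrative assumptions, not a description of the project's File Search utility.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: searches one directory tree by file name, without any index.
final class DirectoryFileSearch {

    // Returns the regular files under root whose name contains the query, ignoring case.
    static List<Path> search(Path root, String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).contains(needle))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        String query = args.length > 1 ? args[1] : ".txt";
        search(root, query).forEach(System.out::println);
    }
}
```

Each call re-reads the directory tree, so results always reflect the current files, at the cost of a full traversal per query.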



3.1.8    Text Search
A search engine is an information retrieval system designed to help find information
stored on a computer system. The search results are usually presented in a list and are
commonly called hits. Search engines help to minimize the time required to find informa-
tion and the amount of information which must be consulted, akin to other techniques
for managing information overload.

The most public, visible form of a search engine is a Web search engine which searches
for information on the World Wide Web.

Search engines provide an interface to a group of items that enables users to specify
criteria about an item of interest and have the engine find the matching items. The
criteria are referred to as a search query. In the case of text search engines, the search
query is typically expressed as a set of words that identify the desired concept that one
or more documents may contain. There are several styles of search query syntax that
vary in strictness. It can also switch names within the search engines from previous sites.
Whereas some text search engines require users to enter two or three words separated by
white space, other search engines may enable users to specify entire documents, pictures,
sounds, and various forms of natural language. Some search engines apply improvements
to search queries to increase the likelihood of providing a quality set of items through a
process known as query expansion.
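
The idea of a query expressed as words separated by white space can be sketched as follows. This is only a minimal illustration under our own assumptions: the class SimpleTextSearch is hypothetical, documents are held in memory as strings, and a document counts as a hit when it contains every query word as a substring, ignoring case.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SimpleTextSearch {
    // Returns the names of all documents that contain every word of the
    // query (the "hits"). Matching is by case-insensitive substring.
    static List<String> search(Map<String, String> documents, String query) {
        String[] terms = query.trim().toLowerCase().split("\\s+");
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            String text = doc.getValue().toLowerCase();
            boolean matchesAll = true;
            for (String term : terms) {
                if (!text.contains(term)) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(doc.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new LinkedHashMap<>();
        documents.put("a.html", "Java Swing tutorial");
        documents.put("b.xml", "XML viewer written in Java");
        documents.put("c.pdf", "Annual report");
        System.out.println(search(documents, "java viewer"));
    }
}
```

Query expansion, as mentioned above, would add related terms to the terms array before matching; it is not attempted here.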








3.1.9     File comparison
File comparison in computing compares the contents of computer files, finding their com-
mon contents and their differences. The result of the comparison may be presented in
a graphical user interface or as part of larger tasks in networks, file systems, or revision
control.

Some widely used file comparison programs are diff, cmp, FileMerge, Araxis Merge,
WinMerge, Beyond Compare, and Microsoft File Compare.

Many text editors and word processors perform file comparison to highlight the changes
to a document.
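
A line-based comparison of two files can be sketched with the classic longest-common-subsequence technique. The class LineDiff below is a hypothetical illustration, not the algorithm used by any of the programs named above; it works on lists of lines already read into memory (for example with java.nio.file.Files.readAllLines).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineDiff {
    // Produces a simple edit script: "  " marks a common line,
    // "- " a line only in the first list, "+ " a line only in the second.
    static List<String> diff(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();
        // lcs[i][j] = length of the longest common subsequence of
        // a[i..] and b[j..], filled from the end towards the start.
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.get(i).equals(b.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (a.get(i).equals(b.get(j))) {
                out.add("  " + a.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + a.get(i++));
            } else {
                out.add("+ " + b.get(j++));
            }
        }
        while (i < n) out.add("- " + a.get(i++));
        while (j < m) out.add("+ " + b.get(j++));
        return out;
    }

    public static void main(String[] args) {
        List<String> oldLines = Arrays.asList("one", "two", "three");
        List<String> newLines = Arrays.asList("one", "2", "three");
        for (String line : diff(oldLines, newLines)) {
            System.out.println(line);
        }
    }
}
```

The table needs memory proportional to the product of the two line counts, so this form is suitable only for small files.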




3.2     Technologies Used
3.2.1    Java programming language
Java is a programming language originally developed by James Gosling at Sun Microsys-
tems (which has since merged into Oracle Corporation) and released in 1995 as a core
component of Sun Microsystems’ Java platform. Java applications are typically compiled
to bytecode (class file) that can run on any Java Virtual Machine (JVM) regardless of
computer architecture. Java is a general-purpose, concurrent, class-based, object-oriented
language that is specifically designed to have as few implementation dependencies as pos-
sible. It is intended to let application developers “write once, run anywhere” (WORA),
meaning that code that runs on one platform does not need to be recompiled to run
on another. Java is currently one of the most popular programming languages in use,
particularly for client-server web applications.

There were five primary goals in the creation of the Java language:

   • It should be simple, object-oriented and familiar.

   • It should be robust and secure.

   • It should be architecture-neutral and portable.

   • It should execute with high performance.

   • It should be interpreted, threaded, and dynamic.

    Java is an object-oriented programming language with a built-in application program-
ming interface (API) that can handle graphics and user interfaces and that can be used
to create applications or applets. Because of its rich set of APIs, similar to Macintosh
and Windows, and its platform independence, Java can also be thought of as a platform
in itself. Java also has standard libraries for doing mathematics.

Much of the syntax of Java is the same as C and C++. One major difference is that Java
does not have pointers. However, the biggest difference is that you must write object-
oriented code in Java. Procedural pieces of code can only be embedded in objects. In the
following we assume that the reader has some familiarity with a programming language.
In particular, some familiarity with the syntax of C/C++ is useful.
Major release versions of Java, along with their release dates:
JDK 1.0 (January 23, 1996)
JDK 1.1 (February 19, 1997)
J2SE 1.2 (December 8, 1998)
J2SE 1.3 (May 8, 2000)
J2SE 1.4 (February 6, 2002)
J2SE 5.0 (September 30, 2004)
Java SE 6 (December 11, 2006)
Java SE 7 (July 28, 2011)

Sun has defined and supports four editions of Java targeting different application envi-
ronments and segmented many of its APIs so that they belong to one of the platforms.
The platforms are:
Java Card for smartcards.
Java Platform, Micro Edition (Java ME) targeting environments with limited resources.
Java Platform, Standard Edition (Java SE) targeting workstation environments.
Java Platform, Enterprise Edition (Java EE) targeting large distributed enterprise or In-
ternet environments.

The original and reference implementation Java compilers, virtual machines, and class
libraries were developed by Sun from 1995. As of May 2007, in compliance with the spec-
ifications of the Java Community Process, Sun relicensed most of its Java technologies
under the GNU General Public License.
In Java we distinguish between applications, which are programs that perform the same
functions as those written in other programming languages, and applets, which are pro-
grams that can be embedded in a Web page and accessed over the Internet. Our initial
focus will be on writing applications. When a program is compiled, a byte code is pro-
duced that can be read and executed by any platform that can run Java.




3.2.2    Swing
Swing is far more than a simple component toolkit. It includes rich undo support,
a highly customizable text package, integrated internationalization and accessibility sup-
port. To truly leverage the cross-platform capabilities of the Java platform, Swing sup-
ports numerous look and feels, including the ability to create your own look and feel.
The ability to create a custom look and feel is made easier with Synth, a look and feel
specifically designed to be customized. Swing wouldn’t be a component toolkit without
the basic user interface primitives such as drag and drop, event handling, customizable
painting, and window management.


The Swing toolkit includes a rich set of components for building GUIs and adding inter-
activity to Java applications. Swing includes all the components you would expect from
a modern toolkit: table controls, list controls, tree controls, buttons, and labels.
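
As a small illustration of these components, the sketch below puts a label and a button on a panel and shows the panel in a frame. The class name HelloSwing and the texts used are our own choices for this example and are not taken from the Digital Reader source.

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

class HelloSwing {
    // Builds a panel holding a label and a button; clicking the button
    // changes the label text.
    static JPanel buildPanel() {
        final JLabel label = new JLabel("Not clicked yet");
        JButton button = new JButton("Click me");
        button.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent e) {
                label.setText("Clicked");
            }
        });
        JPanel panel = new JPanel();
        panel.add(label);
        panel.add(button);
        return panel;
    }

    public static void main(String[] args) {
        // Swing components should be created and shown on the event dispatch thread.
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                JFrame frame = new JFrame("Hello Swing");
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.add(buildPanel());
                frame.pack();
                frame.setVisible(true);
            }
        });
    }
}
```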

Swing is part of the Java Foundation Classes (JFC). The JFC also include other fea-
tures important to a GUI program, such as the ability to add rich graphics functionality
and the ability to create a program that can work in different languages and by users
with different input devices.




The following list shows some of the features that Swing and the Java Foundation Classes
provide.



Swing GUI Components

The Swing toolkit includes a rich array of components: from basic components, such as
buttons and check boxes, to rich and complex components, such as tables and text. Even
deceptively simple components, such as text fields, offer sophisticated functionality, such
as formatted text input or password field behavior. There are file browsers and dialogs
to suit most needs, and if not, customization is possible. If none of Swing’s provided
components are exactly what you need, you can leverage the basic Swing component
functionality to create your own.



Java 2D API

To make your application stand out; convey information visually; or add figures, images,
or animation to your GUI, you’ll want to use the Java 2D API. Because Swing is built on
the 2D package, it’s trivial to make use of 2D within Swing components. Adding images,
drop shadows, and compositing is easy with Java 2D.



Pluggable Look-and-Feel Support

Any program that uses Swing components has a choice of look and feel. The classes
shipped by Oracle provide a look and feel that matches that of the platform. The Synth
package allows you to create your own look and feel. The GTK+ look and feel makes
hundreds of existing look and feels available to Swing programs.

A program can specify the look and feel of the platform it is running on, or it can
specify to always use the Java look and feel, and without recompiling, it will just work.
Or, you can ignore the issue and let the UI manager sort it out.
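
Choosing a look and feel at start-up can be sketched with the standard UIManager class. The helper class LookAndFeelChooser and its install() method are hypothetical names used only for this illustration; the fallback order shown is one possible policy, not the one necessarily used in Digital Reader.

```java
import javax.swing.UIManager;

class LookAndFeelChooser {
    // Tries to install the look and feel with the given class name.
    // Returns false if the class is missing or unsupported on this platform.
    static boolean install(String className) {
        try {
            UIManager.setLookAndFeel(className);
            return true;
        } catch (Exception e) {
            // ClassNotFoundException, UnsupportedLookAndFeelException, etc.
            return false;
        }
    }

    public static void main(String[] args) {
        // Prefer the look and feel of the platform the program runs on,
        // and fall back to the cross-platform Java look and feel.
        if (!install(UIManager.getSystemLookAndFeelClassName())) {
            install(UIManager.getCrossPlatformLookAndFeelClassName());
        }
        System.out.println("Using: " + UIManager.getLookAndFeel().getName());
    }
}
```

The call should be made before any components are created, so that they all pick up the chosen look and feel.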



Data Transfer

Data transfer, via cut, copy, paste, and drag and drop, is essential to almost any applica-
tion. Support for data transfer is built into Swing and works between Swing components
within an application, between Java applications, and between Java and native applica-
tions.



Internationalization

This feature allows developers to build applications that can interact with users world-
wide in their own languages and cultural conventions. Applications can be created that
accept input in languages that use thousands of different characters, such as Japanese,
Chinese, or Korean.

Swing’s layout managers make it easy to honor a particular orientation required by the
UI. For example, the UI will appear right to left in a locale where the text flows right to
left. This support is automatic: You need only code the UI once and then it will work
for left to right and right to left, as well as honor the appropriate size of components that
change as you localize the text.



Accessibility API

People with disabilities use special software, called assistive technologies, that mediates the user
experience for them. Such software needs to obtain a wealth of information about the
running application in order to represent it in alternate media: for a screen reader to read
the screen with synthetic speech or render it via a Braille display, for a screen magnifier
to track the caret and keyboard focus, for on-screen keyboards to present dynamic key-
boards of the menu choices and toolbar items and dialog controls, and for voice control
systems to know what the user can control with his or her voice. The accessibility API
enables these assistive technologies to get the information they need, and to program-
matically manipulate the elements that make up the graphical user interface.



Undo Framework API

Swing’s undo framework allows developers to provide support for undo and redo. Undo
support is built in to Swing’s text component. For other components, Swing supports
an unlimited number of actions to undo and redo, and is easily adapted to an applica-
tion. For example, you could easily enable undo to add and remove elements from a table.
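
The undo support for text can be sketched with the standard UndoManager and a plain text document. The class UndoSketch and its helper methods are hypothetical and serve only as an illustration; a real editor would attach the same listener to the document of its text component.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;
import javax.swing.undo.UndoManager;

class UndoSketch {
    final PlainDocument document = new PlainDocument();
    final UndoManager undoManager = new UndoManager();

    UndoSketch() {
        // UndoManager implements UndoableEditListener, so it can record
        // every edit made to the document directly.
        document.addUndoableEditListener(undoManager);
    }

    // Appends text at the end of the document.
    void type(String text) {
        try {
            document.insertString(document.getLength(), text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the whole document content.
    String text() {
        try {
            return document.getText(0, document.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        UndoSketch sketch = new UndoSketch();
        sketch.type("Digital");
        sketch.type(" Reader");
        sketch.undoManager.undo();          // removes " Reader"
        System.out.println(sketch.text());  // prints "Digital"
        sketch.undoManager.redo();          // restores " Reader"
        System.out.println(sketch.text());  // prints "Digital Reader"
    }
}
```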



Flexible Deployment Support

If you want your program to run within a browser window, you can create it as an applet
and run it using Java Plug-in, which supports a variety of browsers, such as Internet
Explorer, Firefox, and Safari. If you want to create a program that can be launched from
a browser, you can do this with Java Web Start. Of course, your application can also run
outside of a browser as a standard desktop application.




3.3      JPedal Library
JPedal is a Java library for viewing and manipulating PDF files. It aims to provide Java
developers with complete PDF reader functionality as well as PDF content extraction.
The full version of JPedal is distributed under a commercial license with a cut-down
version available under an LGPL license.


The full version of JPedal provides a PDF viewer with multiple page viewing modes,
PDF to image conversion, printing, searching and text/image extraction. It can be used
as part of a client or server Swing or SWT application, thin client, applet, JavaFX, JSP
or Web Start. The LGPL version lacks some of the functionality of the commercial version,
such as printing, extraction and different page display options.




3.4     Eclipse
Eclipse is a multi-language software development environment comprising an integrated
development environment (IDE) and an extensible plug-in system. It is written mostly
in Java. It can be used to develop applications in Java and, by means of various plug-
ins, other programming languages including Ada, C, C++, COBOL, Haskell, Perl, PHP,
Python, R, Ruby (including Ruby on Rails framework), Scala, Clojure, Groovy and
Scheme. It can also be used to develop packages for the software Mathematica. Develop-
ment environments include the Eclipse Java development tools (JDT) for Java, Eclipse
CDT for C/C++, and Eclipse PDT for PHP, among others.



3.4.1    Running Eclipse
After installing the Eclipse SDK in a directory, you can start the Workbench by running
the Eclipse executable included with the release (you also need a 1.4.2 JRE). On Win-
dows, the executable file is called eclipse.exe, and is located in the eclipse sub-directory
of the install.
To start Eclipse, double-click on the file eclipse.exe (Microsoft Windows) or eclipse (Linux
/ Mac) in the directory where you unpacked Eclipse. The system will prompt you for
a workspace. The workspace is the place where you store your Java projects (more on
workspaces later). Select an empty directory and press OK. Figure 3.1 shows the workspace
launcher.




3.4.2    Creating New Project
From the menu, select File, then New, then Project. Enter Digital Reader as the project
name. Select the option Create separate folders for sources and class files. Figure 3.2 shows
the creation of the new project.



                           Figure 3.1: Workspace launcher




                          Figure 3.2: Creating New Project




Press finish to create the project. A new project is created and displayed as a folder.
Open the Digital Reader folder and explore the content of this folder.





3.4.3    Create package
Create a new package. A good convention is to use the same name for the top-level package
and the project; in this project, however, we create the package controller.
Select the folder src, right-click on it and select New, then Package. Figure 3.3 shows the
creation of the package.



                              Figure 3.3: Creating Package




3.4.4    Create Java class
We will now create a Java class. Right-click on your package and select New, then Class.
Enter a class name and click Finish. If the class is placed in the default package, Eclipse
warns that “The use of the default package is discouraged.” Packages in Java provide a
convenient means of organizing your classes. Nevertheless, for just getting this example
working, we can skip thinking about packages. Figure 3.4 shows the creation of a Java class.





                            Figure 3.4: Create Java class




3.4.5    Run your project in Eclipse
Now run your code. Right-click on your Java class and select Run As, then Java Application.
Figure 3.5 shows this. The output appears in the console.


                            Figure 3.5: Run Java Project





3.4.6     Prepare to run program outside Eclipse (create JAR file)
To run your Java program outside of Eclipse you need to export it as a jar file. A jar file
is the standard distribution format for Java applications.
Select your project, right-click on it and select Export. Figure 3.6 shows the creation of
the JAR file.




                            Figure 3.6: Creating the JAR file




Select JAR file and then Next. Select your project, and enter the export destination
and a name for the jar file. We named it Digital Reader.jar.



3.4.7     Adding a library (.jar ) to your project
The following describes how to add Java libraries to your project. Java libraries are
distributed via “jar” files. It assumes that you have a jar file available; if not, feel free to
skip this step.
Create a new Java project, Digital Reader. Then create a new folder called lib by
right-clicking on your project and selecting New, then Folder.

From the menu, select File, then Import, then General, then File System. Select your
jar and select the folder lib as target. Alternatively, just copy and paste your jar file into
the “lib” folder.
Right-click on your project and select Properties. Under Java Build Path, then Libraries,
select the button “Add JARs”. The following example shows how the result would look
if junit-4.4.jar had been added to the project. Figure 3.7 shows the adding of a
library to the project.



                       Figure 3.7: Adding a library(.jar) to project




Afterwards you can use the classes contained in the jar file in your Java source code. Press
Finish. This creates a jar file in your selected output directory.




3.5      Java Runtime Environment (JRE)
The Java Runtime Environment (JRE) provides the libraries, the Java Virtual Machine,
and other components to run applets and applications written in the Java programming
language. In addition, two key deployment technologies are part of the JRE: Java Plug-
in, which enables applets to run in popular browsers; and Java Web Start, which deploys
standalone applications over a network. The JRE includes the JVM: it provides the
standard libraries together with the JVM that is used to execute a Java program. A Java
virtual machine (JVM) is a virtual machine that can execute Java bytecode.

It is the code execution component of the Java software platform. A Java virtual ma-
chine is software that is implemented on virtual and non-virtual hardware and on stan-
dard operating systems. A JVM provides an environment in which Java bytecode can
be executed, enabling such features as automated exception handling, which provides
root-cause debugging information for every software error (exception), independent of
the source code.
A JVM is distributed along with a set of standard class libraries that implement the Java
application programming interface (API). Appropriate APIs bundled together with the JVM
form the Java Runtime Environment (JRE). JVMs are available for many hardware and
software platforms. Thus, the JVM is a crucial component of the Java platform.
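
A running program can ask which JVM and platform it is executing on through standard system properties. The class RuntimeInfo below is a hypothetical helper written for this report's illustration; the property keys themselves (java.vm.name, java.version, os.name, os.arch) are standard.

```java
class RuntimeInfo {
    // Describes the JVM and operating system that are executing this bytecode.
    static String describe() {
        return System.getProperty("java.vm.name")
                + ", Java " + System.getProperty("java.version")
                + ", on " + System.getProperty("os.name")
                + " (" + System.getProperty("os.arch") + ")";
    }

    public static void main(String[] args) {
        // The same class file prints different values on different
        // platforms, without being recompiled.
        System.out.println(describe());
    }
}
```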

The Java Development Kit (JDK) is an Oracle Corporation product aimed at Java
developers. Since the introduction of Java, it has been by far the most widely used Java
Software Development Kit (SDK).
The JDK has as its primary components a collection of programming tools, including:

   • java: The loader for Java applications. This tool is an interpreter and can interpret
     the class files generated by the javac compiler.

   • javac: The compiler, which converts source code into Java bytecode.

   • appletviewer: This tool can be used to run and debug Java applets without a web
     browser.

    The JDK also comes with a complete Java Runtime Environment, usually called a
private runtime, due to the fact that it is separated from the regular JRE and has extra
contents. It consists of a Java Virtual Machine and all of the class libraries present in the
production environment, as well as additional libraries only useful to developers, such as
the internationalization libraries and the IDL libraries.
The Java Development Kit (JDK) also includes the JVM, standard class libraries, and sev-
eral other tools that a developer needs in order to create a Java program. The JRE contains
the runtime environment, such as the JVM and other Java classes (AWT, Swing), but does
not contain any development tools such as a compiler or a debugger. Both the JDK and the
JRE contain the JVM.




3.6      Hardware Requirements
Processor: Pentium III onwards
RAM: 128 MB
Hard disk space: 5 GB
Best viewed at 1024x768 screen resolution


3.7     Software Requirements
Operating System: Windows/Linux/Mac
Runtime Environment: JRE or JDK




Chapter 4

Design and Implementation

4.1     Model-View-Controller (MVC)
Model-view-controller (MVC) design is commonly used with Swing, whose own components
follow a variant of this pattern, and since we’ve implemented this project in Swing, we
chose MVC for the design of our user interface. MVC was first introduced by Trygve Reenskaug, a Smalltalk
developer at the Xerox Palo Alto Research Center in 1979, and helps to decouple data
access and business logic from the manner in which it is displayed to the user. More
precisely, MVC can be broken down into three elements:

   • Model: The model represents data and the rules that govern access to and updates
     of this data. In enterprise software, a model often serves as a software approximation
     of a real-world process.

   • View: The view renders the contents of a model. It specifies exactly how the
     model data should be presented. If the model data changes, the view must update
     its presentation as needed. This can be achieved by using a push model, in which
     the view registers itself with the model for change notifications, or a pull model in
     which the view is responsible for calling the model when it needs to retrieve the
     most current data.

   • Controller: The controller translates the user’s interactions with the view into
     actions that the model will perform. In a stand-alone GUI client, user interactions
     could be button clicks or menu selections, whereas in an enterprise web application,
     they appear as GET and POST HTTP requests. Depending on the context, a
     controller may also select a new view (for example, a web page of results) to present
     back to the user.

Figure 4.1 shows the MVC architecture.
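
The three roles, using the push model in which the view registers itself with the model, can be sketched as follows. All names here (MvcSketch, PageModel, PageView, PageController) are hypothetical and chosen for illustration; they are not the classes of Digital Reader.

```java
import java.util.ArrayList;
import java.util.List;

class MvcSketch {
    // Callback through which the model pushes change notifications.
    interface ModelListener {
        void modelChanged(PageModel model);
    }

    // Model: holds the data (current page) and the rules for updating it.
    static class PageModel {
        private final int pageCount;
        private int page = 1;
        private final List<ModelListener> listeners = new ArrayList<>();

        PageModel(int pageCount) {
            this.pageCount = pageCount;
        }

        int getPage() {
            return page;
        }

        void addListener(ModelListener listener) {
            listeners.add(listener);
        }

        void setPage(int newPage) {
            if (newPage < 1 || newPage > pageCount || newPage == page) {
                return; // rule: ignore out-of-range or unchanged pages
            }
            page = newPage;
            for (ModelListener listener : listeners) {
                listener.modelChanged(this);
            }
        }
    }

    // View: renders the model and refreshes itself when notified.
    static class PageView implements ModelListener {
        String rendered;

        PageView(PageModel model) {
            model.addListener(this);
            modelChanged(model);
        }

        public void modelChanged(PageModel model) {
            rendered = "Page " + model.getPage();
        }
    }

    // Controller: translates user interactions into actions on the model.
    static class PageController {
        private final PageModel model;

        PageController(PageModel model) {
            this.model = model;
        }

        void nextClicked() {
            model.setPage(model.getPage() + 1);
        }

        void previousClicked() {
            model.setPage(model.getPage() - 1);
        }
    }

    public static void main(String[] args) {
        PageModel model = new PageModel(3);
        PageView view = new PageView(model);
        PageController controller = new PageController(model);
        controller.nextClicked();
        System.out.println(view.rendered); // prints "Page 2"
    }
}
```

Note that the controller never touches the view: the view learns about the new page only through the notification sent by the model.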





                            Figure 4.1: MVC Architecture




4.2     Developing an HTML Viewer
The HTML Viewer can be developed using JTextPane and HTMLEditorKit to display and
edit HTML documents. The following features are included:


   • Creating a new HTML document

   • Opening an existing HTML document

   • Saving changes

   • Saving the document under a new name/location

   • Prompting the user to save changes before loading a new document or exiting the
     application


    Class HtmlViewer
    This class extends JFrame to provide the supporting frame for this example. Several
instance variables are declared:

   • JTextPane m_editor: main text component.

   • StyleContext m_context: a group of styles and their associated resources for the
     documents in this example.

   • HTMLDocument m_doc: current document model.

   • HTMLEditorKit m_kit: editor kit that knows how to read/write HTML documents.

   • SimpleFilter m_HTMLFilter: file filter for .HTML files.

   • JToolBar m_toolBar: toolbar containing New, Open, Save buttons.

   • JFileChooser m_chooser: file chooser used to load and save HTML files.

   • File m_currentFile: currently opened HTML file (if any).

   • boolean m_textChanged: keeps track of whether any changes have been made since
     the document was last saved.

The HtmlViewer constructor first instantiates our JTextPane and HTMLEditorKit, and
assigns the editor kit to the text pane (it is important that this is done before any docu-
ments are created). The editor component is then placed in a JScrollPane which is placed
in the center of the frame. The JFileChooser component is created and an instance of
our SimpleFilter class is used as a filter to only allow the choice of HTML documents.
A WindowListener is added to call our custom promptToSave() method to ensure that
the user has the opportunity to save any changes before closing the application. This
WindowListener also ensures that our editor component automatically receives the focus
when this application regains the focus.

The createMenuBar() method creates a menu bar with a single menu titled File and
a tool bar with three buttons. Actions for New, Open, Save, Save As and Exit are cre-
ated and added to the File menu. The New, Open and Save actions are also added to
the tool bar. To read and write documents, we use InputStreams and OutputStreams
rather than Readers and Writers. The reason is that HTML files are stored as encoded
bytes (often in a 1-byte encoding), whereas Readers and Writers operate on Java’s 2-byte
characters.

    The getDocumentName() method simply returns the name of the file corresponding
to the current document, or “untitled” if it hasn’t been saved to disk. The newDocument()
method is responsible for creating a new HTMLDocument instance using
HTMLEditorKit’s createDefaultDocument() method. Once created, our StyleContext variable,
m_context, is assigned this new document’s stylesheet, obtained with HTMLDocument’s
getStyleSheet() method. The title of the frame is then updated and a Runnable instance
is created and sent to the SwingUtilities.invokeLater() method to scroll the document to the
beginning when it is finished loading. Finally, an instance of our custom UpdateListener
class (described below) is added as a DocumentListener, and the m_textChanged variable
is set to false to indicate that no changes to the document have been made yet.

The openDocument() method is similar to the newDocument() method but uses the JFileChooser
to allow selection of an existing HTML file to load, and uses an InputStream object to
read the contents of that file. The saveFile() method takes a boolean parameter specify-
ing whether the method should act as a Save As process or just a regular Save. If true,
indicating a Save As process, the JFileChooser is displayed to allow the user to specify
the file and location to save the document to. An OutputStream is used to write the
contents of the document to the destination File.
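
Reading and writing an HTML document through byte streams can be sketched with HTMLEditorKit alone. The class HtmlStreamSketch and its methods are hypothetical and simplified: they omit the file chooser, the frame title and the error dialog described above, and they use in-memory streams where the viewer would use file streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

class HtmlStreamSketch {
    private static final HTMLEditorKit KIT = new HTMLEditorKit();

    // Creates an empty HTML document and fills it from the stream.
    static HTMLDocument load(InputStream in) {
        HTMLDocument doc = (HTMLDocument) KIT.createDefaultDocument();
        try {
            KIT.read(in, doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not read HTML", e);
        }
        return doc;
    }

    // Writes the whole document to the stream as HTML markup.
    static void save(HTMLDocument doc, OutputStream out) {
        try {
            KIT.write(out, doc, 0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not write HTML", e);
        }
    }

    // Returns the text content of the document without markup.
    static String plainText(HTMLDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] html = "<html><body><p>Hello</p></body></html>".getBytes();
        HTMLDocument doc = load(new ByteArrayInputStream(html));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        save(doc, out);
        System.out.println(out.toString());
    }
}
```

In the viewer itself the streams would be a FileInputStream and a FileOutputStream on the file chosen by the user, and the loaded document would be installed on the JTextPane with setDocument().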

The promptToSave() method checks the m_textChanged flag and, if true, displays a
JOptionPane asking whether or not the current document should be saved. This method
is called before a new document is created, a document is opened or the application is
closed to ensure that the user has a chance to save any changes to the current document
before losing them.
The showError() method is used to display error messages in a JOptionPane. It is often
useful to display exceptions to users so that they know an error happened and so that
they may eventually report errors back to you if they are in fact bugs.

Class UpdateListener
    This DocumentListener implementation is used to modify the state of our m_textChanged
variable. Whenever an insertion, removal or document change is made this variable is set
to true. This allows HtmlViewer’s promptToSave() method to ensure the user has the
option of saving any changes before loading a new document or exiting the application.
Figure 4.2 shows a snapshot of the HTML Viewer.




4.3     Developing an XML Viewer
Extensible Markup Language (XML) is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable and machine-readable. It
is defined in the XML 1.0 Specification produced by the W3C, and several other related
specifications, all gratis open standards. The design goals of XML emphasize simplicity,
generality, and usability over the Internet. It is a textual data format with strong sup-
port via Unicode for the languages of the world. Although the design of XML focuses
on documents, it is widely used for the representation of arbitrary data structures, for
example in web services. Many application programming interfaces (APIs) have been de-
veloped for software developers to use to process XML data, and several schema systems
exist to aid in the definition of XML-based languages. As of 2009, hundreds of XML-based
languages have been developed, including RSS, Atom, SOAP, and XHTML. XML-based
formats have become the default for many office-productivity tools, including Microsoft
Office (Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple’s
iWork. XML has also been employed as the base language for communication protocols,
such as XMPP.

                               Figure 4.2: HTML Viewer

This section shows how to display an XML document using a JTree and Java’s
built-in XML support. Because XML documents are hierarchical in nature, JTree is a
natural fit for the task.

Class XmlViewer
Two packages that are required to implement this viewer:

   • javax.xml.parsers: This package consists of classes used to process XML documents.

   • org.w3c.dom: This package consists of a set of interfaces that define the DOM
     (Document Object Model) which is an API that allows dynamic access to the
     structure and data of XML documents. Examples include Document, Element,
     Node, Attr, etc.



XmlViewer extends JFrame and represents the main application frame for this example.
Five instance variables are defined:

   • Document m_doc: the current XML document (note that this is an instance of
     org.w3c.dom.Document; not a text document [javax.swing.text.Document]).

   • JTree m_tree: the tree component used to display the current XML file.

   • DefaultTreeModel m_model: tree model constructed to mimic the XML file.

   • JFileChooser m_chooser: File chooser used for opening and saving XML files.

   • File m_currentFile: reference to the current XML file.

    The XmlViewer constructor creates and installs a toolbar with our createToolbar()
method, instantiates tree model m_model with a top node containing “No XML loaded” as
user data, and instantiates tree m_tree with m_model.
Selection is set to SINGLE_TREE_SELECTION and the tree is set to noneditable. A
custom tree cell renderer is created to display an appropriate icon based on whether a
node represents an XML document element or not. This renderer is assigned to our tree
and the tree is then placed in a scroll pane which is added to the center of the frame.
File chooser m_chooser is instantiated and an XML file filter is applied to it so only XML
files will be displayed.

The createToolbar() method creates a JToolBar with an Open button that invokes our
openDocument() method.

The getDocumentName() method retrieves the name of the current file referenced with
our m_currentFile variable.

The openDocument() method shows our JFileChooser in a separate thread to allow
selection of an XML file to open for viewing.

javax.xml.parsers.DocumentBuilderFactory's static newInstance() is used to create a new
instance of DocumentBuilderFactory, which is used to create an instance of
DocumentBuilder. DocumentBuilder is used to parse the selected XML file with its parse()
method, storing the resulting Document instance in our m_doc variable. The root element of
this document is retrieved with Document's getDocumentElement() method. This element is
used as the root node for our tree model, and our custom createTreeNode() creates the
tree node hierarchy corresponding to the node passed to it. The resulting tree node is
then set as the root node of our tree and our custom expandTree() method is used to
expand all nodes to display the entire XML document.
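
The parsing step described above can be sketched as follows. This is a minimal illustration, not the application's own code: the class name XmlParseSketch and the method parseRoot() are hypothetical, and the XML is read from a string rather than from the file chosen in the JFileChooser so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, used only for illustration.
class XmlParseSketch {

    /** Parses XML text and returns the root element of the resulting DOM document. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // The viewer would pass the selected File to parse(); a stream is used here instead.
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Parser configuration, SAX and I/O failures are all reported the same way here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<book><title>Digital Reader</title></book>");
        System.out.println("Root element: " + root.getTagName());
    }
}
```

In the viewer itself, the returned element would then be handed to createTreeNode() to build the tree model.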

The createTreeNode() method takes a root node (instance of org.w3c.dom.Node) as a
parameter [note that org.w3c.dom.Element is a subinterface of Node]. Our
canDisplayNode() method is used to find out whether the root node is either an element
or a text node. If so, an instance of our custom XmlViewerNode is created to represent
that node. Then a NodeList is created representing all child nodes of the root node. For
each child node a tree node is created by passing it recursively to createTreeNode().

The canDisplayNode() method checks whether a given Node is of type ELEMENT_NODE
or TEXT_NODE. If it isn't one of these types it should not be displayed in our tree.
The showError() method is used to display exceptions in a JOptionPane dialog. The
static expandTree() methods are responsible for expanding each parent node so that the
entire tree is expanded and visible in the viewer.

Class XmlViewerNode
This class extends DefaultMutableTreeNode to represent the XML nodes in our viewer.
The main customization is in the overridden toString() implementation which returns a
textual representation of the node depending on whether it is of type ELEMENT_NODE
or TEXT_NODE. Figure 4.3 shows a snapshot of the XML Viewer.
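
The recursive tree construction and node labelling described above might look roughly like the sketch below. It uses a plain DefaultMutableTreeNode in place of the custom XmlViewerNode, and the label format (tag name in angle brackets for elements, trimmed text for text nodes) is an assumption made for this example. The class name DomTreeSketch and the parse() helper are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical class, used only for illustration.
class DomTreeSketch {

    /** Only element and text nodes are shown in the tree. */
    static boolean canDisplayNode(Node node) {
        short type = node.getNodeType();
        return type == Node.ELEMENT_NODE || type == Node.TEXT_NODE;
    }

    /** Assumed label format: <tag> for elements, trimmed content for text nodes. */
    static String label(Node node) {
        return node.getNodeType() == Node.ELEMENT_NODE
                ? "<" + node.getNodeName() + ">"
                : node.getNodeValue().trim();
    }

    /** Recursively mirrors a DOM subtree as tree nodes; returns null for skipped nodes. */
    static DefaultMutableTreeNode createTreeNode(Node root) {
        if (!canDisplayNode(root)) {
            return null;
        }
        DefaultMutableTreeNode treeNode = new DefaultMutableTreeNode(label(root));
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            DefaultMutableTreeNode child = createTreeNode(children.item(i));
            if (child != null) {
                treeNode.add(child);
            }
        }
        return treeNode;
    }

    /** Demo helper: parses XML text and returns its root DOM node. */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root =
                createTreeNode(parse("<a><b>text</b><!-- skipped --><c/></a>"));
        System.out.println(root + " has " + root.getChildCount() + " children");
    }
}
```

Comment nodes, processing instructions and similar node types are dropped because canDisplayNode() rejects them before a tree node is created.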




4.4     Developing a Text Viewer
Class BasicTextEditor
This class extends JFrame and provides the parent frame for our example. Two class
variables are declared:

   • String APP_NAME: name of this example used in title bar.

   • String FONTS[]: an array of font family names.

Instance variables:

   • Font[] m_fonts: an array of Font instances which can be used to render our JTextArea
     editor.

   • JTextArea m_editor: used as our text editor.

   • JMenuItem[] m_fontMenus: an array of menu items representing available fonts.

   • JCheckBoxMenuItem m_bold: menu item which sets/unsets the bold property of
     the current font.

                               Figure 4.3: Xml Viewer




   • JCheckBoxMenuItem m_italic: menu item which sets/unsets the italic property of
     the current font.

   • JFileChooser m_chooser: used to load and save simple text files.

   • File m_currentFile: the current File instance corresponding to the current
     document.

   • boolean m_textChanged: will be set to true if the current document has been
     changed; will be set to false if the document was just opened or saved. This flag is
     used in combination with a DocumentListener to determine whether or not to save
     the current document before dismissing it.
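
The interplay between the changed flag and a DocumentListener can be sketched as follows. This is only an illustration under assumptions: the class name TextChangedSketch and its methods are hypothetical, and a bare PlainDocument stands in for the document of the JTextArea.

```java
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

// Hypothetical class, used only for illustration.
class TextChangedSketch implements DocumentListener {
    private boolean textChanged = false;

    // Any kind of edit marks the document as changed.
    @Override public void insertUpdate(DocumentEvent e) { textChanged = true; }
    @Override public void removeUpdate(DocumentEvent e) { textChanged = true; }
    @Override public void changedUpdate(DocumentEvent e) { textChanged = true; }

    boolean isTextChanged() { return textChanged; }

    /** Called after the document has just been opened or saved. */
    void markClean() { textChanged = false; }

    /** Demo helper: appends text to a document without exposing the checked exception. */
    static void append(Document doc, String text) {
        try {
            doc.insertString(doc.getLength(), text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = new PlainDocument();
        TextChangedSketch tracker = new TextChangedSketch();
        doc.addDocumentListener(tracker);
        append(doc, "edited");
        System.out.println("Unsaved changes: " + tracker.isTextChanged());
    }
}
```

A save prompt would consult isTextChanged() before the current document is discarded.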

The BasicTextEditor constructor populates our m_fonts array with Font instances
corresponding to the names provided in FONTS[]. The m_editor JTextArea is then created
and placed in a JScrollPane. This scroll pane is added to the center of our frame's
content pane and we append some simple text to m_editor for display at startup. Our
createMenuBar() method is called to create the menu bar to manage this application,
and this menu bar is then added to our frame using the setJMenuBar() method. The
createMenuBar() method creates and returns a JMenuBar. Each menu item receives an
ActionListener to handle its selection. Two menus are added, titled File and Font. The
File menu is assigned a mnemonic character, f, and by pressing ALT+F while the
application frame is active, its popup will be displayed allowing navigation with either
the mouse or keyboard. The Font menu is assigned the mnemonic character o.

The New menu item in the File menu is responsible for creating a new (empty) document.
It doesn't really replace JTextArea's Document. Instead it simply clears the contents of
our editor component. Before it does so, however, it calls our custom promptToSave()
method to determine whether we want to continue without saving the current changes (if
any) or not. Note that an icon is used for this menu item. Also note that this menu item
can be selected with the keyboard by pressing n when the File menu's popup is visible,
because we assigned it n as a mnemonic. We also assigned it the accelerator CTRL+N.
Therefore, this menu item's action will be directly invoked whenever that key combination
is pressed. (All other menus and menu items in this example also receive appropriate
mnemonics and accelerators.)

The Open menu item brings up our m_chooser JFileChooser component to allow
selection of a text file to open. Once a text file is selected, we open a FileReader on it
and invoke read() on our JTextArea component to read the file's content (which creates a
new PlainDocument containing the selected file's content to replace the current JTextArea
document). The Save menu item brings up m_chooser to select a destination and file
name to save the current text to (if not previously set). Once a text file is selected, we
open a FileWriter on it and invoke write() on our JTextArea component to write its
content to the destination file. The Save As menu item is similar to the Save menu item,
but prompts the user to select a new file. The Exit menu item terminates program
execution. It is separated from the other menu items with a menu separator to create a
more logical display.
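
The load and save round trip can be sketched without a visible component by using DefaultEditorKit directly, which is roughly the work that read() and write() on a text component delegate to an editor kit. This substitution is an assumption made so the example runs on its own; the application itself calls read() and write() on the JTextArea with a FileReader and FileWriter. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.Document;

// Hypothetical class, used only for illustration.
class LoadSaveSketch {

    /** Reads text into a freshly created plain document, as Open does for a file. */
    static Document load(String content) {
        try {
            DefaultEditorKit kit = new DefaultEditorKit();
            Document doc = kit.createDefaultDocument(); // a PlainDocument
            kit.read(new StringReader(content), doc, 0);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read content", e);
        }
    }

    /** Writes the whole document out as text, as Save does to the destination file. */
    static String save(Document doc) {
        try {
            StringWriter out = new StringWriter();
            new DefaultEditorKit().write(out, doc, 0, doc.getLength());
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write content", e);
        }
    }

    public static void main(String[] args) {
        Document doc = load("hello reader");
        System.out.println("Loaded " + doc.getLength() + " characters");
        System.out.println("Saved text: " + save(doc));
    }
}
```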

    The Font menu consists of several menu items used to select the font and font style
used in our editor. All of these items receive the same ActionListener which invokes our
updateEditor() method. To give the user an idea of how each font looks, each font is
used to render the corresponding menu item text. Since only one font can be selected
at any given time, we use JRadioButtonMenuItems for these menu items, and add them
all to a ButtonGroup instance which manages a single selection. To create each menu
item we iterate through our FONTS array and create a JRadioButtonMenuItem corre-
sponding to each entry. Each item is set to unselected (except for the first one), assigned
a numerical mnemonic corresponding to the current FONTS array index, assigned the
appropriate Font instance for rendering its text, assigned our multi-purpose ActionLis-
tener, and added to our ButtonGroup along with the others.

The two other menu items in the Font menu manage the bold and italic font properties.
They are implemented as JCheckBoxMenuItems since these properties can be selected or
unselected independently. These items also are assigned the same ActionListener as the
radio button items to process changes in their selected state.

The updateEditor() method updates the current font used to render the editing
component by checking the state of each check box item and determining which radio button
item is currently selected. The m_bold and m_italic components are disabled and
unselected if the Courier font is selected, and enabled otherwise. The appropriate m_fonts
array element is selected and a Font instance is derived from it corresponding to the
current state of the check box items using Font's deriveFont() method.
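
Deriving a font from the state of the two check box items might be sketched as below. The class name FontStyleSketch and the method styled() are hypothetical; the two boolean parameters stand in for the selected state of m_bold and m_italic.

```java
import java.awt.Font;

// Hypothetical class, used only for illustration.
class FontStyleSketch {

    /** Returns a copy of the base font with the requested bold/italic combination. */
    static Font styled(Font base, boolean bold, boolean italic) {
        int style = Font.PLAIN;
        if (bold) {
            style |= Font.BOLD;
        }
        if (italic) {
            style |= Font.ITALIC;
        }
        // deriveFont(int) keeps the family and size and replaces only the style.
        return base.deriveFont(style);
    }

    public static void main(String[] args) {
        Font base = new Font("Serif", Font.PLAIN, 12);
        Font derived = styled(base, true, false);
        System.out.println("bold=" + derived.isBold() + ", italic=" + derived.isItalic());
    }
}
```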

Figure 4.4 shows a snapshot of the Text Viewer.




4.5     Developing an RTF Viewer
Using Styles to manage a set of attributes as a single named entity can greatly simplify
text editing. The user only has to apply a known style to a selected region of text rather
than selecting all appropriate text attributes from the provided toolbar components. By
adding a combo box allowing the choice of styles, we can not only save the user time and
effort, but we can also provide more uniform text formatting throughout the resulting
document (or potentially set of documents). In this section we'll add style management
to our word processor. We'll also show how it is possible to create a new style, modify an
existing style, or reapply a style to modified text.

Class WordProcessor
One new instance variable has been added: JComboBox m_cbStyles, a toolbar component
used to manage styles.
Note that a new custom method showStyles() (see below) is now called after creating a
new document or after loading an existing one.
    The createMenuBar() method creates a new menu with two new menu items for
updating and reapplying styles, and a new combo box for style selection. The editable
styles combobox, m_cbStyles, will hold a list of styles declared in the current document.
                           Figure 4.4: Text Viewer and editor




It receives an ActionListener which checks whether the currently selected style name is
present among the existing styles. If not, we add it to the drop-down list and retrieve
a new Style instance for the selected name using StyledDocument's addStyle() method.
This new Style instance is associated with the text attributes of the character element
at the current caret position. Otherwise, if the given style name is known already, we
retrieve the selected style using StyledDocument's getStyle() method and apply it to the
selected text by passing it to our custom setAttributeSet() method.
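
The look-up-or-create logic of this listener can be sketched as follows. The class StyleApplySketch and its methods are hypothetical, a DefaultStyledDocument stands in for the editor's document, and setCharacterAttributes() is used here in place of the custom setAttributeSet() method, whose body is not shown in this report.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.Style;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Hypothetical class, used only for illustration.
class StyleApplySketch {

    /** Returns the named style, creating it from the attributes at caretPos if it is new. */
    static Style findOrCreateStyle(StyledDocument doc, String name, int caretPos) {
        Style style = doc.getStyle(name);
        if (style == null) {
            style = doc.addStyle(name, null);
            // Seed the new style with the attributes of the character at the caret.
            style.addAttributes(doc.getCharacterElement(caretPos).getAttributes());
        }
        return style;
    }

    /** Applies the style's attributes to a range of text, keeping other attributes. */
    static void applyStyle(StyledDocument doc, Style style, int start, int length) {
        doc.setCharacterAttributes(start, length, style, false);
    }

    /** Demo helper: a styled document whose first boldLength characters are bold. */
    static StyledDocument demoDocument(String text, int boldLength) {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        SimpleAttributeSet bold = new SimpleAttributeSet();
        StyleConstants.setBold(bold, true);
        doc.setCharacterAttributes(0, boldLength, bold, false);
        return doc;
    }

    public static void main(String[] args) {
        StyledDocument doc = demoDocument("Hello world", 5);
        Style emphasis = findOrCreateStyle(doc, "Emphasis", 0);
        applyStyle(doc, emphasis, 6, 5);
        System.out.println("Second word bold: "
                + StyleConstants.isBold(doc.getCharacterElement(7).getAttributes()));
    }
}
```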



An ambiguous situation occurs when the user selects a style for text which already has the
same style, but whose attributes have been modified. The user may either want to update
the selected style using the selected text as a base, or reapply the existing style to the
selected text. To resolve this situation we need to ask the user what to do. We chose to
add two menu items which allow the user to either update or reapply the current selection.



    The menu items to perform these tasks are titled Update and Reapply, and are
grouped into the Style menu. The Style menu is added to the Format menu. The
Update menu item receives an ActionListener which retrieves the text attributes of the
character element at the current caret position, and assigns them to the selected style.
The Reapply menu item receives an ActionListener which applies the selected style to the
selected text (one might argue that this menu item would be more appropriately titled
Apply – the implications are ambiguous either way).

   Our showAttributes() method receives additional code to manage the new styles
combobox, m_cbStyles, when the caret moves through the document. It retrieves the style
corresponding to the current caret position with StyledDocument's getLogicalStyle() method,
and selects the appropriate entry in the combobox.

    The new showStyles() method is called to populate the m_cbStyles combobox with
the style names from a newly created or loaded document. First it removes the current
content of the combobox if it is not empty (another workaround due to the fact that
if you call removeAllItems() on an empty JComboBox, an exception will be thrown).
An Enumeration of style names is then retrieved with StyledDocument's getStyleNames()
method, and these names are added to the combobox.
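
Collecting the style names for the combobox might be sketched as below. The class name StyleNamesSketch is hypothetical, the names are gathered into a sorted list instead of being added to a JComboBox, and the example calls getStyleNames() on DefaultStyledDocument, which is the class that declares it in the standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import javax.swing.text.DefaultStyledDocument;

// Hypothetical class, used only for illustration.
class StyleNamesSketch {

    /** Returns the names of all styles declared in the document, sorted alphabetically. */
    static List<String> styleNames(DefaultStyledDocument doc) {
        List<String> names = new ArrayList<>();
        Enumeration<?> e = doc.getStyleNames();
        while (e.hasMoreElements()) {
            names.add(e.nextElement().toString());
        }
        Collections.sort(names); // sorting is a choice made for this example
        return names;
    }

    public static void main(String[] args) {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        doc.addStyle("Heading", null);
        doc.addStyle("Body", null);
        // In the word processor each name would be passed to the combobox's addItem().
        System.out.println(styleNames(doc));
    }
}
```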

    Open an existing RTF file, and note how the styles combobox is populated by the
style names defined in this document. Verify that the selected style is automatically
updated while the caret moves through the document. Select a portion of text and select
a different style from the styles combobox. Note how all text properties are updated
according to the new style.

    Try selecting a portion of text and modifying its attributes (for instance, foreground
color). Type a new name in the styles combobox and press Enter. This will create a new
style which can be applied to any other document text.

   Figure 4.5 shows a snapshot of the RTF Viewer.




4.6     PDF Viewer
Portable Document Format (PDF) is a file format used to represent documents in a man-
ner independent of application software, hardware, and operating systems. Each PDF
file encapsulates a complete description of a fixed-layout flat document, including the

                                Figure 4.5: RTF Viewer




text, fonts, graphics, and other information needed to display it. In 1991, Adobe Systems
co-founder John Warnock outlined a system called "Camelot" that evolved into PDF.

The PDF viewer in our application is built on JPedal, a Java PDF library. JPedal is
described by its developers as a powerful and fully featured Java PDF viewer that is simple
to set up, integrate and customise. It is multi-platform and includes PDF search, extraction
and printing support. Viewing PDF is the central feature of the library rather than an
add-on, and its developers state that the viewer has been under development for over
10 years.

Class PDFViewer
The PDF viewer is constructed by passing the full file name to JPanelDemo(String name).
FontMappings.setFontReplacements() ensures that non-embedded fonts map to sensible
replacements. The file name is stored for use in the page changer, after which the viewer
opens the PDF and reads its internal details. The boolean checkEncryption() method
checks whether encryption is present and ascertains the password, returning true if the
content is accessible.

If the file is encrypted with a null password, it will already have been decoded and
isFileViewable will return true. If a password is needed, a popup window asks for it and
the viewer then tries to reopen the file with the new password. The class can also be run
as a standalone program, in which case the user may pass in the name of the file as an
option.

   Figure 4.6 shows a snapshot of the PDF Viewer.


                               Figure 4.6: PDF Viewer




4.7     Text Search
The Text Search utility is built using a Java regular expression library called
Swinggrep.

Functionality
1. Showing each file path only once, giving more space for the lines found.
2. Searching all subdirectories recursively (this feature can be disabled with a checkbox).
3. Highlighting every match in a line, not just the first one.
4. Better colors.
5. Statistics about matches.
6. Use of Swing rather than AWT for graphics.
7. The ability to save results.
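
Finding every match in a line, rather than only the first, can be sketched with java.util.regex as shown below. This is an illustration using the standard library, not the Swinggrep code itself; the class name LineMatchSketch and the method findAll() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class, used only for illustration.
class LineMatchSketch {

    /** Returns the {start, end} offsets of every match of the pattern in one line. */
    static List<int[]> findAll(Pattern pattern, String line) {
        List<int[]> spans = new ArrayList<>();
        Matcher matcher = pattern.matcher(line);
        // find() resumes after the previous match, so each match is visited once.
        while (matcher.find()) {
            spans.add(new int[] {matcher.start(), matcher.end()});
        }
        return spans;
    }

    public static void main(String[] args) {
        for (int[] span : findAll(Pattern.compile("ab"), "xxabyyabab")) {
            System.out.println("match at [" + span[0] + ", " + span[1] + ")");
        }
    }
}
```

Each returned span could then be used to highlight the corresponding range in the results pane.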



Class GrepFrame
First we create a main class that provides a platform for input actions through JButton,
JFileChooser and JTextArea components. It is a frame that searches the contents of a set
of files with pathnames based on the current directory (as per 'grep') and displays the
results.

   Regex SearchPattern;
The above statement declares the pattern we are searching for. After that we create the
main window that performs the search action.
   GrepFrame MainFrame;

   String FilePathnameString = new String();
The above statement declares the pathname of the file currently being used, which
provides the absolute path of the file.
   GrepDirectoryWalker GrepSubDirWalker = new GrepDirectoryWalker();
This is the walker that performs the search on each directory. The document into which
the results of our search are placed must be declared after GrepSubDirWalker, because it
depends on the latter.
   JTextPane ResultsTextPane = new JTextPane(ResultsDocument);
This is the results display area itself.

The buildFirstLine() method constructs and returns a new container holding the main
components of the first line of the screen. The buildSecondLine() method does the same
for the main components of the second line of the screen. The setResultsFont() method
sets the font used for displaying the results; the size of the font is determined from that
of the current Metal user text font. The createDefaultSaveFileName() method constructs a
file name of the form 'grepyyyymmddpattern.txt' and returns it as the default file name
into which to save the current results.





The showExceptionDialog(Exception InputException) method pops up a dialog that displays
the message of the given exception. The getResultsString() method returns the results of
the last search as a string. Finally, we create a class GrepFrameListener to receive frame
events, which gives us the opportunity to set the focus to the right field.
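
A default save file name of the form 'grepyyyymmddpattern.txt' could be built as in the sketch below. The class and method names are hypothetical, and stripping everything except letters and digits from the pattern is an assumption made here so that the result is a valid file name; the report does not say how the original handles such characters.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical class, used only for illustration.
class SaveNameSketch {
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Builds 'grep' + date + cleaned pattern + '.txt'. */
    static String defaultSaveFileName(LocalDate date, String pattern) {
        // Assumption: drop characters that are not letters or digits.
        String safePattern = pattern.replaceAll("[^A-Za-z0-9]", "");
        return "grep" + date.format(DATE) + safePattern + ".txt";
    }

    public static void main(String[] args) {
        System.out.println(defaultSaveFileName(LocalDate.now(), "foo.*"));
    }
}
```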

Figure 4.7 shows a snapshot of Text Search.




                                  Figure 4.7: Text search




4.8      File Search
The File Search utility searches only for the files the user has requested; it also provides
the absolute path of each file.


Class FileSearch
A search engine is an information retrieval system designed to help find information
stored on a computer system. The search results are usually presented in a list and are
commonly called hits. Search engines help to minimize the time required to find informa-
tion and the amount of information which must be consulted, akin to other techniques
for managing information overload.
First we create a main class that provides a platform for input actions through JButton,
JFileChooser and JTextArea components. We then set the main path under which the file
is to be searched.
Figure 4.8 shows a snapshot of File Search.
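
A search that starts from a main path and reports absolute paths might be sketched as follows. This is not the FileSearch class itself: the class name, the case-insensitive substring match on the file name, and the demoTree() helper that creates a small temporary directory are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class, used only for illustration.
class FileSearchSketch {

    /** Returns the absolute paths of all regular files under root whose name contains namePart. */
    static List<String> search(Path root, String namePart) {
        String needle = namePart.toLowerCase();
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().contains(needle))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: creates a temporary directory tree with three empty files. */
    static Path demoTree() {
        try {
            Path root = Files.createTempDirectory("filesearch");
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(root.resolve("report.txt"));
            Files.createFile(root.resolve("notes.txt"));
            Files.createFile(sub.resolve("Report-final.pdf"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        search(demoTree(), "report").forEach(System.out::println);
    }
}
```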


                                Figure 4.8: File search





4.9     Image Viewer
Class ImageViewer
An image viewer or image browser is a computer program that can display stored
graphical images; it can often handle various graphics file formats. Such software usually
renders the image according to properties of the display such as color depth, display
resolution, and color profile.

   Image image;
   boolean scaled;
   private JTextArea log;
These are the variables which are necessary to display images. A file filter restricts
the file chooser to GIF and JPG image files:
   filter = new ExtensionFileFilter(
           new String[] {"gif", "GIF", "jpg", "JPG", "jpeg", "JPEG"},
           "GIF and JPG image files");
   chooser.addChoosableFileFilter(filter);
   chooser.removeChoosableFileFilter(chooser.getAcceptAllFileFilter());
Figure 4.9 shows a snapshot of the Image Viewer.
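
ExtensionFileFilter above is a custom class. As a point of comparison only, the same restriction can be sketched with the standard javax.swing.filechooser.FileNameExtensionFilter, which matches extensions case-insensitively. The class name ImageFilterSketch is hypothetical, and this is not claimed to be what the application uses.

```java
import java.io.File;
import javax.swing.filechooser.FileNameExtensionFilter;

// Hypothetical class, used only for illustration.
class ImageFilterSketch {

    /** Builds a filter accepting GIF and JPG files (and directories, for navigation). */
    static FileNameExtensionFilter imageFilter() {
        return new FileNameExtensionFilter("GIF and JPG image files", "gif", "jpg", "jpeg");
    }

    public static void main(String[] args) {
        FileNameExtensionFilter filter = imageFilter();
        System.out.println("photo.JPG accepted: " + filter.accept(new File("photo.JPG")));
        System.out.println("notes.txt accepted: " + filter.accept(new File("notes.txt")));
    }
}
```

Such a filter can be passed to JFileChooser's addChoosableFileFilter() in the same way as the custom one.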


                                Figure 4.9: Image Viewer




Chapter 5

Results and analysis

The Digital Reader supports multiple document formats on a single platform and works in
an architecture-independent manner, because the application is developed in Java. In
order to analyse our application, it has to undergo various kinds of testing and
analysis.


5.1     Testing
Software testing is an investigation conducted to provide stakeholders with information
about the quality of the product or service under test. Software testing can also provide
an objective, independent view of the software to allow the business to appreciate and
understand the risks of software implementation. Test techniques include, but are not
limited to, the process of executing a program or application with the intent of finding
software bugs (errors or other defects).
Software testing can be stated as the process of validating and verifying that a software
program/application/product:
1. meets the requirements that guided its design and development.
2. works as expected.
3. can be implemented with the same characteristics.
4. satisfies the needs of stakeholders.

    Software testing, depending on the testing method employed, can be implemented
at any time in the development process. However, most of the test effort traditionally
occurs after the requirements have been defined and the coding process has been
completed. In Agile approaches, by contrast, most of the test effort is ongoing.
As such, the methodology of the test is governed by the software development
methodology adopted.



Different software development models will focus the test effort at different points in
the development process. Newer development models, such as Agile, often employ test
driven development and place an increased portion of the testing in the hands of the
developer, before it reaches a formal team of testers. In a more traditional model, most
of the test execution occurs after the requirements have been defined and the coding
process has been completed.




5.2      Types of Testing
   • Black box testing

   • White box testing

   • Unit testing

   • Integration testing

   • Functional testing

   • System testing

   • Regression testing

   • Acceptance testing

   • Load testing

   • Alpha testing

   • Beta testing


5.3      Testing with JUnit using Eclipse
As our application has been developed in Java, we use JUnit to perform unit testing.
The following steps are performed to test with JUnit in Eclipse.



5.3.1     Create a Java class
In the "src" folder, we create the digitalreader.test package. The class under test is
PDFViewer, whose checkEncryption() method checks whether encryption is present,
ascertains the password, and returns true if the content is accessible.


package org.jpedal.examples.jpaneldemo;

import java.awt.BorderLayout;
import java.awt.Component;
import java.awt.Container;
import javax.swing.JFrame;

public class PDFViewer extends JFrame {
    ....
    private boolean checkEncryption();
    ....
}


5.3.2    Create a JUnit test
Right-click on the new class in the Package Explorer and select New, then JUnit Test Case.
Select "New JUnit 4 test" and set the source folder to "test", so that the test class gets
created in this folder.
Press "Next" and select the methods which have to be tested. If the JUnit library is not
part of the classpath, Eclipse will prompt you to add it.
Create a test with the following code.



package digitalreader.test;

import org.junit.Test;
import static org.junit.Assert.assertEquals;
import org.jpedal.examples.jpaneldemo.PDFViewer;

public class PDFViewerTest {
    @Test
    public void testcheckEncryption() {
        PDFViewer tester = new PDFViewer();
        assertEquals("Result", password, tester.checkEncryption());
    }
}


5.3.3    Run test via Eclipse
Right-click on the new test class and select Run As, then JUnit Test. The result of the
tests will be displayed in the JUnit view. The test initially failed (indicated by a red bar)
because the viewer did not display any prompt to enter the password.




    After we fixed that problem and re-ran the test, we got a green bar. If there are
several tests, they can be combined into a test suite. Running a test suite will execute
all tests in that suite.

To create a test suite, select the test classes, right-click on them, and select New, then
Other, then JUnit, then Test Suite. Select "Next" and select the methods for which to
create tests. We tested the other classes in a similar way.




Chapter 6

Conclusions and Future
Enhancements

The Digital Reader application enables users to view several document formats using a
single software application. The application has two utilities, File Search and Text Search.
Using these, users can search for the files they require, and can also search for a
particular word among a large number of files contained in a directory. Because the
application is developed in Java, it can be used on virtually any platform.

    Finally, we conclude that our application is unique, as few if any applications are
available to view more than one document format using a single application.





TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Digital Reader

  • 1. Government Engineering College, Hassan 573 201
Visvesvaraya Technological University, Belgaum
Project Report on Digital Reader
4GH08CS010 Chethan.H.A
4GH09CS402 Gowtham.A.M
4GH08CS034 Pavan.P.Naik
4GH08CS058 Yogesh.K.S
Under the Guidance of Mr. Annaiah H, B.E., M.Tech., Asst. Professor, Dept. of Computer Science & Engineering, GEC, Hassan
Department of Computer Science & Engineering, Government Engineering College Hassan
June, 2012
  • 2. Government Engineering College, Hassan 573 201
Visvesvaraya Technological University, Belgaum
Certificate
This is to certify that the project work entitled “Digital Reader” is a bona fide work carried out by Chethan H.A (4GH08CS010), Gowtham A.M (4GH09CS402), Pavan P Naik (4GH08CS034) and Yogesh K.S (4GH08CS058) in partial fulfillment of the award of the degree of Bachelor of Engineering in Computer Science & Engineering of Visvesvaraya Technological University, Belgaum, during the year 2011 - 2012. It is certified that all corrections / suggestions indicated during internal evaluation have been incorporated in the report. The project report has been approved as it satisfies the academic requirements in respect of the project work prescribed for the Bachelor of Engineering Degree.
Guide: Mr. Annaiah H, B.E., M.Tech., Assistant Professor, Dept of CS & E, GEC, Hassan 573 201
Head of the Department: Dr. K.C. Ravishankar, B.E., M.Tech., Ph.D., Professor and Head, Dept of CS & E, GEC, Hassan 573 201
Principal: Dr. Karisiddappa, B.E., M.Tech., Ph.D., Principal, GEC, Hassan 573 201
Examiners: 1. 2.
Date:
Place: Hassan
  • 3. Acknowledgement
At the outset we express our sincere thanks to our Guide, Annaiah H, Asst. Professor, Department of CS & E, for his continuous support and advice, not only during the course of our project but also throughout our stay in GECH. We express our gratitude to Dr. K.C. Ravishankar, Professor and Head, Department of CS & E, for his encouragement and support throughout our work. We wish to express our thanks to our beloved Principal, Dr. Karisiddappa, for his encouragement throughout our studies. Finally, we express our gratitude to all teaching and non-teaching staff of the Dept. of CS & E, our fellow classmates and our parents for their timely support and suggestions.
Chethan H.A, Gowtham A.M, Pavan P Naik, Yogesh K.S
  • 4. Table of Contents
Table of Contents
List of Figures
Abstract
1 Introduction
1.1 Problem Statement
1.2 Objectives
1.3 Motivation
1.4 Applications
1.5 Organization of the report
2 Document Readers
2.1 Overview
3 Requirement Analysis
3.1 Literature Survey
3.2 Technologies Used
3.3 JPedal Library
3.4 Eclipse
3.5 Java Runtime Environment (JRE)
3.6 Hardware Requirements
3.7 Software Requirements
4 Design and Implementation
4.1 Model-View-Controller (MVC)
4.2 Developing an HTML Viewer
4.3 Developing an XML Viewer
4.4 Developing a Text Viewer
4.5 Developing an RTF Viewer
4.6 PDF Viewer
  • 5. 4.7 Text Search
4.8 File Search
4.9 Image Viewer
5 Results and analysis
5.1 Testing
5.2 Types of Testing
5.3 Testing with JUnit using Eclipse
6 Conclusions and Future Enhancements
References
List of Figures
3.1 Workspace launcher
3.2 Creating New Project
3.3 Creating Package
3.4 Create Java class
3.5 Run Java Project
3.6 Creating the JAR file
3.7 Adding a library (.jar) to project
4.1 MVC Architecture
4.2 HTML Viewer
4.3 XML Viewer
4.4 Text Viewer and editor
4.5 RTF Viewer
4.6 PDF Viewer
4.7 Text search
4.8 File search
4.9 Image Viewer
  • 6. Abstract
Digital Reader is application software for viewing, editing and managing files in Text (.txt), Portable Document Format (.pdf), Rich Text Format (.rtf), HyperText Markup Language (.html) and Extensible Markup Language (.xml) formats, as well as image file formats such as JPG. Digital Reader can run on any operating system and lets users view and edit .txt, .pdf, .rtf, .html and .xml files and a few image file formats. The graphical user interface of Digital Reader can be changed to suit the user. The user can select a text format to view or edit within this single platform, which simplifies the viewing and editing process. Digital representations are discrete, and the information represented can be discrete as well, such as numbers, letters, computer icons or images.
Digital Reader is a multi-format digital document reader built in the platform-independent Java language, with a user interface designed using the Swing packages. Several readers are available that support one or another readable document format, but none supports multiple formats such as PDF, RTF, HTML, XML and text. This software enables the user to read multiple document formats with a single application. Java was chosen as the language for developing Digital Reader because it is platform independent and supports rich user interface design through its Swing package. Because one application can handle several document formats, the system resources otherwise spent launching a different application for each document are saved.
  • 7. Chapter 1 Introduction
Digital documents are now used more widely than paper documents because they save both time and money and are also environmentally friendly, since making paper involves cutting down trees. Where digital documents are created, there must be a digital reader for reading them and for using them for other purposes such as printing and editing. Several applications are available as readers for this purpose. The readers available differ from format to format, so many applications exist for viewing documents. This application saves the system resources otherwise spent launching different applications for different documents, because a single application can handle several document formats.
A filename extension is a suffix (separated from the base filename by a dot) to the name of a computer file, applied to indicate the encoding (file format) of its contents or usage. Examples of filename extensions are .htm, .pdf, .xml and .txt. Filename extensions can be considered a type of metadata. They are commonly used to imply information about the way data might be stored in the file.
Formatted text, styled text or rich text, as opposed to plain text, has styling information beyond the minimum of semantic elements: colours, styles (boldface, italic), sizes and special features (such as hyperlinks). Formatted text cannot rightly be identified with binary files, nor treated as distinct from ASCII text. This is because formatted text is not necessarily binary: it may be text-only, such as HTML, RTF or enriched text files, and it may be ASCII-only. Conversely, a plain text file may be non-ASCII (in an encoding such as Unicode UTF-8).
  • 8. Text-only formatted text is achieved by markup, which is itself textual, while some editors of formatted text, such as Microsoft Word, save in a binary format.
For viewing Portable Document Format (PDF) files there are readers such as Adobe Acrobat, Foxit Reader and Nitro PDF Reader. For viewing Rich Text Format (RTF) documents there is software such as WordPad, AbiWord and MS Word. Similarly, various other software exists for viewing other document formats.
Adobe Acrobat is a family of application software developed by Adobe Systems to view, create, manipulate, print and manage files in Portable Document Format (PDF). All members of the family except Adobe Reader (formerly Acrobat Reader) are proprietary commercial software, while Adobe Reader is available as freeware and can be downloaded from Adobe’s web site. Adobe Reader enables users to view and print PDF files but has negligible PDF creation capabilities. Acrobat and Reader are widely used as a way to present information with a fixed layout similar to a paper publication.
Microsoft Office Word is a word processor designed by Microsoft. It was first released in 1983 under the name Multi-Tool Word for Xenix systems. Subsequent versions were later written for several other platforms including IBM PCs running DOS (1983), the Apple Macintosh (1984), the AT&T Unix PC (1985), Atari ST (1986), SCO UNIX, OS/2, and Microsoft Windows (1989). It is a component of the Microsoft Office software system; it is also sold as a standalone product and included in Microsoft Works Suite.
1.1 Problem Statement
There is always a need for an application that can open virtually any document format, including PDF, TXT, RTF, HTML, XML and image files, and that can be used on any operating system, provided a Java Runtime Environment (JRE) exists on the host operating system.
1.2 Objectives
To present a simple application called Digital Reader (DR) that supports the following view modes and corresponding file formats and that works on different operating systems and platforms.
The aim of this project is to create an application for reading multiple document formats.
• Text/Binary: all files (unlimited size).
• RTF: RTF/UTF-8 encoded files.
• Images: JPG.
• Internet: all file types supported by Web browsers (HTML, XML, etc.).
• Portable Document Format (PDF) with a suitable view.
• Support for a Microsoft Excel viewer.
  • 9. Digital Reader is Unicode-compatible and can be opened with a click of the mouse.
1.3 Motivation
No application was available that supported most document formats; this was our motivation to develop this kind of application.
1.4 Applications
Some of the applications can be listed as follows:
• Support for multiple document formats such as text, PDF, RTF, HTML and XML.
• Additional utilities such as File search and Text search.
• Comparators for editable and non-editable documents.
• Editors with an Enable/Disable option.
1.5 Organization of the report
A brief description, or bird's-eye view, of the remaining report topics.
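The view modes listed in the Objectives can be tied to the filename extensions discussed in the Introduction. The following is only a minimal sketch of that idea: the class `ViewModeChooser`, the `ViewMode` names and the extension-to-mode table are hypothetical and are not taken from the Digital Reader source, which this report does not show at this point.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: picks a view mode from a filename extension. */
final class ViewModeChooser {

    /** View modes named after the formats listed in the Objectives. */
    enum ViewMode { TEXT, RTF, IMAGE, INTERNET, PDF }

    // Assumed mapping; the report does not say how Digital Reader dispatches files.
    private static final Map<String, ViewMode> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("txt", ViewMode.TEXT);
        BY_EXTENSION.put("rtf", ViewMode.RTF);
        BY_EXTENSION.put("jpg", ViewMode.IMAGE);
        BY_EXTENSION.put("htm", ViewMode.INTERNET);
        BY_EXTENSION.put("html", ViewMode.INTERNET);
        BY_EXTENSION.put("xml", ViewMode.INTERNET);
        BY_EXTENSION.put("pdf", ViewMode.PDF);
    }

    /** Returns the text after the last dot, lower-cased, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Unknown extensions fall back to TEXT, mirroring the "Text/Binary: all files" mode. */
    static ViewMode choose(String fileName) {
        ViewMode mode = BY_EXTENSION.get(extensionOf(fileName));
        return mode != null ? mode : ViewMode.TEXT;
    }

    public static void main(String[] args) {
        String[] samples = {"report.PDF", "index.html", "notes.txt", "archive.bin"};
        for (String name : samples) {
            System.out.println(name + " -> " + choose(name));
        }
    }
}
```

Since an extension only implies the format, a real viewer would probably still have to cope with files whose contents do not match their names.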
  • 10. Chapter 2 Document Readers
2.1 Overview
A Document Reader is application software that presents the data stored in a computer file in a human-friendly form. The file contents are generally displayed on the screen, or they may be printed. They may also be read aloud using speech synthesis.
Document Readers generally do not edit files, yet it is common for them to be able to save data in a different file format, or to copy information from the viewed file to the system-wide clipboard. They must have sufficient knowledge about the format of the file to be viewed. Even plain text files are not so simple; file viewers may have to handle different code pages and newline styles.
A fundamental type of Document Reader is a filter that translates binary files into plain text (one example is antiword). However, depending on the competence of the translating routines, some information may be lost. Disassemblers also fall in this category.
Another common type of Document Reader is a picture viewer that can display picture files of various formats. Common features here are thumbnail preview and creation, and image zooming.
For more complex or proprietary file formats, file viewers are usually provided by the same companies that make the editing software using those formats (viewers may be distributed free of charge, while editors have to be bought).
A Document Reader is limited-functionality software in the sense that it does not have the capability to create a file or to modify the content of an existing one. Instead, it can be used to display or print the content. The primary reason behind this missing functionality is often marketing and control. For example, a popular software program, Adobe Acrobat, can be used to create content for most computer platforms, under various operating systems. To ensure that people can access the documents created with Adobe Acrobat, the software publisher created a viewer program, the Acrobat Reader, and made it available for free.
  • 11. This viewer application allows the content created by the proprietary authoring software to be readable on all supported operating-system platforms, free of charge, thus making it a more attractive solution.
There are many products that can qualify as a file viewer: Microsoft Word viewer, Microsoft PowerPoint viewer, and the OpenOffice equivalents are examples. In a sense, a web browser is a type of Document Reader, which translates, or renders, HTML markup into a human-friendly presentation. Although HTML is plain text, viewing an HTML file in a browser and in a text editor produces significantly different results.
Although web browsers are arguably the best Document Readers, since they support many graphic, multimedia and document formats, they are likely to always be somewhat lacking in output quality and feature performance compared with more specialized software packages. Creating and using alternative publishing systems and their accompanying Document Readers still makes a lot of business sense, given the content and presentation control they provide.
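The remark above that even plain text files require handling of code pages and newline styles can be made concrete with a short sketch. The class and method names (`PlainTextDecoder`, `decode`) are illustrative assumptions and do not describe how Digital Reader itself reads text; the sketch simply decodes bytes with a caller-supplied charset and converts Windows (CRLF) and old Mac (CR) line endings to LF.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing the two concerns named in the text: code pages and newline styles. */
final class PlainTextDecoder {

    /** Decodes raw bytes with the given charset and converts CRLF and lone CR to LF. */
    static String decode(byte[] raw, Charset charset) {
        String text = new String(raw, charset);
        // Replace CRLF first so that a pair does not turn into two line feeds.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // The same byte 0xE9 is a complete character in ISO-8859-1 but malformed on its own in UTF-8.
        byte[] raw = {'c', 'a', 'f', (byte) 0xE9, '\r', '\n'};
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
        System.out.println(decode(raw, StandardCharsets.UTF_8));
    }
}
```

The second call in `main` shows why the viewer needs to know the encoding: decoding the same bytes with the wrong charset silently yields a different character.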
  • 12. Chapter 3 Requirement Analysis
3.1 Literature Survey
To study and analyze the various document formats, the following literature survey was carried out and is discussed in this chapter.
3.1.1 Portable Document Format (PDF)
Portable Document Format (PDF) is an open standard for document exchange. This file format, created by Adobe Systems in 1993, is used for representing documents in a manner independent of application software, hardware and operating systems. Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics and other information needed to display it.
While the PDF specification has been available free of charge since at least 2001, PDF was originally a proprietary format controlled by Adobe. It was officially released as an open standard on July 1, 2008, and published by the International Organization for Standardization as ISO 32000-1:2008. In 2008, Adobe published a Public Patent License to ISO 32000-1 granting royalty-free rights for all patents owned by Adobe that are necessary to make, use, sell and distribute PDF-compliant implementations.
3.1.2 Rich Text Format (RTF)
The Rich Text Format (often abbreviated RTF) is a proprietary document file format with a published specification, developed by Microsoft Corporation since 1987 for Microsoft products and for cross-platform document interchange.
  • 13. Most word processors are able to read and write some versions of RTF. There are several different revisions of the RTF specification, and the portability of files depends on which version of RTF is being used. RTF specifications are changed and published with major Microsoft Word and Office versions.
RTF should not be confused with enriched text (MIME type “text/enriched” of RFC 1896) or its predecessor Rich Text (MIME type “text/richtext” of RFC 1341 and 1521), nor with IBM’s RFT-DCA (Revisable Format Text-Document Content Architecture); these are completely different specifications.
3.1.3 Hypertext Markup Language (HTML)
HTML is the markup language used for most web pages. E-books using HTML can be read using a Web browser. The specifications for the format are available without charge from the W3C.
HTML adds specially marked meta-elements to otherwise plain text encoded using character sets like ASCII or UTF-8. As such, suitably formatted files can be, and sometimes are, generated by hand using a plain text editor or programmer’s editor. Many HTML generator applications exist to ease this process and often require less intricate knowledge of the format details involved.
HTML on its own is not a particularly efficient format to store information in, requiring more storage space for a given work than many other formats. However, several e-Book formats including the Amazon Kindle, Open eBook, Compressed HM, Mobipocket and EPUB store each book chapter in HTML format, then use ZIP compression to compress the HTML data, images, metadata and style sheets into a single, significantly smaller, file.
HTML files encompass a wide range of standards, and displaying HTML files correctly can be complicated. Additionally, many of the features supported, such as forms, are not relevant to e-books.
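The point that e-book formats recover HTML's storage overhead through ZIP compression can be checked with a small experiment. The sketch below is an illustration only: the class `HtmlZipDemo`, the entry name `chapter1.html` and the generated sample chapter are all made up for the demonstration, and it uses the standard `java.util.zip` classes rather than any particular e-book toolchain.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical demo: packs one HTML chapter into an in-memory ZIP archive. */
final class HtmlZipDemo {

    /** Returns the bytes of a ZIP archive holding a single entry with the given HTML. */
    static byte[] zipChapter(String entryName, String html) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(html.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Builds a deliberately repetitive sample chapter (made-up content). */
    static String sampleChapter(int paragraphs) {
        StringBuilder sb = new StringBuilder("<html><body>");
        for (int i = 0; i < paragraphs; i++) {
            sb.append("<p class=\"body\">Paragraph ").append(i).append(" of the chapter.</p>");
        }
        return sb.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        String html = sampleChapter(500);
        byte[] zipped = zipChapter("chapter1.html", html);
        System.out.println("raw bytes:    " + html.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("zipped bytes: " + zipped.length);
    }
}
```

Markup is repetitive by nature (the same tags and attributes recur in every paragraph), which is why the compressed size is typically much smaller than the raw HTML.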
3.1.4 Text Document (.txt)
A text file (sometimes spelled “textfile”; an old alternate name is “flatfile”) is a kind of computer file that is structured as a sequence of lines of electronic text. A text file exists within a computer file system. The end of a text file is often denoted by placing one or more special characters, known as an end-of-file marker, after the last line in a text file.
  • 14. “Text file” refers to a type of container, while plain text refers to a type of content. Text files can contain plain text, but they are not limited to such. At a generic level of description, there are two kinds of computer files: text files and binary files.
3.1.5 Extensible Markup Language (.xml)
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is defined in the XML 1.0 Specification produced by the W3C, and in several other related specifications, all of them free open standards.
The design goals of XML emphasize simplicity, generality and usability over the Internet. It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.
Many application programming interfaces (APIs) have been developed for software developers to use to process XML data, and several schema systems exist to aid in the definition of XML-based languages.
As of 2009, hundreds of XML-based languages have been developed, including RSS, Atom, SOAP and XHTML. XML-based formats have become the default for many office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple’s iWork. XML has also been employed as the base language for communication protocols, such as XMPP.
3.1.6 Image viewer
An image viewer or image browser is a computer program that can display stored graphical images; it can often handle various graphics file formats. Such software usually renders the image according to properties of the display such as color depth, display resolution and color profile.
Although a full-featured bitmap graphics editor (such as Photoshop, the GIMP or StylePix) may be used as an image viewer, such editors have many editing functions that are not needed for just viewing images, and therefore usually start rather slowly.
  • 15. Also, most viewers have functions that editors usually lack, such as stepping through all the images in a directory (possibly as a slideshow).
Image viewers give maximal flexibility to the user by providing a direct view of the directory structure available on a hard disk. Most image viewers do not provide any kind of automatic organization of pictures, and therefore the burden remains on the user to create and maintain their folder structure (using tag- or folder-based methods). However, some image viewers also have features for organizing images, especially an image database, and hence can also be used as image organizers.
Some image viewers, such as the Windows Photo Viewer that comes with Windows operating systems, change a JPEG image if it is rotated, resulting in loss of image quality; others offer lossless rotation.
3.1.7 File Search
Desktop search is the name for the field of search tools that search the contents of a user’s own computer files, rather than searching the Internet. These tools are designed to find information on the user’s PC, including web browser histories, e-mail archives, text documents, sound files, images and video.
One of the main advantages of desktop search programs is that search results arrive in a few seconds; Windows search companion “can be some help, but it searches through Windows files and folders only, not e-mail or contact databases, and unless you enable the Indexing Service (in Windows 2000 or XP), the Windows search tool is extremely slow.” Windows Vista enables the indexing service by default.
Desktop search is emerging as a concern for large firms for two main reasons: untapped productivity and security.
A commonly cited statistic states that 80 per cent of a company’s data is locked up inside unstructured data: the information stored on an end user’s PC, the files and directories they have created on a network, documents stored in repositories such as corporate intranets, and a multitude of other locations. Moreover, many companies have structured or unstructured information stored in older file formats to which they do not have ready access.
Desktop search engines build and maintain an index database to achieve reasonable performance when searching several gigabytes of data. Indexing usually takes place when the computer is idle, and most search applications can be set to suspend it if a portable computer is running on batteries, in order to save power.
  • 16. When indexing the files, desktop search tools collect three types of information about files:
• file and directory names.
• metadata, such as titles, authors and comments in file types such as MP3, PDF and JPEG.
• content of supported documents.
Besides programs that use indexing, there are many programs that open and search files instantly. Their disadvantage is that they can search only a certain directory, not the entire computer, but their great advantage is that they do not load the resources of the computer with indexing. Furthermore, they always use the current status of the documents.
3.1.8 Text Search
A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.
The most public, visible form of a search engine is a Web search engine, which searches for information on the World Wide Web. Search engines provide an interface to a group of items that enables users to specify criteria about an item of interest and have the engine find the matching items. The criteria are referred to as a search query. In the case of text search engines, the search query is typically expressed as a set of words that identify the desired concept that one or more documents may contain. There are several styles of search query syntax that vary in strictness. It can also switch names within the search engines from previous sites.
Whereas some text search engines require users to enter two or three words separated by white space, other search engines may enable users to specify entire documents, pictures, sounds, and various forms of natural language. Some search engines apply improvements to search queries to increase the likelihood of providing a quality set of items through a process known as query expansion.
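The two search sections above can be illustrated together with a non-indexed search, that is, one that opens the files of a single directory and scans them on demand. This is a minimal sketch under stated assumptions: the names `SimpleTextSearch`, `searchLines` and `searchDirectory` are hypothetical, the query is treated as a set of words that must all occur on the same line, matching is case-insensitive, and files are assumed to be UTF-8 text. It is not the search algorithm used by Digital Reader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

/** Hypothetical non-indexed search over the text files of one directory tree. */
final class SimpleTextSearch {

    /** Returns the 1-based numbers of the lines that contain every word of the query. */
    static List<Integer> searchLines(List<String> lines, String query) {
        List<Integer> hits = new ArrayList<>();
        if (query.trim().isEmpty()) {
            return hits; // an empty query matches nothing
        }
        String[] words = query.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).toLowerCase(Locale.ROOT);
            boolean allPresent = true;
            for (String word : words) {
                if (!line.contains(word)) {
                    allPresent = false;
                    break;
                }
            }
            if (allPresent) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    /** Scans every regular file under dir and maps each matching file to its hit lines. */
    static Map<Path, List<Integer>> searchDirectory(Path dir, String query) {
        Map<Path, List<Integer>> result = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.filter(Files::isRegularFile).forEach(file -> {
                try {
                    List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
                    List<Integer> hits = searchLines(lines, query);
                    if (!hits.isEmpty()) {
                        result.put(file, hits);
                    }
                } catch (IOException e) {
                    // Unreadable or non-UTF-8 (for example binary) file: skipped in this sketch.
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        String query = args.length > 1 ? args[1] : "digital reader";
        searchDirectory(dir, query).forEach((file, hits) -> System.out.println(file + " " + hits));
    }
}
```

As the text notes for non-indexed tools, this approach always reflects the current state of the files but does all of its work at query time, so it only suits a limited directory.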
  • 17. 3.1.9 File comparison
File comparison in computing compares the contents of computer files, finding their common contents and their differences. The result of the comparison may be presented in a graphical user interface or used as part of larger tasks in networks, file systems or revision control.
Some widely used file comparison programs are diff, cmp, FileMerge, Araxis Merge, WinMerge, Beyond Compare and Microsoft File Compare. Many text editors and word processors perform file comparison to highlight the changes to a document.
3.2 Technologies Used
3.2.1 Java programming language
Java is a programming language originally developed by James Gosling at Sun Microsystems (which has since merged into Oracle Corporation) and released in 1995 as a core component of Sun Microsystems’ Java platform. Java applications are typically compiled to bytecode (class files) that can run on any Java Virtual Machine (JVM) regardless of computer architecture. Java is a general-purpose, concurrent, class-based, object-oriented language that is specifically designed to have as few implementation dependencies as possible. It is intended to let application developers “write once, run anywhere” (WORA), meaning that code that runs on one platform does not need to be recompiled to run on another. Java is currently one of the most popular programming languages in use, particularly for client-server web applications.
There were five primary goals in the creation of the Java language:
  • 18. • It should be simple, object-oriented and familiar.
• It should be robust and secure.
• It should be architecture-neutral and portable.
• It should execute with high performance.
• It should be interpreted, threaded and dynamic.
Java is an object-oriented programming language with a built-in application programming interface (API) that can handle graphics and user interfaces and that can be used to create applications or applets. Because of its rich set of APIs, similar to those of Macintosh and Windows, and its platform independence, Java can also be thought of as a platform in itself. Java also has standard libraries for doing mathematics.
Much of the syntax of Java is the same as that of C and C++. One major difference is that Java does not have pointers. However, the biggest difference is that code in Java must be object oriented: procedural pieces of code can only be embedded in objects. In the following we assume that the reader has some familiarity with a programming language. In particular, some familiarity with the syntax of C/C++ is useful.
Major release versions of Java, along with their release dates:
• JDK 1.0 (January 23, 1996)
• JDK 1.1 (February 19, 1997)
• J2SE 1.2 (December 8, 1998)
• J2SE 1.3 (May 8, 2000)
• J2SE 1.4 (February 6, 2002)
• J2SE 5.0 (September 30, 2004)
• Java SE 6 (December 11, 2006)
• Java SE 7 (July 28, 2011)
Sun has defined and supports four editions of Java targeting different application environments and has segmented many of its APIs so that they belong to one of the platforms. The platforms are:
• Java Card for smartcards.
• Java Platform, Micro Edition (Java ME), targeting environments with limited resources.
• Java Platform, Standard Edition (Java SE), targeting workstation environments.
• Java Platform, Enterprise Edition (Java EE), targeting large distributed enterprise or Internet environments.
The original and reference implementation Java compilers, virtual machines and class libraries were developed by Sun from 1995.
As of May 2007, in compliance with the specifications of the Java Community Process, Sun relicensed most of its Java technologies under the GNU General Public License.
  • 19. Digital Reader Chapter 3
In Java we distinguish between applications, which are programs that perform the same functions as those written in other programming languages, and applets, which are programs that can be embedded in a Web page and accessed over the Internet. Our initial focus will be on writing applications. When a program is compiled, bytecode is produced that can be read and executed by any platform that can run Java.
3.2.2 Swing
Swing is far more than a simple component toolkit. It includes rich undo support, a highly customizable text package, and integrated internationalization and accessibility support. To truly leverage the cross-platform capabilities of the Java platform, Swing supports numerous look and feels, including the ability to create your own look and feel. The ability to create a custom look and feel is made easier with Synth, a look and feel specifically designed to be customized. Swing wouldn't be a component toolkit without the basic user interface primitives such as drag and drop, event handling, customizable painting, and window management.
  • 20. Digital Reader Chapter 3
The Swing toolkit includes a rich set of components for building GUIs and adding interactivity to Java applications. Swing includes all the components you would expect from a modern toolkit: table controls, list controls, tree controls, buttons, and labels.
Swing is part of the Java Foundation Classes (JFC). The JFC also includes other features important to a GUI program, such as the ability to add rich graphics functionality and the ability to create a program that works in different languages and can be used with different input devices.
The following list shows some of the features that Swing and the Java Foundation Classes provide.
Swing GUI Components
The Swing toolkit includes a rich array of components: from basic components, such as buttons and check boxes, to rich and complex components, such as tables and text. Even deceptively simple components, such as text fields, offer sophisticated functionality, such as formatted text input or password field behavior. There are file browsers and dialogs to suit most needs, and where they do not, customization is possible. If none of Swing's provided
  • 21. Digital Reader Chapter 3
components are exactly what you need, you can leverage the basic Swing component functionality to create your own.
Java 2D API
To make your application stand out, convey information visually, or add figures, images, or animation to your GUI, you'll want to use the Java 2D API. Because Swing is built on the 2D package, it is straightforward to make use of 2D within Swing components. Adding images, drop shadows, and compositing is easy with Java 2D.
Pluggable Look-and-Feel Support
Any program that uses Swing components has a choice of look and feel. The classes shipped by Oracle provide a look and feel that matches that of the platform. The Synth package allows you to create your own look and feel. The GTK+ look and feel makes hundreds of existing look and feels available to Swing programs. A program can specify the look and feel of the platform it is running on, or it can specify that it always uses the Java look and feel, and it will work without recompiling. Or you can ignore the issue and let the UI manager sort it out.
Data Transfer
Data transfer, via cut, copy, paste, and drag and drop, is essential to almost any application. Support for data transfer is built into Swing and works between Swing components within an application, between Java applications, and between Java and native applications.
Internationalization
This feature allows developers to build applications that can interact with users worldwide in their own languages and cultural conventions. Applications can be created that accept input in languages that use thousands of different characters, such as Japanese, Chinese, or Korean. Swing's layout managers make it easy to honor a particular orientation required by the UI. For example, the UI will appear right to left in a locale where the text flows right to
  • 22. Digital Reader Chapter 3
left. This support is automatic: you need only code the UI once and it will work for left to right and right to left, as well as honor the appropriate size of components that change as you localize the text.
Accessibility API
People with disabilities use assistive technologies, special software that mediates the user experience for them. Such software needs to obtain a wealth of information about the running application in order to represent it in alternate media: for a screen reader to read the screen with synthetic speech or render it via a Braille display, for a screen magnifier to track the caret and keyboard focus, for on-screen keyboards to present dynamic keyboards of the menu choices, toolbar items, and dialog controls, and for voice control systems to know what the user can control with his or her voice. The accessibility API enables these assistive technologies to get the information they need, and to programmatically manipulate the elements that make up the graphical user interface.
Undo Framework API
Swing's undo framework allows developers to provide support for undo and redo. Undo support is built in to Swing's text component. For other components, Swing supports an unlimited number of actions to undo and redo, and is easily adapted to an application. For example, you could easily enable undo for adding and removing elements from a table.
Flexible Deployment Support
If you want your program to run within a browser window, you can create it as an applet and run it using Java Plug-in, which supports a variety of browsers, such as Internet Explorer, Firefox, and Safari. If you want to create a program that can be launched from a browser, you can do this with Java Web Start. Of course, your application can also run outside of a browser as a standard desktop application.
3.3 JPedal Library
JPedal is a Java library for viewing and manipulating PDF files.
It aims to provide Java developers with complete PDF reader functionality as well as PDF content extraction. The full version of JPedal is distributed under a commercial license, with a cut-down version available under an LGPL license.
  • 23. Digital Reader Chapter 3
The full version of JPedal provides a PDF viewer with multiple page viewing modes, PDF to image conversion, printing, searching and text/image extraction. It can be used as part of a client or server Swing or SWT application, thin client, applet, JavaFX, JSP or Web Start application. The LGPL version lacks some of the functionality of the commercial version, such as printing, extraction and different page display options.
3.4 Eclipse
Eclipse is a multi-language software development environment comprising an integrated development environment (IDE) and an extensible plug-in system. It is written mostly in Java. It can be used to develop applications in Java and, by means of various plug-ins, other programming languages including Ada, C, C++, COBOL, Haskell, Perl, PHP, Python, R, Ruby (including the Ruby on Rails framework), Scala, Clojure, Groovy and Scheme. It can also be used to develop packages for the software Mathematica. Development environments include the Eclipse Java development tools (JDT) for Java, Eclipse CDT for C/C++, and Eclipse PDT for PHP, among others.
3.4.1 Running Eclipse
After installing the Eclipse SDK in a directory, you can start the Workbench by running the Eclipse executable included with the release (you also need a 1.4.2 JRE). On Windows, the executable file is called eclipse.exe, and is located in the eclipse sub-directory of the install. To start Eclipse, double-click on the file eclipse.exe (Microsoft Windows) or eclipse (Linux / Mac) in the directory where you unpacked Eclipse. The system will prompt you for a workspace. The workspace is the place where you store your Java projects. Select an empty directory and press OK. Figure 3.1 shows the workspace launcher.
3.4.2 Creating New Project
Select from the menu File then New then Project. Enter Digital Reader as the project name. Select the option "Create separate folders for sources and class files". Figure 3.2 shows the creation of a new project.
  • 24. Digital Reader Chapter 3
Figure 3.1: Workspace launcher
Figure 3.2: Creating New Project
Press Finish to create the project. A new project is created and displayed as a folder. Open the Digital Reader folder and explore the content of this folder.
  • 25. Digital Reader Chapter 3
3.4.3 Create package
Create a new package. A good convention is to use the same name for the top package and the project. In this project we create the package controller. Select the folder src, right click on it and select New then Package. Figure 3.3 shows creating the package.
Figure 3.3: Creating Package
3.4.4 Create Java class
We will now create a Java class. Right click on your package and select New then Class. Enter a class name and click Finish. If you create a class without a package, Eclipse may warn that "The use of the default package is discouraged." Packages in Java provide a convenient means of organizing your classes, but for just getting this example working we can skip thinking about packages. Figure 3.4 shows creating a Java class.
  • 26. Digital Reader Chapter 3
Figure 3.4: Create Java class
3.4.5 Run your project in Eclipse
Now run your code. Right click on your Java class and select Run As then Java Application. Figure 3.5 shows this. You can see the output in the console.
Figure 3.5: Run Java Project
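As a minimal illustration of the class created in 3.4.4 and run in 3.4.5, the sketch below prints one line to the Eclipse console. The class name HelloReader and the method greeting() are hypothetical and are not part of the project sources.

```java
/** Hypothetical starter class; not taken from the Digital Reader sources. */
class HelloReader {

    /** Returns the text that main() prints to the console. */
    static String greeting() {
        return "Digital Reader is running";
    }

    public static void main(String[] args) {
        // With Run As > Java Application, this line appears in the Console view.
        System.out.println(greeting());
    }
}
```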
  • 27. Digital Reader Chapter 3
3.4.6 Prepare to run program outside Eclipse (create JAR file)
To run your Java program outside of Eclipse you need to export it as a JAR file. A JAR file is the standard distribution format for Java applications. Select your project, right click on it and select Export. Figure 3.6 shows the creation of the JAR file.
Figure 3.6: Creating the JAR file
Select JAR file, select Next, then select your project and enter the export destination and a name for the JAR file. We named it Digital Reader.jar. Press Finish. This creates a JAR file in your selected output directory.
3.4.7 Adding a library (.jar) to your project
The following describes how to add Java libraries to your project. Java libraries are distributed via JAR files. It assumes that you have a JAR file available; if not, feel free to skip this step. Create a new Java project Digital Reader. Then create a new folder called lib by right clicking on your project and selecting New then Folder. From the menu select File then Import then General then File System. Select your
  • 28. Digital Reader Chapter 3
JAR file and select the folder lib as target. Alternatively, just copy and paste your JAR file into the lib folder. Right click on your project and select Properties. Under Java Build Path then Libraries select the button "Add JARs". The following example shows how the result would look if junit-4.4.jar had been added to the project. Figure 3.7 shows the adding of a library to the project.
Figure 3.7: Adding a library (.jar) to project
Afterwards you can use the classes contained in the JAR file in your Java source code.
3.5 Java Runtime Environment (JRE)
The Java Runtime Environment (JRE) provides the libraries, the Java Virtual Machine, and other components to run applets and applications written in the Java programming language. In addition, two key deployment technologies are part of the JRE: Java Plug-in, which enables applets to run in popular browsers; and Java Web Start, which deploys standalone applications over a network. The JRE includes the JVM: it provides some standard libraries together with the JVM, which can be used
  • 29. Digital Reader Chapter 3
to execute a Java program.
A Java virtual machine (JVM) is a virtual machine that can execute Java bytecode. It is the code execution component of the Java software platform. A Java virtual machine is software that is implemented on virtual and non-virtual hardware and on standard operating systems. A JVM provides an environment in which Java bytecode can be executed, enabling such features as automated exception handling, which provides root-cause debugging information for every software error (exception), independent of the source code.
A JVM is distributed along with a set of standard class libraries that implement the Java application programming interface (API). Appropriate APIs bundled together with the JVM form the Java Runtime Environment (JRE). JVMs are available for many hardware and software platforms. Thus, the JVM is a crucial component of the Java platform.
The Java Development Kit (JDK) is an Oracle Corporation product aimed at Java developers. Since the introduction of Java, it has been by far the most widely used Java Software Development Kit (SDK). The JDK has as its primary components a collection of programming tools, including:
• java: The loader for Java applications. This tool is an interpreter and can interpret the class files generated by the javac compiler.
• javac: The compiler, which converts source code into Java bytecode.
• appletviewer: This tool can be used to run and debug Java applets without a web browser.
The JDK also comes with a complete Java Runtime Environment, usually called a private runtime because it is separated from the regular JRE and has extra contents. It consists of a Java Virtual Machine and all of the class libraries present in the production environment, as well as additional libraries only useful to developers, such as the internationalization libraries and the IDL libraries.
The Java Development Kit (JDK) also includes the JVM, standard class libraries, and several other tools that a developer needs in order to create a Java program. The JRE contains the runtime environment, such as the JVM and other Java classes (AWT, Swing), but does not contain any development tools such as a compiler or a debugger. Both the JDK and the JRE contain the JVM.
3.6 Hardware Requirements
• Processor: Pentium III onwards
  • 30. Digital Reader Chapter 3
• RAM: 128 MB
• Hard disk space: 5 GB
• Best viewed at 1024x768 screen resolution
3.7 Software Requirements
• Operating System: Windows/Linux/Mac
• Runtime Environment: JRE or JDK
  • 31. Chapter 4 Design and Implementation
4.1 Model-View-Controller (MVC)
The model-view-controller (MVC) design is commonly used with Swing, whose components are themselves built around a separation of model and view. Since we implemented this project in Swing, we chose MVC for the design of our user interface. MVC was first introduced by Trygve Reenskaug, a Smalltalk developer at the Xerox Palo Alto Research Center, in 1979, and helps to decouple data access and business logic from the manner in which they are displayed to the user. More precisely, MVC can be broken down into three elements:
• Model: The model represents data and the rules that govern access to and updates of this data. In enterprise software, a model often serves as a software approximation of a real-world process.
• View: The view renders the contents of a model. It specifies exactly how the model data should be presented. If the model data changes, the view must update its presentation as needed. This can be achieved by using a push model, in which the view registers itself with the model for change notifications, or a pull model, in which the view is responsible for calling the model when it needs to retrieve the most current data.
• Controller: The controller translates the user's interactions with the view into actions that the model will perform. In a stand-alone GUI client, user interactions could be button clicks or menu selections, whereas in an enterprise web application they appear as GET and POST HTTP requests. Depending on the context, a controller may also select a new view (for example, a web page of results) to present back to the user.
Figure 4.1 shows the MVC architecture.
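The three roles above can be sketched in a few lines of plain Java. This is only an illustration of the push model under assumed names: PageModel, PageView and PageController are hypothetical and do not come from the project sources, and the view here builds a status string instead of painting a Swing component.

```java
import java.util.ArrayList;
import java.util.List;

/** Model: holds the data and notifies registered listeners (push model). */
class PageModel {
    interface Listener {
        void pageChanged(int newPage);
    }

    private final List<Listener> listeners = new ArrayList<Listener>();
    private final int pageCount;
    private int currentPage = 1;

    PageModel(int pageCount) {
        this.pageCount = pageCount;
    }

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    int getCurrentPage() {
        return currentPage;
    }

    /** Rule enforced by the model: the page stays within 1..pageCount. */
    void setCurrentPage(int page) {
        int clamped = Math.max(1, Math.min(pageCount, page));
        if (clamped != currentPage) {
            currentPage = clamped;
            for (Listener listener : listeners) {
                listener.pageChanged(currentPage);
            }
        }
    }
}

/** View: renders model state; here it only keeps a status string. */
class PageView implements PageModel.Listener {
    private String statusText = "Page 1";

    @Override
    public void pageChanged(int newPage) {
        statusText = "Page " + newPage;
    }

    String getStatusText() {
        return statusText;
    }
}

/** Controller: translates user gestures into updates of the model. */
class PageController {
    private final PageModel model;

    PageController(PageModel model) {
        this.model = model;
    }

    void nextClicked() {
        model.setCurrentPage(model.getCurrentPage() + 1);
    }

    void previousClicked() {
        model.setCurrentPage(model.getCurrentPage() - 1);
    }
}
```

In a Swing client the controller methods would typically be called from ActionListeners attached to buttons or menu items, and the view would repaint a component when notified.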
  • 32. Digital Reader Chapter 4
Figure 4.1: MVC Architecture
4.2 Developing an HTML Viewer
The HTML Viewer is developed using JTextPane and HTMLEditorKit to display and edit HTML documents. The following features are included:
• Creating a new HTML document
• Opening an existing HTML document
• Saving changes
• Saving the document under a new name/location
• Prompting the user to save changes before loading a new document or exiting the application
Class HtmlViewer
This class extends JFrame to provide the supporting frame for this example. Several instance variables are declared:
• JTextPane m_editor: main text component.
  • 33. Digital Reader Chapter 4
• StyleContext m_context: a group of styles and their associated resources for the documents in this example.
• HTMLDocument m_doc: current document model.
• HTMLEditorKit m_kit: editor kit that knows how to read/write HTML documents.
• SimpleFilter m_HTMLFilter: file filter for .HTML files.
• JToolBar m_toolBar: toolbar containing New, Open, Save buttons.
• JFileChooser m_chooser: file chooser used to load and save HTML files.
• File m_currentFile: currently opened HTML file (if any).
• boolean m_textChanged: keeps track of whether any changes have been made since the document was last saved.
The HtmlViewer constructor first instantiates our JTextPane and HTMLEditorKit, and assigns the editor kit to the text pane (it is important that this is done before any documents are created). The editor component is then placed in a JScrollPane which is placed in the center of the frame. The JFileChooser component is created and an instance of our SimpleFilter class is used as a filter to only allow the choice of HTML documents. A WindowListener is added to call our custom promptToSave() method to ensure that the user has the opportunity to save any changes before closing the application. This WindowListener also ensures that our editor component automatically receives the focus when this application regains the focus.
The createMenuBar() method creates a menu bar with a single menu titled File and a tool bar with three buttons. Actions for New, Open, Save, Save As and Exit are created and added to the File menu. The New, Open and Save actions are also added to the tool bar. An important point is that we use InputStreams and OutputStreams rather than Readers and Writers to load and save documents. The reason is that an HTML file is stored as bytes in a particular character encoding, whereas Readers and Writers work with 16-bit Java characters that have already been decoded.
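The stream-based loading and saving described above can be sketched as follows. This is a minimal sketch, not the project's code: the class HtmlStreamSketch and its methods are hypothetical names, the calls used are the standard HTMLEditorKit read(InputStream, Document, int) and write(OutputStream, Document, int, int) methods, and the example assumes the HTML contains no charset directive.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

/** Hypothetical helper showing stream-based HTML loading and saving. */
class HtmlStreamSketch {

    /** Reads HTML bytes from a stream into a new HTMLDocument. */
    static HTMLDocument load(InputStream in) {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        try {
            kit.read(in, doc, 0);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not read HTML", e);
        }
        return doc;
    }

    /** Writes the whole document as HTML bytes. */
    static byte[] save(HTMLDocument doc) {
        HTMLEditorKit kit = new HTMLEditorKit();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            kit.write(out, doc, 0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("Could not write HTML", e);
        }
        return out.toByteArray();
    }

    /** Returns the document text without markup. */
    static String plainText(HTMLDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the viewer itself the InputStream would come from the file chosen in the JFileChooser, and the loaded document would be installed in the text pane with setDocument().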
The getDocumentName() method simply returns the name of the file corresponding to the current document, or "untitled" if it hasn't been saved to disk. The newDocument() method is responsible for creating a new HTMLDocument instance using HTMLEditorKit's createDefaultDocument() method. Once the document is created, our StyleContext variable, m_context, is assigned this new document's stylesheet, obtained with HTMLDocument's getStyleSheet() method. The title of the frame is then updated and a Runnable instance is created and sent to the SwingUtilities.invokeLater() method to scroll the document to the beginning when it is finished loading. Finally, an instance of our custom UpdateListener
  • 34. Digital Reader Chapter 4
class (described below) is added as a DocumentListener, and the m_textChanged variable is set to false to indicate that no changes to the document have been made yet.
The openDocument() method is similar to the newDocument() method but uses the JFileChooser to allow selection of an existing HTML file to load, and uses an InputStream object to read the contents of that file. The saveFile() method takes a boolean parameter specifying whether the method should act as a Save As process or just a regular Save. If true, indicating a Save As process, the JFileChooser is displayed to allow the user to specify the file and location to save the document to. An OutputStream is used to write the contents of the document to the destination File.
The promptToSave() method checks the m_textChanged flag and, if true, displays a JOptionPane asking whether or not the current document should be saved. This method is called before a new document is created, a document is opened or the application is closed, to ensure that the user has a chance to save any changes to the current document before losing them.
The showError() method is used to display error messages in a JOptionPane. It is often useful to display exceptions to users so that they know an error happened and so that they may eventually report errors back to you if they are in fact bugs.
Class UpdateListener
This DocumentListener implementation is used to modify the state of our m_textChanged variable. Whenever an insertion, removal or document change is made, this variable is set to true. This allows HtmlViewer's promptToSave() method to ensure the user has the option of saving any changes before loading a new document or exiting the application. Figure 4.2 shows a snapshot of the HTML Viewer.
4.3 Developing an XML Viewer
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
It is defined in the XML 1.0 Specification produced by the W3C and in several other related specifications, all of them free open standards. The design goals of XML emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for
  • 35. Digital Reader Chapter 4
Figure 4.2: HTML Viewer
example in web services. Many application programming interfaces (APIs) have been developed for software developers to use to process XML data, and several schema systems exist to aid in the definition of XML-based languages. As of 2009, hundreds of XML-based languages have been developed, including RSS, Atom, SOAP, and XHTML. XML-based formats have become the default for many office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org and LibreOffice (OpenDocument), and Apple's iWork. XML has also been employed as the base language for communication protocols, such as XMPP.
This section shows how to display an XML document using a JTree and Java's built-in XML support. Because XML documents are hierarchical in nature, JTree is a natural fit for the task.
Class XmlViewer
Two packages are required to implement this viewer:
• javax.xml.parsers: This package consists of classes used to process XML documents.
• org.w3c.dom: This package consists of a set of interfaces that define the DOM (Document Object Model), which is an API that allows dynamic access to the structure and data of XML documents. Examples include Document, Element, Node, Attr, etc.
  • 36. Digital Reader Chapter 4
XmlViewer extends JFrame and represents the main application frame for this example. Five instance variables are defined:
• Document m_doc: the current XML document (note that this is an instance of org.w3c.dom.Document, not a text document [javax.swing.text.Document]).
• JTree m_tree: the tree component used to display the current XML file.
• DefaultTreeModel m_model: tree model constructed to mimic the XML file.
• JFileChooser m_chooser: file chooser used for opening and saving XML files.
• File m_currentFile: reference to the current XML file.
The XmlViewer constructor creates and installs a toolbar with our createToolbar() method, instantiates tree model m_model with a top node containing "No XML loaded" as user data, and instantiates tree m_tree with m_model. Selection is set to SINGLE_TREE_SELECTION and the tree is set to noneditable. A custom tree cell renderer is created to display an appropriate icon based on whether a node represents an XML document element or not. This renderer is assigned to our tree, and the tree is then placed in a scroll pane which is added to the center of the frame. File chooser m_chooser is instantiated and an XML file filter is applied to it so that only XML files will be displayed.
The createToolbar() method creates a JToolBar with an Open button that invokes our openDocument() method. The getDocumentName() method retrieves the name of the current file referenced by our m_currentFile variable.
The openDocument() method shows our JFileChooser in a separate thread to allow selection of an XML file to open for viewing. The static newInstance() method of javax.xml.parsers.DocumentBuilderFactory is used to create a new instance of DocumentBuilderFactory, which is used to create an instance of DocumentBuilder. DocumentBuilder is used to parse the selected XML file with its parse() method, storing the resulting Document instance in our m_doc variable.
The root element of this document is retrieved with Document's getDocumentElement() method. This element is used as the root node for our tree model, and our custom createTreeNode() method creates the tree node hierarchy corresponding to the node passed to it. The resulting tree node is then set as the root node of our tree, and our custom expandTree() method is used to expand all nodes to display the entire XML document.
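The parsing and tree-building steps of openDocument() can be sketched as below. This is an assumption-laden sketch rather than the project's code: the class XmlTreeSketch and its parse() helper are hypothetical, plain DefaultMutableTreeNode objects stand in for the custom XmlViewerNode class, and the XML is read from a string instead of a file chosen by the user.

```java
import java.io.ByteArrayInputStream;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch of turning a DOM tree into JTree nodes. */
class XmlTreeSketch {

    /** Only element and text nodes are shown in the tree. */
    static boolean canDisplayNode(Node node) {
        short type = node.getNodeType();
        return type == Node.ELEMENT_NODE || type == Node.TEXT_NODE;
    }

    /** Recursively builds a tree node for a DOM node and its children. */
    static DefaultMutableTreeNode createTreeNode(Node root) {
        if (!canDisplayNode(root)) {
            return null;
        }
        // Elements are labelled by tag name, text nodes by their content.
        String label = root.getNodeType() == Node.ELEMENT_NODE
                ? root.getNodeName()
                : root.getNodeValue();
        DefaultMutableTreeNode treeNode = new DefaultMutableTreeNode(label);
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            DefaultMutableTreeNode child = createTreeNode(children.item(i));
            if (child != null) {
                treeNode.add(child);
            }
        }
        return treeNode;
    }

    /** Parses XML text and returns the tree node for its root element. */
    static DefaultMutableTreeNode parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return createTreeNode(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

The returned node could then be passed to DefaultTreeModel.setRoot() so that the JTree displays the document.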
  • 37. Digital Reader Chapter 4
The createTreeNode() method takes a root node (an instance of org.w3c.dom.Node) as parameter (note that org.w3c.dom.Element is a subinterface of Node). Our canDisplayNode() method is used to find out whether the root node is either an element or a text node. If so, an instance of our custom XmlViewerNode is created to represent that node. Then a NodeList is created representing all child nodes of the root node. For each child node a tree node is created by passing it recursively to createTreeNode().
The canDisplayNode() method checks whether a given Node is of type ELEMENT_NODE or TEXT_NODE. If it isn't one of these types it should not be displayed in our tree. The showError() method is used to display exceptions in a JOptionPane dialog. The static expandTree() methods are responsible for expanding each parent node so that the entire tree is expanded and visible in the viewer.
Class XmlViewerNode
This class extends DefaultMutableTreeNode to represent the XML nodes in our viewer. The main customization is in the overridden toString() implementation, which returns a textual representation of the node depending on whether it is of type ELEMENT_NODE or TEXT_NODE. Figure 4.3 shows a snapshot of the XML Viewer.
4.4 Developing a Text Viewer
Class BasicTextEditor
This class extends JFrame and provides the parent frame for our example. Two class variables are declared:
• String APP_NAME: name of this example used in the title bar.
• String FONTS[]: an array of font family names.
Instance variables:
• Font[] m_fonts: an array of Font instances which can be used to render our JTextArea editor.
• JTextArea m_editor: used as our text editor.
• JMenuItem[] m_fontMenus: an array of menu items representing available fonts.
• JCheckBoxMenuItem m_bold: menu item which sets/unsets the bold property of the current font.
  • 38. Digital Reader Chapter 4
Figure 4.3: XML Viewer
• JCheckBoxMenuItem m_italic: menu item which sets/unsets the italic property of the current font.
• JFileChooser m_chooser: used to load and save simple text files.
• File m_currentFile: the current File instance corresponding to the current document.
• boolean m_textChanged: set to true if the current document has been changed, and set to false if the document was just opened or saved. This flag is used in combination with a DocumentListener to determine whether or not to save the current document before dismissing it.
The BasicTextEditor constructor populates our m_fonts array with Font instances corresponding to the names provided in FONTS[]. The m_editor JTextArea is then created and placed in a JScrollPane. This scroll pane is added to the center of our frame's content pane and we append some simple text to m_editor for display at startup. Our
  • 39. Digital Reader Chapter 4
createMenuBar() method is called to create the menu bar to manage this application, and this menu bar is then added to our frame using the setJMenuBar() method.
The createMenuBar() method creates and returns a JMenuBar. Each menu item receives an ActionListener to handle its selection. Two menus are added, titled File and Font. The File menu is assigned a mnemonic character, f, and by pressing ALT+F while the application frame is active, its popup will be displayed, allowing navigation with either the mouse or keyboard. The Font menu is assigned the mnemonic character o.
The New menu item in the File menu is responsible for creating a new (empty) document. It doesn't really replace JTextArea's Document. Instead it simply clears the contents of our editor component. Before it does so, however, it calls our custom promptToSave() method to determine whether we want to continue without saving the current changes (if any) or not. Note that an icon is used for this menu item. Also note that this menu item can be selected with the keyboard by pressing n when the File menu's popup is visible, because we assigned it n as a mnemonic. We also assigned it the accelerator CTRL+N. Therefore, this menu item's action will be directly invoked whenever that key combination is pressed. (All other menus and menu items in this example also receive appropriate mnemonics and accelerators.)
The Open menu item brings up our m_chooser JFileChooser component to allow selection of a text file to open. Once a text file is selected, we open a FileReader on it and invoke read() on our JTextArea component to read the file's content (which creates a new PlainDocument containing the selected file's content to replace the current JTextArea document).
The Save menu item brings up m_chooser to select a destination and file name to save the current text to (if not previously set).
Once a text file is selected, we open a FileWriter on it and invoke write() on our JTextArea component to write its content to the destination file. The Save As menu item is similar to the Save menu item, but always prompts the user to select a new file. The Exit menu item terminates program execution. It is separated from the other menu items by a menu separator to create a more logical display.
The Font menu consists of several menu items used to select the font and font style used in our editor. All of these items receive the same ActionListener, which invokes our updateEditor() method. To give the user an idea of how each font looks, each font is used to render the corresponding menu item text. Since only one font can be selected at any given time, we use JRadioButtonMenuItems for these menu items, and add them all to a ButtonGroup instance which manages a single selection. To create each menu item we iterate through our FONTS array and create a JRadioButtonMenuItem corresponding to each entry. Each item is set to unselected (except for the first one), assigned
  • 40. Digital Reader Chapter 4
a numerical mnemonic corresponding to the current FONTS array index, assigned the appropriate Font instance for rendering its text, assigned our multi-purpose ActionListener, and added to our ButtonGroup along with the others.
The two other menu items in the Font menu manage the bold and italic font properties. They are implemented as JCheckBoxMenuItems since these properties can be selected or unselected independently. These items are also assigned the same ActionListener as the radio button items to process changes in their selected state.
The updateEditor() method updates the current font used to render the editing component by checking the state of each check box item and determining which radio button item is currently selected. The m_bold and m_italic components are disabled and unselected if the Courier font is selected, and enabled otherwise. The appropriate m_fonts array element is selected and a Font instance is derived from it corresponding to the current state of the check box items using Font's deriveFont() method. Figure 4.4 shows a snapshot of the Text Viewer.
4.5 Developing an RTF Viewer
Using Styles to manage a set of attributes as a single named entity can greatly simplify text editing. The user only has to apply a known style to a selected region of text rather than selecting all appropriate text attributes from the provided toolbar components. By adding a combo box allowing the choice of styles, we can not only save the user time and effort, but also provide more uniform text formatting throughout the resulting document (or potentially set of documents). In this section we add style management to our word processor. We also show how it is possible to create a new style, modify an existing style, or reapply a style to modified text.
Class WordProcessor
One new instance variable has been added:
• JComboBox m_cbStyles: toolbar component to manage styles.
Note that a new custom method showStyles() (see below) is now called after creating a new document or after loading an existing one.

The createMenuBar() method creates a new menu with two new menu items for updating and reapplying styles, and a new combo box for style selection. The editable styles combobox, m_cbStyles, will hold a list of styles declared in the current document.
  • 41. Figure 4.4: Text Viewer and editor

It receives an ActionListener which checks whether the currently selected style name is present among the existing styles. If not, we add it to the drop-down list and retrieve a new Style instance for the selected name using StyledDocument's addStyle() method. This new Style instance is associated with the text attributes of the character element at the current caret position. Otherwise, if the given style name is already known, we retrieve the selected style using StyledDocument's getStyle() method and apply it to the selected text by passing it to our custom setAttributeSet() method.

An ambiguous situation occurs when the user selects a style for text which already has the same style, but whose attributes have been modified. The user may either want to update the selected style using the selected text as a base, or reapply the existing style to the selected text. To resolve this situation we need to ask the user what to do. We chose to add two menu items which allow the user to either update or reapply the current selection.
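The style-selection logic described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the WordProcessor code itself: the class name StyleSelectionSketch and the helpers findOrCreateStyle() and applyStyle() are hypothetical, and applyStyle() stands in for the custom setAttributeSet() method, whose body is not shown in this report. Only the Swing calls (getStyle(), addStyle(), setCharacterAttributes()) are standard API.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.Style;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

// Hypothetical helper class; the names are illustrative only.
final class StyleSelectionSketch {

    /**
     * Returns the style with the given name. If the name is unknown, a new
     * style is created and the supplied character attributes (for example,
     * those found at the caret position) are copied into it.
     */
    static Style findOrCreateStyle(StyledDocument doc, String name,
                                   AttributeSet caretAttributes) {
        Style style = doc.getStyle(name);
        if (style == null) {
            style = doc.addStyle(name, null); // null: no parent style
            style.addAttributes(caretAttributes);
        }
        return style;
    }

    /** Applies the style's attributes to the range [offset, offset + length). */
    static void applyStyle(StyledDocument doc, Style style, int offset, int length) {
        // false: merge with the existing attributes instead of replacing them
        doc.setCharacterAttributes(offset, length, style, false);
    }

    public static void main(String[] args) throws Exception {
        StyledDocument doc = new DefaultStyledDocument();
        doc.insertString(0, "Heading and body text", null);

        // Attributes that would normally come from the caret position.
        SimpleAttributeSet attrs = new SimpleAttributeSet();
        StyleConstants.setBold(attrs, true);
        StyleConstants.setFontSize(attrs, 18);

        Style heading = findOrCreateStyle(doc, "Heading", attrs);
        applyStyle(doc, heading, 0, 7);

        AttributeSet applied = doc.getCharacterElement(3).getAttributes();
        System.out.println("bold: " + StyleConstants.isBold(applied));
        System.out.println("size: " + StyleConstants.getFontSize(applied));
    }
}
```

Because the lookup and the creation live in one helper, the combo box listener only has to decide whether the returned style is new (add its name to the list) or already known (apply it to the selection).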
  • 42. The menu items to perform these tasks are titled Update and Reapply, and are grouped into the Style menu. The Style menu is added to the Format menu. The Update menu item receives an ActionListener which retrieves the text attributes of the character element at the current caret position, and assigns them to the selected style. The Reapply menu item receives an ActionListener which applies the selected style to the selected text (one might argue that this menu item would be more appropriately titled Apply; the implications are ambiguous either way).

Our showAttributes() method receives additional code to manage the new styles combobox, m_cbStyles, when the caret moves through the document. It retrieves the style corresponding to the current caret position with StyledDocument's getLogicalStyle() method, and selects the appropriate entry in the combobox.

The new showStyles() method is called to populate the m_cbStyles combobox with the style names from a newly created or loaded document. First it removes the current content of the combobox if it is not empty (another workaround, needed because calling removeAllItems() on an empty JComboBox throws an exception). An Enumeration of style names is then retrieved with the document's getStyleNames() method (declared on DefaultStyledDocument), and these names are added to the combobox.

Open an existing RTF file, and note how the styles combobox is populated by the style names defined in this document. Verify that the selected style is automatically updated while the caret moves through the document. Select a portion of text and select a different style from the styles combobox. Note how all text properties are updated according to the new style. Try selecting a portion of text and modifying its attributes (for instance, foreground color). Type a new name in the styles combobox and press Enter. This will create a new style which can be applied to any other document text.
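The way showStyles() fills the combobox can be illustrated with the short sketch below. It is an assumption-laden illustration rather than the original method: the class StyleNamesSketch and the helper styleNames() are hypothetical, the names are sorted only to give a stable order, and a DefaultComboBoxModel is used in place of the m_cbStyles component so that the example runs without a window.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import javax.swing.DefaultComboBoxModel;
import javax.swing.text.DefaultStyledDocument;

// Hypothetical helper class; the names are illustrative only.
final class StyleNamesSketch {

    /** Collects the style names declared in the document, sorted for a stable order. */
    static List<String> styleNames(DefaultStyledDocument doc) {
        List<String> names = new ArrayList<>();
        Enumeration<?> en = doc.getStyleNames();
        while (en.hasMoreElements()) {
            names.add(en.nextElement().toString());
        }
        Collections.sort(names);
        return names;
    }

    /** Replaces the content of the combo box model with the document's style names. */
    static void showStyles(DefaultComboBoxModel<String> model, DefaultStyledDocument doc) {
        model.removeAllElements();
        for (String name : styleNames(doc)) {
            model.addElement(name);
        }
    }

    public static void main(String[] args) {
        DefaultStyledDocument doc = new DefaultStyledDocument();
        doc.addStyle("Heading", null);
        doc.addStyle("Quote", null);

        DefaultComboBoxModel<String> model = new DefaultComboBoxModel<>();
        showStyles(model, doc);
        for (int i = 0; i < model.getSize(); i++) {
            System.out.println(model.getElementAt(i));
        }
    }
}
```

In the application, the same model could be handed to the styles combobox, so that reloading a document only requires one call to showStyles().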
Figure 4.5 shows a snapshot of the RTF Viewer.

4.6 PDF Viewer

Portable Document Format (PDF) is a file format used to represent documents in a manner independent of application software, hardware, and operating systems. Each PDF file encapsulates a complete description of a fixed-layout flat document, including the
  • 43. Figure 4.5: RTF Viewer

text, fonts, graphics, and other information needed to display it. In 1991, Adobe Systems co-founder John Warnock outlined a system called "Camelot" that evolved into PDF.

Our PDF Viewer is built on JPedal, a Java PDF library whose viewer is powerful and fully featured, yet simple to set up, integrate and customise. JPedal offers what is needed from a multi-platform Java PDF viewer, including PDF search and extraction, and it also offers full PDF printing support. According to its developers, PDF viewing is neither an add-on nor an afterthought in the JPedal library: the viewer is its central feature and has been under development for over 10 years.

Class PDFViewer

The PDF viewer is constructed by passing the full file name to JPanelDemo(String name). FontMappings.setFontReplacements() ensures that non-embedded fonts map to sensible replacements. The file name is stored for use by the page changer, after which the viewer opens the PDF and reads its internal details. We then check whether encryption is present and ascertain the password, returning true if the
  • 44. content is accessible, using the boolean checkEncryption() method. This method checks whether the file is encrypted. If the file has a null password, it will already have been decoded and isFileViewable will return true. If a password is needed, a popup window asks for it and we try to reopen the file with the new password. The viewer can also be run as a standalone program, in which case the user may pass in the name of the file as an option. Figure 4.6 shows a snapshot of the PDF Viewer.

Figure 4.6: PDF Viewer

4.7 Text Search

The Text Search utility is built using a Java regular-expression library called Swinggrep.
  • 45. Functionality
1. Showing each file path only once, giving more space for the lines found.
2. Searching all subdirectories recursively (this feature can be disabled with a checkbox).
3. Highlighting every match in a line, not just the first one.
4. Better colors.
5. Statistics about matches.
6. Use of Swing rather than AWT for graphics.
7. The ability to save results.

Class GrepFrame

First we create a main class that provides a platform for input actions through JButton, JFileChooser and JTextArea components. It is a frame that searches the contents of a set of files with pathnames based on the current directory (as per 'grep') and displays the results.

    Regex SearchPattern;
The pattern we are searching for.

    GrepFrame MainFrame;
The main window in which the search action is performed.

    String FilePathnameString = new String();
The pathname of the file currently being used, given as the absolute path of the file.

    GrepDirectoryWalker GrepSubDirWalker = new GrepDirectoryWalker();
The walker that performs the search on each directory.

The document into which the results of our search are placed is declared next. Note that it must be declared after GrepSubDirWalker, because it depends on the latter.

    JTextPane ResultsTextPane = new JTextPane(ResultsDocument);
The results display area itself.

Container buildFirstLine(): constructs a container for the main components of the first line of the screen and returns a new container of all the main first-line components.

Container buildSecondLine(): constructs a container for the main components of the second line of the screen and returns a new container of all the main second-line components.

setResultsFont(): sets the font used for displaying the results. The size of the font is determined from that of the current Metal user text font.

createDefaultSaveFileName(): constructs a file name of the form 'grepyyyymmddpattern.txt' and returns it as the default file name into which to save the current results.
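Two of the ideas above, finding every match in a line (item 3 of the functionality list) and building the default save file name, can be sketched as follows. This is an illustrative sketch, not the Swinggrep implementation: it uses the standard java.util.regex package instead of the Regex class named above, the class GrepSketch and its methods are hypothetical, and stripping non-alphanumeric characters from the pattern before using it in a file name is an assumption made here to keep the name valid.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class GrepSketch {

    /** Returns the [start, end) offsets of every match of the pattern in one line. */
    static List<int[]> matchSpans(Pattern pattern, String line) {
        List<int[]> spans = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) { // keep going after the first match
            spans.add(new int[] {m.start(), m.end()});
        }
        return spans;
    }

    /**
     * Builds a name of the form grepyyyymmddpattern.txt.
     * Assumption: characters other than letters and digits are dropped
     * from the pattern so that the result is a safe file name.
     */
    static String defaultSaveFileName(LocalDate date, String pattern) {
        String safe = pattern.replaceAll("[^A-Za-z0-9]", "");
        return "grep" + date.format(DateTimeFormatter.BASIC_ISO_DATE) + safe + ".txt";
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("ab");
        for (int[] span : matchSpans(p, "ab cab ab")) {
            System.out.println(span[0] + ".." + span[1]);
        }
        System.out.println(defaultSaveFileName(LocalDate.now(), "ab"));
    }
}
```

The offsets returned by matchSpans() are exactly what a highlighter needs: each pair can be passed to the results pane to colour one match, so every occurrence on a line is marked rather than just the first.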
  • 46. showExceptionDialog(Exception InputException): pops up a dialog to display the given exception. The parameter InputException is the exception whose message is to be displayed.

String getResultsString(): gets the results from the last search and returns them as a string.

Finally, we create a class GrepFrameListener to receive frame events, which gives us the opportunity to set the focus to the right field. Figure 4.7 shows a snapshot of Text Search.

Figure 4.7: Text search

4.8 File Search

The File Search utility searches only for the files that the user has requested, and it also provides the absolute path of each file.
  • 47. Class FileSearch

A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.

First we create a main class that provides a platform for input actions through JButton, JFileChooser and JTextArea components. We then set the main path under which the file is to be searched. Figure 4.8 shows a snapshot of File Search.

Figure 4.8: File search
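A file search of this kind can be sketched with the standard java.nio.file API. The report does not list the FileSearch code, so the following is only a plausible outline under assumptions: the class FileSearchSketch and the method search() are hypothetical names, and matching is done by a case-insensitive substring test on the file name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; the names are illustrative only.
final class FileSearchSketch {

    /**
     * Walks the directory tree below root and returns the absolute paths of
     * all regular files whose name contains the given fragment
     * (case-insensitive).
     */
    static List<Path> search(Path root, String fragment) {
        String needle = fragment.toLowerCase();
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().toLowerCase().contains(needle))
                    .map(Path::toAbsolutePath)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Usage: java FileSearchSketch <main path> <name fragment>
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String fragment = args.length > 1 ? args[1] : "";
        for (Path hit : search(root, fragment)) {
            System.out.println(hit);
        }
    }
}
```

Returning absolute paths matches the behaviour described for the utility, and the resulting list of hits can be shown directly in a text area.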
  • 48. 4.9 Image Viewer

Class ImageViewer

An image viewer or image browser is a computer program that can display stored graphical images; it can often handle various graphics file formats. Such software usually renders the image according to properties of the display such as color depth, display resolution, and color profile.

    Image image;
    boolean scaled;
    private JTextArea log;
These are the variables which are necessary to display images.

    filter = new ExtensionFileFilter(
        new String[] {"gif", "GIF", "jpg", "JPG", "jpeg", "JPEG"},
        "GIF and JPG image files");
    chooser.addChoosableFileFilter(filter);
    chooser.removeChoosableFileFilter(chooser.getAcceptAllFileFilter());
These statements restrict the file chooser to GIF and JPG image files by adding the image filter and removing the default "accept all" filter.

Figure 4.9 shows a snapshot of the Image Viewer.

Figure 4.9: Image Viewer
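ExtensionFileFilter in the listing above is a custom class whose source is not reproduced in this report. As a hedged alternative, the same restriction can be sketched with the standard javax.swing.filechooser.FileNameExtensionFilter, which compares extensions case-insensitively, so the upper-case variants need not be listed. The class ImageFilterSketch and its methods are hypothetical names used only for this illustration.

```java
import java.io.File;
import javax.swing.JFileChooser;
import javax.swing.filechooser.FileNameExtensionFilter;

// Hypothetical helper class; the names are illustrative only.
final class ImageFilterSketch {

    /** Builds a filter that accepts GIF and JPG files (and directories). */
    static FileNameExtensionFilter imageFilter() {
        return new FileNameExtensionFilter("GIF and JPG image files", "gif", "jpg", "jpeg");
    }

    /** Installs the image filter and hides the default "All Files" filter. */
    static void configure(JFileChooser chooser) {
        chooser.setFileFilter(imageFilter());
        chooser.setAcceptAllFileFilterUsed(false);
    }

    public static void main(String[] args) {
        FileNameExtensionFilter filter = imageFilter();
        String[] samples = {"photo.JPG", "logo.gif", "notes.txt"};
        for (String name : samples) {
            System.out.println(name + " accepted: " + filter.accept(new File(name)));
        }
    }
}
```

In the viewer, configure() would be called once on the JFileChooser before it is shown, which has the same effect as adding the custom filter and removing the accept-all filter.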
  • 49. Chapter 5 Results and analysis

The Digital Reader supports multiple document formats under a single platform and works in an architecture-independent manner, because the language we used to develop the application is Java. In order to analyse the application, it has to undergo various kinds of testing and analysis.

5.1 Testing

Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include, but are not limited to, the process of executing a program or application with the intent of finding software bugs (errors or other defects).

Software testing can be stated as the process of validating and verifying that a software program/application/product:
1. meets the requirements that guided its design and development.
2. works as expected.
3. can be implemented with the same characteristics.
4. satisfies the needs of stakeholders.

Software testing, depending on the testing method employed, can be implemented at any time in the development process. However, most of the test effort traditionally occurs after the requirements have been defined and the coding process has been completed. In Agile approaches, conversely, most of the test effort is ongoing. As such, the methodology of the test is governed by the software development methodology adopted.
  • 50. Different software development models will focus the test effort at different points in the development process. Newer development models, such as Agile, often employ test-driven development and place an increased portion of the testing in the hands of the developer, before it reaches a formal team of testers. In a more traditional model, most of the test execution occurs after the requirements have been defined and the coding process has been completed.

5.2 Types of Testing
• Black box testing
• White box testing
• Unit testing
• Integration testing
• Functional testing
• System testing
• Regression testing
• Acceptance testing
• Load testing
• Alpha testing
• Beta testing

5.3 Testing with JUnit using Eclipse

As our application has been developed in Java, we use JUnit to perform unit testing. The following steps are performed to test with JUnit in Eclipse.

5.3.1 Create a Java class

In the "src" folder, we create the digitalreader.test package and the following class. The class under test is PDFViewer, whose checkEncryption() method checks whether encryption is present, ascertains the password, and returns true if the content is accessible.
  • 51.
    package org.jpedal.examples.jpaneldemo;

    import java.awt.BorderLayout;
    import java.awt.Component;
    import java.awt.Container;
    import javax.swing.JFrame;

    public class PDFViewer extends JFrame {
        ....
        private boolean checkEncryption();
        ....
    }

5.3.2 Create a JUnit test

Right-click on the new class in the Package Explorer and select New, then JUnit Test Case. Select "New JUnit 4 test" and set the source folder to "test", so that the test class gets created in this folder. Press "Next" and select the methods which have to be tested. If the JUnit library is not part of the classpath, Eclipse will prompt you to add it. Create a test with the following code.

    package digitalreader.test;

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;
    import org.jpedal.examples.jpaneldemo.PDFViewer;

    public class PDFViewerTest {
        @Test
        public void testcheckEncryption() {
            PDFViewer tester = new PDFViewer();
            assertEquals("Result", password, tester.checkEncryption());
        }
    }

5.3.3 Run test via Eclipse

Right-click on the new test class and select Run As, then JUnit Test. The result of the tests will be displayed in the JUnit View. The test initially failed (indicated by a red bar) because the viewer did not display any prompt to enter the password.
  • 52. After we fixed that problem and re-ran the test, we got a green bar. If there are several tests, they can be combined into a test suite. Running a test suite will execute all tests in that suite. To create a test suite, select the test classes, right-click on them, and select New, then Other, then JUnit, then Test Suite. Select "Next" and select the methods for which to create tests. We have tested the other parts of the code in a similar way.
  • 53. Chapter 6 Conclusions and Future Enhancements

The Digital Reader application enables users to view several document formats using a single software application. The application has two utilities, File Search and Text Search. Using these, users are able to search for the files they require, and they can also search for a particular word among a large number of files contained in a directory. Because the application is developed in Java, it has the added advantage that it can be used on virtually any platform. Finally, we conclude that our application is distinctive, as there are very few, if any, applications available for viewing more than one document format using a single piece of application software.