This document summarizes research on script and language identification techniques for handwritten document images. It discusses the challenges of identifying scripts in handwritten versus printed documents. It then reviews various approaches that have been used, which are generally categorized as structure-based (using features like character geometry) or visual appearance-based (using texture features). Specific techniques discussed include those using linear discriminant analysis, K-nearest neighbors, neural networks, support vector machines, Gabor filters, and steerable pyramids. The document analyzes and compares the performance of different methods. It notes that better pre-processing and larger, standardized datasets are still needed to fully evaluate the techniques.
Wavelet Packet Based Features for Automatic Script Identification (CSCJournals)
In a multi-script environment, archives of documents whose text regions are printed in different scripts are common. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify the different script regions of the document. In this paper, a novel texture-based approach is presented to identify the script type of a collection of documents printed in seven scripts, so as to categorize them for further processing. South Indian documents printed in seven scripts - Kannada, Tamil, Telugu, Malayalam, Urdu, Hindi and English - are considered here. The document images are decomposed through Wavelet Packet Decomposition using the Haar basis function up to level two. Texture features are extracted from the sub-bands of the wavelet packet decomposition: the Shannon entropy value is computed for each sub-band, and these entropy values are combined to form the texture feature vector. Experimentation involved 2100 text images for learning and 1400 text images for testing. Script classification performance is analyzed using the K-nearest neighbor classifier. The average success rate is found to be 99.68%.
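The entropy-based texture features described above can be sketched in a few lines of plain Python. This is a hedged illustration, not the authors' implementation: it uses an unnormalized 2D Haar step, a two-level packet decomposition (16 level-2 sub-bands), and the Shannon entropy of each sub-band's energy distribution as the feature vector.

```python
import math

def haar_step(block):
    """One 2D Haar analysis step: split a 2D list (even dims) into
    approximation (LL) and detail (LH, HL, HH) sub-bands."""
    h, w = len(block), len(block[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        rll, rlh, rhl, rhh = [], [], [], []
        for j in range(0, w, 2):
            a, b = block[i][j], block[i][j + 1]
            c, d = block[i + 1][j], block[i + 1][j + 1]
            rll.append((a + b + c + d) / 4.0)
            rlh.append((a + b - c - d) / 4.0)
            rhl.append((a - b + c - d) / 4.0)
            rhh.append((a - b - c + d) / 4.0)
        ll.append(rll); lh.append(rlh); hl.append(rhl); hh.append(rhh)
    return [ll, lh, hl, hh]

def shannon_entropy(band):
    """Entropy of the normalized energy distribution of one sub-band."""
    energies = [v * v for row in band for v in row]
    total = sum(energies) or 1.0
    return -sum((e / total) * math.log2(e / total) for e in energies if e > 0)

def wp_entropy_features(image):
    """2-level wavelet packet: decompose the image, then decompose each
    of the four level-1 sub-bands again, giving 16 level-2 sub-bands;
    the feature vector is the 16 Shannon entropy values."""
    level1 = haar_step(image)
    level2 = [band for sb in level1 for band in haar_step(sb)]
    return [shannon_entropy(b) for b in level2]
```

A classifier such as K-nearest neighbors would then be trained on these 16-dimensional vectors, one per text image.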
WRITER RECOGNITION FOR SOUTH INDIAN LANGUAGES USING STATISTICAL FEATURE EXTRA... (ijnlc)
The comprehension of entire handwritten documents is a challenging problem that comprises several difficult tasks. Given a handwritten document, its layout must first be analysed to isolate the different content types. These content types can then be routed to dedicated subsystems, including writer-style, image, or table recognizers. Research in automatic writer identification has mainly centred on the statistical approach. This has led to the selection and extraction of statistical features such as run-length distributions, slant distribution, entropy, and the edge-hinge distribution. The edge-hinge distribution outperforms all the other statistical features. It characterizes the changes in direction of the writing stroke in handwritten text, and is extracted by means of a window that is slid over an edge-detected version of the offline scanned images. Whenever the central pixel of the window is on, the two edge fragments (i.e. connected sequences of pixels) emerging from this central pixel are considered. Their directions are measured and stored as pairs, and a joint probability distribution is obtained from a large sample of such pairs.
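A much-simplified sketch of the edge-hinge idea follows. It is hypothetical in one important way: it uses single-pixel neighbour directions at each "on" pixel, whereas the real feature traces edge fragments several pixels long before measuring their two directions.

```python
# 8-neighbour offsets, indexed by quantized direction 0..7
# (0 = right, increasing counter-clockwise).
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
           (0, -1), (1, -1), (1, 0), (1, 1)]

def edge_hinge_histogram(edges):
    """Simplified edge-hinge: for every 'on' pixel in a binary edge map,
    take each pair of directions toward 'on' neighbours and accumulate a
    normalized joint histogram over direction pairs (d1 < d2)."""
    h, w = len(edges), len(edges[0])
    hist = {}
    total = 0
    for i in range(h):
        for j in range(w):
            if not edges[i][j]:
                continue
            dirs = [d for d, (di, dj) in enumerate(OFFSETS)
                    if 0 <= i + di < h and 0 <= j + dj < w
                    and edges[i + di][j + dj]]
            for a in range(len(dirs)):
                for b in range(a + 1, len(dirs)):
                    pair = (dirs[a], dirs[b])
                    hist[pair] = hist.get(pair, 0) + 1
                    total += 1
    return {k: v / total for k, v in hist.items()} if total else {}
```

The resulting histogram (flattened to a fixed-length vector) is what a writer-identification classifier would compare between documents.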
SCRIPTS AND NUMERALS IDENTIFICATION FROM PRINTED MULTILINGUAL DOCUMENT IMAGES (cscpconf)
Identification of the scripts in a multi-script document is one of the important steps in the design of an OCR system for successful analysis and recognition. Most optical character recognition (OCR) systems can recognize at most a few scripts, so for large archives of document images containing different scripts there must be some way to automatically categorize the documents before applying the appropriate OCR to them. Much work has already been reported in this area, but in the Indian context, though some results have been reported, the task is still in its infancy. This paper presents research on the identification of Tamil, English and Hindi scripts at the word level, irrespective of font face and size. It also identifies English numerals in multilingual document images. The proposed technique applies a document vectorization method that generates vectors from nine zones segmented over the characters, based on their shape, density and transition features. The script is then determined using rule-based classifiers and their sub-classifiers, which contain sets of classification rules derived from the vectors. The proposed system identifies scripts from document images even when they suffer from noise and other kinds of distortion. Results from experiments, simulations, and human visual inspection show that the proposed technique identifies scripts and numerals with minimal pre-processing and high accuracy. In the future, it can be extended to other scripts.
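The nine-zone vectorization step can be illustrated with a minimal sketch. The density and transition counts below are assumption-laden stand-ins for the paper's exact shape/density/transition features:

```python
def zone_features(image, rows=3, cols=3):
    """Split a binary word image into rows x cols zones and return, per
    zone, the foreground pixel density and the number of 0->1 horizontal
    transitions (a simple shape/texture cue that rule-based script
    classifiers could threshold)."""
    h, w = len(image), len(image[0])
    feats = []
    for zr in range(rows):
        for zc in range(cols):
            r0, r1 = zr * h // rows, (zr + 1) * h // rows
            c0, c1 = zc * w // cols, (zc + 1) * w // cols
            area = max((r1 - r0) * (c1 - c0), 1)
            on = sum(image[r][c] for r in range(r0, r1)
                     for c in range(c0, c1))
            trans = sum(1 for r in range(r0, r1) for c in range(c0 + 1, c1)
                        if image[r][c - 1] == 0 and image[r][c] == 1)
            feats.append((on / area, trans))
    return feats
```

A rule-based classifier would then apply threshold rules over this 9-zone vector, e.g. "a heavily inked top zone row suggests a Devanagari headline".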
Dimension Reduction for Script Classification - Printed Indian Documents (ijait)
Automatic identification of the script in a given document image facilitates many important applications, such as automatic archiving of multilingual documents, searching online archives of document images, and selection of a script-specific OCR in a multilingual environment. This paper provides a comparative study of three dimension reduction techniques, namely partial least squares (PLS), sliced inverse regression (SIR) and principal component analysis (PCA), and evaluates the relative performance of classification procedures incorporating those methods. For a given script we extract features such as Gray Level Co-occurrence Matrix (GLCM) and Scale Invariant Feature Transform (SIFT) features. The features are extracted globally from a given text block, which avoids any complex and unreliable segmentation of the document image into lines and characters. The extracted features are reduced using the various dimension reduction techniques, and the reduced features are fed into a nearest neighbor classifier. The scheme is therefore efficient and can be used for many practical applications that require processing large volumes of data. It has been tested on 10 Indian scripts and found to be robust to the scanning process and relatively insensitive to changes in font size. The proposed system achieves good classification accuracy on a large testing data set.
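As a rough illustration of the reduce-then-classify pipeline, here is a pure-Python PCA (first principal component only, found by power iteration) feeding a 1-nearest-neighbour classifier. The paper's PLS/SIR variants and its GLCM/SIFT features are not reproduced; the toy data stands in for extracted texture vectors.

```python
def pca_project(X, iters=200):
    """Project data onto its first principal component, found by power
    iteration on the covariance matrix (a stand-in for the PCA stage
    of the dimension-reduction comparison)."""
    n, d = len(X), len(X[0])
    mean = [sum(x[j] for x in X) / n for j in range(d)]
    Xc = [[x[j] - mean[j] for j in range(d)] for x in X]
    cov = [[sum(Xc[k][i] * Xc[k][j] for k in range(n)) / n
            for j in range(d)] for i in range(d)]
    v = [1.0] * d                       # initial direction for power iteration
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5 or 1.0
        v = [c / norm for c in w]
    proj = [sum(Xc[k][j] * v[j] for j in range(d)) for k in range(n)]
    return proj, mean, v

def nn_predict(train_proj, labels, mean, v, x):
    """1-nearest-neighbour classification in the reduced 1-D space."""
    p = sum((x[j] - mean[j]) * v[j] for j in range(len(x)))
    best = min(range(len(train_proj)), key=lambda k: abs(train_proj[k] - p))
    return labels[best]
```

In practice one would keep several components and use a library implementation, but the shape of the pipeline (global features, reduction, nearest neighbor) is the same.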
A NOVEL APPROACH FOR WORD RETRIEVAL FROM DEVANAGARI DOCUMENT IMAGES (ijnlc)
A large amount of information lies dormant in historical documents and manuscripts, and it would be lost if not stored in digital form. Searching for relevant information in these scanned images would ideally require converting the document images to text through optical character recognition (OCR). For the indigenous scripts of India, there are very few OCRs that can successfully recognize printed text images of varying quality, size, style and font. An alternative approach using word spotting can be effective for accessing large collections of document images. We propose a word spotting technique based on codes for matching the word images of Devanagari script. The shape information is utilised to generate integer codes for the words in the document image, and these codes are matched for the final retrieval of relevant documents. The technique is illustrated using Marathi document images.
Script Identification of Text Words from a Tri-Lingual Document Using Voting ... (CSCJournals)
In a multi-script environment, the majority of documents may contain text printed in more than one script or language. For automatic processing of such documents through Optical Character Recognition (OCR), it is necessary to identify the different script regions of the document. In this context, this paper proposes a model to identify and separate text words of Kannada, Hindi and English scripts in a printed tri-lingual document. The proposed method is trained to learn the distinct features of each script thoroughly, and a binary tree classifier is used to classify the input text image. Experimentation involved 1500 text words for learning and 1200 text words for testing, carried out on both a manually created data set and a scanned data set. The results are very encouraging and prove the efficacy of the proposed model: the average success rate is found to be 99% for the manually created data set and 98.5% for the data set constructed from scanned document images.
Robust Text Watermarking Technique for Authorship Protection of Hindi Languag... (CSCJournals)
Digital text documents have become a significant part of the Internet, and a large number of users are attracted to this digital form of text. But some security threats arise concurrently. Digital libraries offer effective ways to access educational materials, government e-documents, financial documents, social media content and much else; however, content authorship and tamper detection for all these digital text documents require special attention. To date, very few digital watermarking techniques exist for text documents. In this paper, we propose a method for effective watermarking of Hindi-language text documents. Hindi stands second among all languages across the world and has widespread availability of digital content of various types. In the proposed technique, the watermark is logically embedded in the text using the 'swar' (vowel), a special feature of the Hindi language, supported by suitable encryption. In the extraction phase, a Certificate Authority (CA) plays an important role in the authorship protection process as a trusted third party: the text is decrypted and the watermark extracted to prove genuine authorship. Our technique has been tested against various feasible text attacks with different embedding frequencies.
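The paper's exact 'swar'-based, encrypted embedding is not detailed in this abstract. As a loose, hypothetical analogue of vowel-anchored text watermarking, one can hide bits with invisible zero-width Unicode characters placed after each vowel of the cover text (the vowel set and marker choice below are illustrative assumptions, not the authors' scheme):

```python
ZW0, ZW1 = "\u200c", "\u200d"  # zero-width non-joiner / joiner carry bit 0 / 1
# Anchor points: independent Devanagari vowels ('swar') plus Latin vowels.
VOWELS = set("अआइईउऊएऐओऔaeiouAEIOU")

def embed(cover, bits):
    """Append an invisible zero-width mark after each vowel until all
    watermark bits are placed (hypothetical stand-in for the paper's
    encrypted 'swar'-based embedding)."""
    out, k = [], 0
    for ch in cover:
        out.append(ch)
        if ch in VOWELS and k < len(bits):
            out.append(ZW1 if bits[k] == "1" else ZW0)
            k += 1
    if k < len(bits):
        raise ValueError("cover text has too few vowels for the watermark")
    return "".join(out)

def extract(marked):
    """Read back the bit string from the zero-width marks."""
    return "".join("1" if ch == ZW1 else "0"
                   for ch in marked if ch in (ZW0, ZW1))
```

The visible text is unchanged, which is the appeal of this family of techniques; the real method additionally encrypts the watermark and involves the CA for verification.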
A SURVEY OF LANGUAGE-DETECTION, FONT-DETECTION AND FONT-CONVERSION SYSTEMS FOR... (IJCI JOURNAL)
A large amount of the data in Indian languages stored digitally is in ASCII-based font formats. ASCII has a 128-character set and is therefore unable to represent all the characters necessary to deal with the variety of scripts in use worldwide. Moreover, these ASCII-based fonts are not based on a single standard mapping between character codes and individual characters for a particular Indian script, unlike English-language fonts based on the standard ASCII mapping. As a result, the fonts for a particular script must be available on a system to accurately display data in that script, and conversion of data from one font to another is difficult. The non-standard ASCII-based fonts also pose problems for searching texts in Indian languages available over the web. There are 25 official languages in India, and the amount of digital text available in ASCII-based fonts is much larger than the text available in the standard ISCII (Indian Script Code for Information Interchange) or Unicode formats. This paper discusses the work done in the fields of font detection (identifying the font of a given text) and font conversion (converting ASCII-format text into the corresponding Unicode text).
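A font converter of the kind surveyed is, at its core, a longest-match substitution over a glyph-to-Unicode table. The mapping below is invented for illustration only (real tables for legacy fonts such as Kruti Dev are far larger and also need reordering rules for vowel signs that legacy fonts place before the consonant):

```python
# Hypothetical glyph-code -> Unicode mapping for an imaginary legacy
# Devanagari font; the codes "d", "[k", etc. do not belong to any
# real font table.
LEGACY_MAP = {
    "d`": "कृ",   # multi-character glyph sequences must match before shorter ones
    "d": "क",
    "[k": "ख",
    "x": "ग",
}

def legacy_to_unicode(text):
    """Greedy longest-match conversion of a legacy-font string into
    Unicode, leaving unmapped characters untouched."""
    keys = sorted(LEGACY_MAP, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for k in keys:
            if text.startswith(k, i):
                out.append(LEGACY_MAP[k])
                i += len(k)
                break
        else:
            out.append(text[i])   # pass through unmapped characters
            i += 1
    return "".join(out)
```

Font detection, the survey's other topic, typically runs the inverse check: score the input against each known table and pick the font whose glyph codes cover it best.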
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T... (IJNSA Journal)
This paper introduces a multi-layer hybrid text steganography approach that utilizes word tagging and recoloring. Existing approaches tend to excel at either imperceptibility, or high hiding capacity, or robustness. The proposed approach does not use an ordinary sequential insertion process; it overcomes the issues of current approaches by pursuing imperceptibility, high hiding capacity, and robustness together through a hybrid of a linguistic technique and a format-based technique. The linguistic technique divides the cover text into embedding layers, where each layer consists of a sequence of words sharing a single part of speech detected by a POS tagger, while the format-based technique recolors the letters of the cover text with near RGB color codes to embed 12 bits of the secret message in each letter, which yields high hiding capacity and keeps the embedding blind. Robustness is achieved through the multi-layer embedding process, and the generated stego key significantly strengthens the security of the embedded messages. An experimental comparison shows that the proposed approach is better than currently developed approaches at providing an ideal balance between the imperceptibility, hiding-capacity, and robustness criteria.
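The "12 bits per letter" recoloring can be read as 4 bits hidden in the low nibble of each RGB channel, which keeps the displayed colour within 15 intensity levels of the original. This is our interpretation of the abstract, not a transcription of the paper's algorithm:

```python
def embed_letter(color, bits12):
    """Hide 12 secret bits in one letter's RGB colour: the low nibble of
    each channel carries 4 bits, so the stego colour stays 'near' the
    original (each channel moves by at most 15 levels)."""
    assert len(bits12) == 12
    nibbles = [int(bits12[i:i + 4], 2) for i in (0, 4, 8)]
    return tuple((c & 0xF0) | n for c, n in zip(color, nibbles))

def extract_letter(color):
    """Recover the 12 bits from the low nibbles of the RGB channels."""
    return "".join(format(c & 0x0F, "04b") for c in color)

def embed_message(colors, bitstream):
    """Spread a bit stream over a sequence of letter colours, 12 bits
    per letter, padding the tail with zeros."""
    out = []
    for i, c in enumerate(colors):
        chunk = bitstream[i * 12:(i + 1) * 12].ljust(12, "0")
        out.append(embed_letter(c, chunk))
    return out
```

The paper's multi-layer design additionally chooses which letters to use via POS-tagged layers and a stego key; this sketch shows only the per-letter capacity argument.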
An Optical Character Recognition for Handwritten Devanagari Script (IJERA Editor)
Optical Character Recognition is the process of recognizing characters from a scanned document, and many OCR systems are now available on the market. However, most of these systems work for Roman, Chinese, Japanese and Arabic characters; there is not yet a sufficient body of work on Indian-language scripts such as Devanagari. This paper therefore presents a review of optical character recognition for handwritten Devanagari script.
Robust extended tokenization framework for Romanian by semantic parallel text... (ijnlc)
Tokenization is considered a solved problem when reduced to just word-border identification and punctuation and white-space handling. Obtaining a high-quality outcome from this process is essential for subsequent NLP pipeline processes (POS tagging, WSD). In this paper we claim that to obtain this quality we need to use in the tokenization disambiguation process all linguistic, morphosyntactic, and semantic-level word-related information, as necessary. We also claim that semantic disambiguation performs much better in a bilingual context than in a monolingual one. We then show that, for disambiguation purposes, the bilingual text provided by high-profile online machine translation services performs almost at the same level as human-originated parallel texts (the gold standard). Finally, we claim that the tokenization algorithm incorporated in TORO can be used as a criterion for comparative quality assessment of online machine translation services, and we provide a setup for this purpose.
Language Identifier for Languages of Pakistan Including Arabic and Persian (Waqas Tariq)
A language recognizer/identifier/guesser is the basic application used to identify the language of a text document. It simply takes a file as input and, after processing its text, decides the language of the document with precision using LIJ-I, LIJ-II and LIJ-III. LIJ-I alone yields poor accuracy; it is strengthened by LIJ-II, and accuracy is boosted further by LIJ-III. The system also calculates digram probabilities and average accuracy percentages. LIJ-I considers the complete character set of each language, while LIJ-II considers only the differences between them. A Java-based language recognizer is developed and presented in this paper in detail.
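The abstract does not spell out LIJ-I's internals; a plausible minimal reading is character-set coverage scoring, with digram statistics left to the later LIJ-II/LIJ-III stages. The character sets below are illustrative stand-ins (and the sketch is in Python, whereas the paper's recognizer is written in Java):

```python
# Coarse stand-ins for per-language character sets; the actual tables
# for Urdu, Punjabi, Sindhi, etc. are not reproduced here.
ARABIC_LETTERS = set("ابتثجحخدذرزسشصضطظعغفقكلمنهوي")
CHARSETS = {
    "English": set("abcdefghijklmnopqrstuvwxyz"),
    "Arabic":  ARABIC_LETTERS,
    # Persian shares the Arabic base letters and adds its own
    # (including Persian kaf/yeh forms).
    "Persian": ARABIC_LETTERS | set("پچژگکی"),
}

def identify(text):
    """LIJ-I-style first pass: pick the language whose character set
    covers the largest fraction of the letters in the text.  Ties
    between closely related sets would be broken by the digram
    statistics of the later stages."""
    letters = [ch for ch in text.lower()
               if not ch.isspace() and not ch.isdigit()]
    if not letters:
        return None
    scores = {lang: sum(ch in cs for ch in letters) / len(letters)
              for lang, cs in CHARSETS.items()}
    return max(scores, key=scores.get)
```

Note how the Persian-specific letters are what separate Persian from Arabic text under pure character-set scoring; where they are absent, digram frequencies must decide.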
DISCRIMINATION OF ENGLISH TO OTHER INDIAN LANGUAGES (KANNADA AND HINDI) FOR O... (IJCSEA Journal)
India is a multilingual, multi-script country. In every state of India two languages are in use: the state's local language and English. For example, in Andhra Pradesh, a state in India, a document may contain text words in English and Telugu script. For Optical Character Recognition (OCR) of such a bilingual document, it is necessary to identify the script before feeding the text words to the OCRs of the individual scripts. In this paper, we introduce a simple and efficient script identification technique for Kannada, English and Hindi text words of a printed document. The proposed approach is based on horizontal and vertical projection profiles for the discrimination of the three scripts, with feature extraction performed on the horizontal projection profile of each text word. We analysed 700 different words of Kannada, English and Hindi in order to extract the discriminating features and build the knowledge base. The proposed system is tested on 100 different document images containing more than 1000 text words of each script, and classification rates of 98.25%, 99.25% and 98.87% are achieved for Kannada, English and Hindi respectively.
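The horizontal projection profile, and the headline cue it exposes, can be sketched as follows. This is a generic illustration of the projection-profile idea, not the paper's exact feature set or thresholds:

```python
def horizontal_profile(image):
    """Row-wise count of foreground pixels in a binary word image."""
    return [sum(row) for row in image]

def has_headline(image, ratio=0.8):
    """Crude Devanagari cue: some row in the top third of the word is
    almost fully inked, as the shirorekha (headline) that connects
    Devanagari letters would be.  English and Kannada words normally
    lack such a near-solid top row."""
    profile = horizontal_profile(image)
    width = len(image[0])
    top = profile[: max(1, len(profile) // 3)]
    return max(top) >= ratio * width
```

A full classifier would combine several such profile-derived features (peak position, valley structure, vertical profile statistics) in its knowledge base.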
Review on feature-based method performance in text steganography (journalBEEI)
The implementation of steganography in the text domain is a crucial issue, since it can hide an essential message from an intruder. Much personal information is conveyed in the medium of text, and steganography is expected to be the solution for protecting that information, hiding a message so that it is unrecognizable to human or machine vision. This paper concerns one of the categories of steganography on the medium of text, called text steganography, with a specific focus on the feature-based method. It reviews research efforts of the last decade to assess the performance of techniques developed for feature-based text steganography. The paper also discusses related performance factors that influence these techniques and several open issues in the development of the feature-based text steganography method.
Uncompressed Image Steganography using BPCS: Survey and Analysis (IOSR Journals)
Abstract: Steganography is the art and science of hide secret information in some carrier data without leaving
any apparent evidence of data alternation. In the past, people use hidden tattoos, invisible ink or punching on
papers to convey stenographic data. Now, information is first hide in digital image, text, video and audio. This
paper discusses existing BPCS (Bit Plane Complexity Segmentation) steganography techniques and presences
of some modification. BPCS technique makes use of the characteristics of the human visible system. BPCS
scheme allows for large capacity of embedded secret data and is highly customized. This algorithm offers higher
hiding capacity due to that it exploits the variance of complex regions in each bit plane. In contrast, the BPCS
algorithm provided a much more effective method for obtaining a 50% capacity since visual attacks did not
suffice for detection.
Keywords: BPCS, Data security, Information hiding, Steganography, Stego image
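The complexity measure at the heart of BPCS is simple enough to sketch. The block below is a minimal, illustrative implementation of the usual black-and-white border complexity of a square bit-plane block; the exact variant used by any given BPCS scheme may differ:

```python
def bitplane_complexity(block):
    """Border complexity of a square binary block: the number of adjacent
    0/1 transitions divided by the maximum possible number of transitions.
    Values near 1 mark noise-like regions suitable for BPCS embedding."""
    n = len(block)
    changes = 0
    for r in range(n):
        for c in range(n):
            if c + 1 < n and block[r][c] != block[r][c + 1]:
                changes += 1
            if r + 1 < n and block[r][c] != block[r + 1][c]:
                changes += 1
    max_changes = 2 * n * (n - 1)  # all horizontal + vertical borders
    return changes / max_changes

checkerboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
flat = [[0] * 8 for _ in range(8)]
print(bitplane_complexity(checkerboard))  # 1.0 - maximally complex
print(bitplane_complexity(flat))          # 0.0 - informative, never embedded
```

A BPCS embedder would replace blocks whose complexity exceeds a threshold (often around 0.3) with secret-data blocks, conjugating any secret block that falls below the threshold.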
Surface Morphological and Electrical Properties of Sputtered TiO2 Thin Films – IOSR Journals

Titanium dioxide films were formed on quartz and crystalline p-Si (100) substrates by DC reactive magnetron sputtering. A pure titanium target was sputtered at a constant oxygen partial pressure of 5x10^-2 Pa and at sputtering powers in the range 80-200 W. The as-deposited films were annealed in air for 1 hour at 1023 K and characterized by atomic force microscopy (AFM) for surface morphology and by current-voltage and capacitance-voltage measurements for electrical and dielectric properties. The micrographs showed that the roughness parameters Rrms and Ra increased as the sputter power rose from 80 to 200 W, and the leakage current density likewise increased with sputtering power.
Membrane Stabilizing And Antimicrobial Activities Of Caladium Bicolor And Che... – IOSR Journals

The crude methanol extracts of the whole plant of Caladium bicolor (Aiton) Vent. and the leaf of Chenopodium album L., as well as their pet-ether, carbon tetrachloride, chloroform and aqueous soluble fractions, were evaluated for membrane stabilizing and antimicrobial activities. At a concentration of 1.0 mg/ml, the carbon tetrachloride soluble fraction of C. bicolor inhibited 43.92±1.63% and 38.08±0.83% of hypotonic solution- and heat-induced haemolysis of RBCs, respectively. Among the extractives of C. album, the aqueous soluble fraction inhibited 47.11±0.49% and 36.73±0.76% of hypotonic solution- and heat-induced haemolysis, compared with 72.79% and 42.12% by acetyl salicylic acid (0.10 mg/ml). The C. bicolor test samples produced zones of inhibition ranging from 6.0 to 20.0 mm, the chloroform soluble fraction showing the largest (20.0 mm) against Staphylococcus aureus. The C. album samples gave zones of inhibition from 7.0 to 13.0 mm, the largest (13.0 mm) again shown by the chloroform soluble fraction, against Salmonella paratyphi.
Synthesis, Characterization and Application of Some Polymeric Dyes Derived Fr... – IOSR Journals

In this study, monoazo disperse dyes, namely 4-arylazoaminophenols (AAPs), were synthesized via diazotization and coupling reactions, and these dyes were then polycondensed with formaldehyde in the presence of aqueous oxalic acid. The resulting polymeric dyes, (4-arylazoaminophenol-formaldehyde)s (PAAP-F)s, as well as their low-molecular-weight precursors, were characterized by yield, melting point, color, solubility, viscosimetry, proton nuclear magnetic resonance spectroscopy, UV-visible spectroscopy and infrared spectroscopy. Their dyeing performance on nylon and polyester was assessed using standard methods. The products were obtained in good yield and had low melting points. The dyes were soluble in chloroform and acetone, some dissolved in ethanol and methanol, and all were generally insoluble in water. The dyeings on nylon and polyester gave yellow shades with moderate to good light and wash fastness and very good rubbing fastness. Polymerization of the monomeric dyes on dyed nylon and polyester was also carried out; comparison of the monomeric and polymeric dyes with the dyes polymerized in situ showed that fastness improved on polymerization and was best for the dyes polymerized inside the fibers.
International Journal of Engineering Research and Development – IJERD Editor

Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
Robust Text Watermarking Technique for Authorship Protection of Hindi Languag... – CSCJournals

Digital text documents have become a significant part of the Internet, and large numbers of users are drawn to this digital form of text; but security threats arise concurrently. Digital libraries offer effective access to educational materials, government e-documents, financial documents, social media content and much more, yet content authorship and tamper detection for all these digital text documents require special attention. To date, very few digital watermarking techniques exist for text documents. In this paper, we propose a method for effective watermarking of Hindi-language text documents. Hindi ranks second among the world's languages and has widely available digital content of various types. In the proposed technique, the watermark is logically embedded in the text using the 'swar' (vowel) as a special feature of the Hindi language, supported by suitable encryption. In the extraction phase, a Certificate Authority (CA) plays an important role as a trusted third party in the authorship protection process: the text is decrypted and the watermark extracted to prove genuine authorship. The technique has been tested against various feasible text attacks at different embedding frequencies.
A SURVEY OF LANGUAGE-DETECTION, FONT-DETECTION AND FONT-CONVERSION SYSTEMS FOR... – IJCI JOURNAL

A large amount of digitally stored data in Indian languages is in ASCII-based font formats. ASCII has a 128-character set and therefore cannot represent all the characters needed for the variety of scripts in use worldwide. Moreover, unlike English-language fonts built on the standard ASCII mapping, these ASCII-based fonts do not share a single standard mapping between character codes and the individual characters of a given Indian script. Consequently, the fonts for a particular script must be installed on a system to represent data in that script accurately, converting data from one font into another is difficult, and the non-standard ASCII-based fonts also hinder searching Indian-language texts on the web. There are 25 official languages in India, and the amount of digital text available in ASCII-based fonts is much larger than the text available in the standard ISCII (Indian Script Code for Information Interchange) or Unicode formats. This paper discusses work done in the fields of font detection (identifying the font of a given text) and font converters (converting ASCII-format text into the corresponding Unicode text).
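As a rough illustration of the font-converter idea, the sketch below maps characters of a hypothetical ASCII-based Devanagari font to their Unicode counterparts by table lookup. `GLYPH_MAP` is invented for illustration: every real legacy font needs its own table, and production converters must also handle multi-character ligatures and matra reordering, which this sketch omits.

```python
# Hypothetical glyph map for an ASCII-based Devanagari font; real font
# converters ship one table per font, built by inspecting its glyph layout.
GLYPH_MAP = {
    "k": "\u0915",  # DEVANAGARI LETTER KA
    "K": "\u0916",  # DEVANAGARI LETTER KHA
    "g": "\u0917",  # DEVANAGARI LETTER GA
}

def ascii_font_to_unicode(text, glyph_map):
    """Convert text typed in a non-standard ASCII font to Unicode by
    table lookup; unmapped characters pass through unchanged."""
    return "".join(glyph_map.get(ch, ch) for ch in text)

print(ascii_font_to_unicode("kg", GLYPH_MAP))  # -> 'कग'
```

Table lookup is the core of such converters; font detection then reduces to deciding which table applies to a given text.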
A MULTI-LAYER HYBRID TEXT STEGANOGRAPHY FOR SECRET COMMUNICATION USING WORD T... – IJNSA Journal

This paper introduces a multi-layer hybrid text steganography approach that combines word tagging and recoloring. Existing approaches tend to excel at only one of imperceptibility, hiding capacity, or robustness. The proposed approach avoids ordinary sequential embedding and addresses these shortcomings by pursuing all three goals through a hybrid of a linguistic technique and a format-based technique. The linguistic technique divides the cover text into embedding layers, each consisting of the sequence of words sharing a single part of speech as detected by a POS tagger. The format-based technique recolors the letters of the cover text with near-identical RGB color codes, embedding 12 bits of the secret message in each letter, which yields high hiding capacity and keeps the embedding imperceptible. Robustness is achieved through the multi-layer embedding process, and the generated stego key further secures the embedded messages and their size. Experimental comparison shows that the proposed approach outperforms currently developed approaches in balancing imperceptibility, hiding capacity, and robustness.
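The 12-bits-per-letter recoloring idea can be sketched as follows, assuming (as the abstract implies but does not spell out) that 4 bits are carried in each RGB channel as a small offset from a base colour; the paper's exact coding may differ.

```python
def embed_12_bits(bits, base=(0, 0, 0)):
    """Hide 12 bits in one letter's colour: 4 bits per RGB channel as a
    small offset (0-15) that stays visually close to the base colour.
    Assumes each base channel is <= 240 so the offset cannot overflow."""
    assert 0 <= bits < 4096
    r = base[0] + ((bits >> 8) & 0xF)
    g = base[1] + ((bits >> 4) & 0xF)
    b = base[2] + (bits & 0xF)
    return (r, g, b)

def extract_12_bits(color, base=(0, 0, 0)):
    """Recover the 12 bits from a recoloured letter."""
    r, g, b = (color[i] - base[i] for i in range(3))
    return (r << 8) | (g << 4) | b

payload = 0b101100111010        # 12 secret bits for one letter
stego = embed_12_bits(payload)  # near-black colour carrying the payload
assert extract_12_bits(stego) == payload
```

At 12 bits per letter, a cover text of n letters carries 1.5n bytes of payload before any layering or key material is accounted for.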
An Optical Character Recognition for Handwritten Devanagari Script – IJERA Editor

Optical Character Recognition is the process of recognizing characters from scanned documents, and many OCR systems are now available on the market. Most of them, however, work for Roman, Chinese, Japanese and Arabic characters; there is not yet a sufficient body of work on Indian scripts such as Devanagari. This paper therefore presents a review of optical character recognition for handwritten Devanagari script.
Robust extended tokenization framework for Romanian by semantic parallel text... – ijnlc

Tokenization is considered a solved problem when reduced to identifying word borders and handling punctuation and white space, yet a high-quality outcome from this process is essential for subsequent NLP pipeline stages (POS tagging, WSD). In this paper we claim that to obtain this quality, the tokenization disambiguation process must use all linguistic, morphosyntactic, and semantic-level word-related information as necessary. We also claim that semantic disambiguation performs much better in a bilingual context than in a monolingual one. We then show that, for disambiguation purposes, bilingual text produced by high-profile online machine translation services performs almost at the level of human-originated parallel texts (the gold standard). Finally, we claim that the tokenization algorithm incorporated in TORO can serve as a criterion for comparative quality assessment of online machine translation services, and we provide a setup for this purpose.
Language Identifier for Languages of Pakistan Including Arabic and Persian – Waqas Tariq

A language recognizer/identifier/guesser is a basic application for identifying the language of a text document. The system takes a file as input and, after processing its text, decides the language of the document using three modules: LIJ-I, LIJ-II and LIJ-III. LIJ-I alone yields poor accuracy; it is strengthened by LIJ-II and boosted to a higher level of accuracy by LIJ-III. The system also computes digram probabilities and average accuracy percentages. LIJ-I considers the complete character set of each language, while LIJ-II considers only the differences between character sets. A Java-based language recognizer built on these ideas is developed and presented in detail.
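A minimal digram-based guesser in the spirit of the modules described above might look like the following sketch; the training samples and add-one smoothing here are illustrative assumptions, not the paper's exact procedure.

```python
from collections import Counter
from math import log

def digram_counts(text):
    """Count overlapping two-character sequences (digrams) in the text."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))

def train(samples):
    """samples: {language: sample text} -> per-language digram models."""
    return {lang: digram_counts(txt) for lang, txt in samples.items()}

def guess(text, models):
    """Pick the language whose digram model gives the text the highest
    summed log-probability, with add-one smoothing for unseen digrams."""
    query = digram_counts(text)
    best, best_score = None, float("-inf")
    for lang, counts in models.items():
        total = sum(counts.values())
        score = sum(c * log((counts[d] + 1) / (total + 1))
                    for d, c in query.items())
        if score > best_score:
            best, best_score = lang, score
    return best

# Toy training corpora (real systems would use far larger samples).
models = train({
    "english": "the quick brown fox jumps over the lazy dog",
    "roman_urdu": "yeh aik misali jumla hai jo urdu mein likha gaya",
})
print(guess("the dog jumps", models))  # -> english
```

Accuracy grows with training-corpus size; with only a sentence per language, as here, the model can separate strongly contrasting inputs but little more.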
International Journal of Engineering Research and Applications (IJERA) is an open-access online peer-reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nanotechnology & Science, Power Electronics, Electronics & Communication Engineering, Computational Mathematics, Image Processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low-Power VLSI Design, etc.
DISCRIMINATION OF ENGLISH TO OTHER INDIAN LANGUAGES (KANNADA AND HINDI) FOR O... – IJCSEA Journal

India is a multilingual, multi-script country. Every state of India uses two languages: the local state language and English. In Andhra Pradesh, for example, a document may contain text words in both English and Telugu script. For Optical Character Recognition (OCR) of such a bilingual document, the script must be identified before the text words are fed to the script-specific OCRs. In this paper we introduce a simple and efficient script identification technique for Kannada, English and Hindi text words of a printed document, based on horizontal and vertical projection profiles. Features are extracted from the horizontal projection profile of each text word. We analysed 700 different words of Kannada, English and Hindi to derive the discriminating features and build the knowledge base. The proposed system was tested on 100 different document images containing more than 1000 text words of each script, achieving classification rates of 98.25%, 99.25% and 98.87% for Kannada, English and Hindi respectively.
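The projection-profile features described above reduce to simple row and column sums over a binarized word image; a minimal sketch:

```python
def horizontal_profile(binary_image):
    """Row-wise count of foreground (1) pixels in a binary word image."""
    return [sum(row) for row in binary_image]

def vertical_profile(binary_image):
    """Column-wise count of foreground pixels."""
    return [sum(col) for col in zip(*binary_image)]

# Toy 4x6 "word image": the dense top row mimics the headline (shirorekha)
# that joins letters in Devanagari words; Latin-script words lack this band.
word = [
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
print(horizontal_profile(word))  # [6, 2, 2, 4] - peak at the top band
```

A classifier can then use the shape of these profiles (e.g. a dominant peak near the top of the horizontal profile suggesting a Devanagari headline) as discriminating features.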
Simultaneous Triple Series Equations Involving Konhauser Biorthogonal Polynom... – IOSR Journals

Biorthogonal polynomials are of great interest to physicists: Spencer and Fano [9] used them (for the case k = 2) in calculations involving the penetration of gamma rays through matter. In the present paper, an exact solution of simultaneous triple series equations involving Konhauser biorthogonal polynomials of the first kind with different indices is obtained by the multiplying-factor technique due to Noble [4]. This technique was modified by Thakare [10, 11] to solve dual series equations involving orthogonal polynomials, which disproved a conjecture of Askey [1] that a dual series equation involving Jacobi polynomials of different indices cannot be solved. The solution of simultaneous triple series equations involving generalized Laguerre polynomials is also discussed as an appealing particular case.
50 Hz Frequency Magnetic Field Effects On Pseudomonas Aeruginosa And Bacillus... – IOSR Journals

The effect of electromagnetic fields of different intensities on Pseudomonas aeruginosa (a gram-negative bacterium) and Bacillus subtilis (a gram-positive bacterium) was investigated to find the field strength that alters the physiological processes of each microorganism. Equal volumes of P. aeruginosa and B. subtilis suspensions were exposed for one hour, at their maximum rate of active growth, to a 50 Hz electromagnetic field of 2-10 mT. No remarkable differences were found in the growth of exposed P. aeruginosa. In B. subtilis, however, a remarkable growth inhibition relative to unexposed cells was achieved at 4 mT, indicating that this field induction had a strong effect on the biological activity of the cells, so further investigations were made at this induction. Changes in the growth characteristics were readily detected: the absorbance decreased, indicating a decrease in cell number and hence growth inhibition. Antibiotic sensitivity tests of B. subtilis cells showed either inhibition or stimulation depending on the drug's mode of action.
De-Noising Corrupted ECG Signals By Empirical Mode Decomposition (EMD) With A... – IOSR Journals

Electrocardiogram (ECG) signals, which are widely used for heart disease diagnosis and patient monitoring, are usually corrupted by various sources of noise. In this paper, an algorithm is developed to de-noise ECG signals based on Empirical Mode Decomposition (EMD) with the application of Higher Order Statistics (HOS). The algorithm is applied to several ECG signals at different levels of Signal-to-Noise Ratio (SNR), and the SNR improvement (SNRimp) and Percent Root-mean-square Difference (PRD, %) are analyzed. The results show that the developed algorithm de-noises ECG signals effectively.
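The abstract does not spell out its HOS criterion, but a common choice after EMD is to keep only the intrinsic mode functions (IMFs) whose kurtosis departs from the Gaussian value of about 3, on the grounds that near-Gaussian IMFs are dominated by noise. The sketch below illustrates that selection rule in pure Python; the IMFs are assumed to be computed elsewhere, and the band limits are illustrative assumptions.

```python
def kurtosis(x):
    """Sample kurtosis (fourth standardized moment); ~3 for Gaussian noise."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)

def select_imfs(imfs, gaussian_band=(2.5, 3.5)):
    """Keep IMFs whose kurtosis falls outside the near-Gaussian band,
    i.e. the ones judged to carry structured signal rather than noise."""
    lo, hi = gaussian_band
    return [imf for imf in imfs if not (lo <= kurtosis(imf) <= hi)]

square = [1.0, -1.0] * 50   # spiky, structured component: kurtosis = 1
print(kurtosis(square))     # 1.0
```

The de-noised signal is then the sum of the retained IMFs (plus the residue), discarding the components classified as noise.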
IOSR Journal of Mathematics (IOSR-JM) is an open-access international journal that provides rapid publication (within a month) of articles in all areas of mathematics and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in mathematics. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Recent Developments and Analysis of Electromagnetic Metamaterial with all of ... – IOSR Journals

Abstract: Recent advances in metamaterials (MMs) research have highlighted the possibility of creating novel devices with electromagnetic functionality. Metamaterials make it possible to construct materials with a user-designed EM response at a chosen target frequency. This is especially important in the THz frequency region, where considerable progress has been made in the design, fabrication, and characterization of MMs. This article reviews the latest advances in THz metamaterials research.
Keywords: Metamaterials (MMs), Terahertz (THz).
A study of serum Cadmium and lead in Iraqi postmenopausal women with osteopor... – IOSR Journals

Postmenopausal status is an independent risk factor for osteoporosis, and several studies have reported that heavy metals, including lead, mercury, cadmium, and arsenic, have harmful effects on bone. The aim of this study was to evaluate the effect of cadmium and lead on osteoporosis in postmenopausal Iraqi women. This prospective study, conducted during 2011, included a total of 70 postmenopausal women: 40 patients with osteoporosis and 30 apparently healthy women as controls. Serum levels of cadmium and lead were measured by atomic absorption, while serum calcium, phosphorus and alkaline phosphatase were measured by spectrophotometry. There was no significant difference between patients and controls in age, body mass index, calcium, phosphorus, or alkaline phosphatase, but serum levels of cadmium and lead were higher in patients than in controls (p < 0.001 and p < 0.01, respectively). It is concluded that increased serum levels of cadmium and lead may be associated with a higher risk of osteoporosis in postmenopausal women.
Periodic Table Gets Crowded In Year 2011 – IOSR Journals

Abstract: The year 2011 was especially important for teachers and students of chemistry: after a gap of about 14 years, at least five new elements were named and added to the periodic table. All of these elements are synthetic and radioactive; some were actually made in 1999 but received their names and official status from IUPAC in July 2011. The periodic table now contains 112 named elements, and scientists are working to prepare elements with atomic numbers 118, 119 and 120 as well.
On The Origin of Electromagnetic Waves from Lightning Discharges – IOSR Journals

This paper discusses the interaction of the upgoing ion beam that forms the current flow in the pre-ionized stepped-leader plasma, and how the kinetic energy of the beam particles is converted into electromagnetic energy. The ion beam's interaction with the plasma wave modes in the stepped-leader channel perturbs the return-stroke current flow and makes it non-uniform. In the present study, the return current is taken to be deeply modulated at a given modulation frequency and is treated as behaving like an antenna for electromagnetic radiation. The fraction of the total return-stroke energy that goes into electromagnetic waves is estimated.
Effects of Aqueous and Methanolic Leaf Extracts of Vitex doniana on Lipid Pro... – IOSR Journals

The effects of aqueous and methanolic extracts of Vitex doniana leaves on serum lipid profile and liver enzymes in normal and alloxan-induced diabetic rats were investigated using standard analytical protocols. A total of 35 albino rats were divided into seven groups of five: a normal untreated group (animal control), a diabetic untreated group (diabetic control), a normal group treated with 750 mg/kg body weight (reference), three diabetic groups treated with 250, 500 and 750 mg/kg body weight respectively, and a diabetic group treated with 5 mg/kg glibenclamide (standard). The acute toxicity test indicated a lethal dose (LD50) greater than 5000 mg/kg of extract. The results showed a significant (P<0.05) increase in high-density lipoprotein in the reference and diabetic groups compared with the normal and diabetic control groups respectively after oral administration of the Vitex doniana leaf extracts. It could therefore be concluded that Vitex doniana leaf extract is safe and medicinal, with anti-lipidemic properties and hepato-protective effects.
DIMENSION REDUCTION FOR SCRIPT CLASSIFICATION - PRINTED INDIAN DOCUMENTS – ijait

Automatic identification of the script in a given document image facilitates many important applications, such as automatic archiving of multilingual documents, searching online archives of document images, and selecting the script-specific OCR in a multilingual environment. This paper provides a comparison of three dimension reduction techniques, namely partial least squares (PLS), sliced inverse regression (SIR) and principal component analysis (PCA), and evaluates the relative performance of classification procedures incorporating those methods.
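Of the three techniques compared, PCA is the easiest to sketch. The block below finds the leading principal direction of 2-D feature points by power iteration on their covariance matrix; it is a toy stand-in for the high-dimensional script features such a study would actually reduce.

```python
def top_principal_direction(points, iters=200):
    """PCA by power iteration on the 2x2 covariance matrix of 2-D points:
    returns the unit vector along the direction of greatest variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    # Covariance matrix entries.
    cxx = sum(x * x for x, _ in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    vx, vy = 1.0, 0.0
    for _ in range(iters):
        # Multiply by the covariance matrix, then renormalize.
        nx = cxx * vx + cxy * vy
        ny = cxy * vx + cyy * vy
        norm = (nx * nx + ny * ny) ** 0.5
        vx, vy = nx / norm, ny / norm
    return vx, vy

# Features spread along y = x: the leading direction is ~(0.707, 0.707).
pts = [(i, i + 0.1 * ((-1) ** i)) for i in range(10)]
print(top_principal_direction(pts))
```

Projecting each feature vector onto the first few such directions gives the reduced representation that the compared classifiers are then trained on; PLS and SIR differ in that they use the class labels when choosing the directions.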
A New Method for Identification of Partially Similar Indian ScriptsCSCJournals
Â
In this paper, the texture symmetry/non symmetry factor has been exploited to get the script texture by using the Bi Wavelants which give the factor of symmetry/non symmetry in terms of the third cumulant and the Bi-spectra gives the quadratically coupled frequencies. The envelope of Bi-spectra (Bi-Wavelant) provides an accurate behavior of the symmetry/non symmetry factor of the script texture. Classification has been better performed by SVM with training set of roots of the envelope found using the Newton-Raphson technique. The method could successfully identify 8 Indian scripts like Devanagari, Urdu, Gujrati, Telugu, Assamese, Gurmukhi, Kannada, and Bangla. The method can segment any kind of document with very good results. The identification results are excellent.
The paper addresses the automation of the task of an epigraphist in reading and deciphering inscriptions.
The automation steps include Pre-processing, Segmentation, Feature Extraction and Recognition. Preprocessing
involves, enhancement of degraded ancient document images which is achieved through Spatial
filtering methods, followed by binarization of the enhanced image. Segmentation is carried out using Drop
Fall and Water Reservoir approaches, to obtain sampled characters. Next Gabor and Zonal features are
extracted for the sampled characters, and stored as feature vectors for training. Artificial Neural Network
(ANN) is trained with these feature vectors and later used for classification of new test characters. Finally
the classified characters are mapped to characters of modern form. The system showed good results when
tested on the nearly 150 samples of ancient Kannada epigraphs from Ashoka and Hoysala periods. An
average Recognition accuracy of 80.2% for Ashoka period and 75.6% for Hoysala period is achieved.
An Empirical Study on Identification of Strokes and their Significance in Scr...IJMER
Â
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
International Journal of Modern Engineering Research (IJMER) covers all the fields of engineering and science: Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Computer Engineering, Agricultural Engineering, Aerospace Engineering, Thermodynamics, Structural Engineering, Control Engineering, Robotics, Mechatronics, Fluid Mechanics, Nanotechnology, Simulators, Web-based Learning, Remote Laboratories, Engineering Design Methods, Education Research, Students' Satisfaction and Motivation, Global Projects, and AssessmentâŠ. And many more.
AN APPORACH FOR SCRIPT IDENTIFICATION IN PRINTED TRILINGUAL DOCUMENTS USING T...ijaia
Â
In this work, we review the outcome of texture features for script classification. Rectangular White Space
analysis algorithm is used to analyze and identify heterogeneous layouts of document images. The texture
features, namely the color texture moments, Local binary pattern (LBP) and responses of Gabor, LM-filter,
S-filter, R-filter are extracted, and combinations of these are considered in the classification. In this work,
a probabilistic neural network and Nearest Neighbor are used for classification. To corrabate the
adequacy of the proposed strategy, an experiment was operated on our own data set. To study the effect of
classification accuracy, we vary the database sizes and the results show that the combination of multiple
features vastly improves the performance.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Survey of Various Methods for Text SummarizationIJERD Editor
Â
Document summarization means retrieved short and important text from the source document. In this paper, we studied various techniques. Plenty of techniques have been developed on English summarization and other Indian languages but very less efforts have been taken for Hindi language. Here, we discusses various techniques in which so many features are included such as time and memory consumption, efficiency, accuracy, ambiguity, redundancy.
Because of the rapid growth in technology breakthroughs, including
multimedia and cell phones, Telugu character recognition (TCR) has recently
become a popular study area. It is still necessary to construct automated and
intelligent online TCR models, even if many studies have focused on offline
TCR models. The Telugu character dataset construction and validation using
an Inception and ResNet-based model are presented. The collection of 645
letters in the dataset includes 18 Achus, 38 Hallus, 35 Othulu, 34Ă16
Guninthamulu, and 10 Ankelu. The proposed technique aims to efficiently
recognize and identify distinctive Telugu characters online. This model's main
pre-processing steps to achieve its goals include normalization, smoothing,
and interpolation. Improved recognition performance can be attained by using
stochastic gradient descent (SGD) to optimize the model's hyperparameters.
An Efficient Segmentation Technique for Machine Printed Devanagiri Script: Bo...iosrjce
Â
Segmentation technique plays a major role in scripting the documents for extraction of various
features. Many researchers are doing various research works in this field to make the segmenting process
simple as well as efficient. In this paper a simple segmentation technique for both the line and word
segmentation of a script document has been proposed. The main objective of this technique is to recognize the
spaces that separate two text lines.For the Word segmentation technique also similar procedure is followed. In
this work ,three different scanned document have been taken as input images for both line and word
segmentation techniques. The results found were outstanding with average accuracy for both line and word. It
provides 100% accuracy for line segmentation and 100% for line segmentation as well. Evaluation results show
that our method outperforms several competing methods.
Recognition of Words in Tamil Script Using Neural NetworkIJERA Editor
Â
In this paper, word recognition using neural network is proposed. Recognition process is started with the partitioning of document image into lines, words, and characters and then capturing the local features of segmented characters. After classifying the characters, the word image is transferred into unique code based on character code. This code ideally describes any form of word including word with mixed styles and different sizes. Sequence of character codes of the word form input pattern and word code is a target value of the pattern. Neural network is used to train the patterns of the words. Trained network is tested with word patterns and is recognized or unrecognized based on the network error value. Experiments have been conducted with a local database to evaluate the performance of the word recognizing system and obtained good accuracy. This method can be applied for any language word recognition system as the training is based on only unique code of the characters and words belonging to the language.
AN AUTHORSHIP IDENTIFICATION EMPIRICAL EVALUATION OF WRITING STYLE FEATURES I...CSIT8
Â
In this paper, an investigation was done to identify writing style features that can be used for cross-topic
and cross-genre documents in the Authorship Identification task from 2003 to 2015. Different writing style
features were empirically evaluated that were previously used in single topic and single genre documents
for Authorship Identification to determine whether they can be used effectively for cross-topic and crossgenre Authorship Identification using an ablation process. The dataset used was taken from the 2015 PAN
CLEF Forum English collection consisting of 100 sets. Furthermore, it was investigated whether
combining some of these feature sets can help improve the authorship identification task. Three different
classifiers were used: NaĂŻve Bayes, Support Vector Machine, and Random Forest. The results suggest that
a combination of a lexical, syntactical, structural, and content feature set can be used effectively for cross
topic and cross genre authorship identification, as it achieved an AUC result of 0.837.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
Â
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more âmechanicalâ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
Â
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Â
Are you looking to streamline your workflows and boost your projectsâ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, youâre in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part âEssentials of Automationâ series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Hereâs what youâll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
Weâll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Donât miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Â
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But thereâs more:
In a second workflow supporting the same use case, youâll see:
Your campaign sent to target colleagues for approval
If the âApproveâ button is clicked, a Jira/Zendesk ticket is created for the marketing design team
Butâif the âRejectâ button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Â
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Â
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Â
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Â
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Â
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overviewâ
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
Â
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Â
Clients donât know what they donât know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clientsâ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Â
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 17, Issue 2, Ver. V (Mar – Apr. 2015), PP 105-109
www.iosrjournals.org
DOI: 10.9790/0661-1725105109 www.iosrjournals.org 105 | Page
A survey on Script and Language identification for
Handwritten document images
Prasanthkumar P V¹, Midhun T P², Archana Kurian³
¹,²,³ (Computer Science & Engineering Department, Vimal Jyothi Engineering College, Kannur, Kerala, India)
Abstract: Offline script and language identification serves as a forerunner for a multilingual Optical
Character Recognizer (OCR). OCR is software used for digitizing handwritten or printed document
images. Most OCRs are designed to deal with a single script, so they cannot convert a document that
contains more than one script. For a country with a multilingual culture like India, a multilingual OCR is an
essential software requirement for automating the processing of handwritten documents. Script identification
consists of three steps: pre-processing, feature extraction and classification. Based on the features extracted
from the document images, the classifier discriminates the script of the documents. This survey gives an overview of
different script identification methods for handwritten document images. They are classified into different
categories based on the features and classification algorithms used.
Keywords: Handwritten documents, Optical Character Recognition, Script Identification
I. Introduction
Script and language identification is an interesting area of research in the domain of document
processing. Optical Character Recognition (OCR) is a process in which a paper document is optically
scanned and then converted into a computer-processable electronic format. Many OCR algorithms
developed over the years are script-specific, in the sense that they can read characters written in one
particular script only. It is therefore essential to identify the script and language of a document in a
multi-script environment before processing it with OCR. Both printed and handwritten documents serve as
a medium of communication as well as a medium for recording facts, for example in post offices, libraries,
railway stations and government offices, where documents written in multiple languages must be handled.
A script is defined as the graphic form of the writing system used to write statements expressible in
language [1]. A script may be used by only one language or may be shared by many languages, sometimes with
slight variations from one language to another. For example, Devanagari is used for writing a number of Indian
languages such as Sanskrit, Hindi, Konkani and Marathi. Script identification relies on the fact that each script has
a unique spatial distribution and visual attributes that make it possible to distinguish it from other
scripts [1][2]. So the basic task in script identification is to devise a technique that discovers these
features in a given document and then classifies the document's script accordingly. In general, the features
necessary for recognizing the characters of different scripts depend on the structural properties, style and nature of
writing, which generally differ from one script to another.
This review of script and language identification for handwritten document images gives an overview
of the different methodologies developed so far. Section II presents the challenges faced in research on
handwritten document images. Section III reviews the related works. The last section presents a
comparative analysis of the identified models as well as the inferences drawn from them.
II. Challenges
Script identification for handwritten document images is more challenging than for printed document
images [4]. These challenges make it impossible to directly apply algorithms designed for printed documents to
handwritten document images. There are three main challenges. First, some scripts resemble each other more
when handwritten than when printed. Second, handwriting styles are more diverse than printed fonts: cultural
differences, individual differences, and even differences in the way people write at different times enlarge the
inventory of possible character and word shapes seen in handwritten documents. Third, problems typically
addressed in pre-processing, such as ruling lines and character fragmentation due to low contrast, are common in
handwritten documents because of the variety of papers and writing instruments used. Moreover, text lines in
handwritten documents are curvilinear, and the gaps between neighbouring words and lines are far from uniform.
It is also difficult to extend existing methods to new languages, because they employ a combination of
handpicked and trainable features and a variety of decision rules.
III. Script Recognition Methodologies
D. Ghosh et al. [1] divide the classification methods into two broad categories based on the nature of
the approaches and features used: structure-based and visual-appearance-based methods. Script
identification methods based on the extraction and analysis of connected components fall under the category
of structure-based methods. Visual appearance methods are often related to texture: a block of text
belonging to each script class forms a distinct texture pattern. Each of these categories is further classified into
page-wise, paragraph-wise, textline-wise and word-wise methods on the basis of the level at which they are applied.
Script identification methods that use segment-wise analysis of character structure are also regarded as local
approaches. On the other hand, visual-appearance-based methods designed to identify the script by analyzing
the overall look of a text block may be regarded as global approaches.
3.1 Linear Discriminant based
Hochberg et al. [4] proposed an algorithm for script and language identification from handwritten
document images using statistical features based on connected component analysis. Documents are
characterized by five connected component features: relative vertical centroid, relative horizontal
centroid, number of holes, sphericity and aspect ratio of the connected components on a document page. For
each of the five features, three document summary statistics (mean, standard deviation
and skew) are calculated, creating a fifteen-element vector for each document. A separate Fisher Linear
Discriminant (FLD) is trained to separate each possible pair of scripts in the dataset. The classifier was tested
through writer-sensitive cross-validation and achieved an accuracy of 88% across six scripts: Arabic, Chinese,
Cyrillic, Devanagari, Japanese and Roman. The same method was used to discriminate between Roman-script
English and German documents with 85% accuracy.
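The pairwise FLD step can be sketched as follows. For readability the sketch uses two-dimensional toy vectors instead of the fifteen-element document vectors, and the data points are invented for illustration:

```python
def fisher_direction(class_a, class_b):
    """Fisher Linear Discriminant direction w = Sw^-1 (m_a - m_b) for two
    classes of 2-D feature vectors (2x2 matrix inverse done by hand)."""
    def mean(pts):
        n = len(pts)
        return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

    def scatter(pts, m):
        # Within-class scatter contribution: sum of (x - m)(x - m)^T
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    dm = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * dm[0] + inv[0][1] * dm[1],
            inv[1][0] * dm[0] + inv[1][1] * dm[1]]

# Invented 2-D stand-ins for the document vectors of two script classes.
script_a = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2)]
script_b = [(3.0, 0.5), (3.1, 0.7), (2.9, 0.4)]
w = fisher_direction(script_a, script_b)

def project(p):
    return w[0] * p[0] + w[1] * p[1]

# Decision threshold: midpoint between the projected class means.
mid = (sum(project(p) for p in script_a) / 3
       + sum(project(p) for p in script_b) / 3) / 2
```

A new document is projected onto `w` and assigned to whichever side of the midpoint its projection falls; one such discriminant is trained per script pair.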
3.2 K-Nearest Neighbor Based Techniques
Dhandra and Hangarge [5][7] used nearest neighbor and K-nearest neighbor (KNN) algorithms to
classify word images belonging to Kannada, Roman and Devanagari scripts. By decomposing the word image
in two directions at two levels using morphological transformations, seven global features are obtained, and
three further dominant local features are computed based on connected components. These features are passed
to a KNN classifier for script classification. The average and maximum recognition accuracies achieved are
96.05% and 99%, respectively. The algorithm is insensitive to writing style, ink, size, noise and character slant.
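The KNN classification step common to these schemes can be illustrated with a minimal sketch; the 2-D vectors and labels below are hypothetical stand-ins for the ten-dimensional feature vectors described above:

```python
def knn_classify(train, query, k=3):
    """Classify a feature vector by majority vote among its k nearest
    training samples under squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(vec, query)), label)
        for vec, label in train
    )
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical training set: (feature vector, script label) pairs.
train = [((0.1, 0.9), "Kannada"), ((0.2, 0.8), "Kannada"),
         ((0.9, 0.1), "Roman"), ((0.8, 0.2), "Roman"),
         ((0.5, 0.5), "Devanagari"), ((0.55, 0.45), "Devanagari")]
```

With k = 1 this reduces to the plain nearest neighbor rule also used in [5].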
More recently, a Gaussian Mixture Model (GMM) was introduced by Mallikarjun Hangarge [6] to identify the
script of handwritten words in Roman, Devanagari, Kannada and Telugu. The GMM is modeled using a set of six
novel features derived from the directional energy distributions of the underlying image. The standard deviations
of the directional energy distributions are computed by decomposing an image matrix into right and left
diagonals. Furthermore, the deviations of the horizontal and vertical energy distributions are also built into the
GMM. The model was tested at the bi-script, tri-script and multi-script levels and achieved script identification
accuracies of 98.7%, 98.16% and 96.91%, respectively.
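One such directional feature can be sketched as follows, assuming "energy along a diagonal" means the sum of squared pixel values on that diagonal; this interpretation is for illustration only and is not the authors' exact formulation:

```python
import math

def diagonal_energy_std(img):
    """Standard deviation of the energies along the right (/) diagonals of
    an image matrix; analogous features are built from the left diagonals
    and the horizontal and vertical distributions."""
    h, w = len(img), len(img[0])
    energies = [0.0] * (h + w - 1)
    for i in range(h):
        for j in range(w):
            energies[i + j] += img[i][j] ** 2  # pixels with equal i+j share a diagonal
    mean = sum(energies) / len(energies)
    return math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
```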
To identify eight major scripts, namely Latin, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam,
Tamil and Telugu, at the block level, Rajput and Anita [8] proposed a scheme based on features extracted using
the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT). A KNN classifier is then
employed for identification. They achieved an average accuracy rate of 96.4% for tri-script
document images.
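A naive 2-D DCT-II, whose low-frequency coefficients are typically kept as block-level texture features in such schemes, can be written as:

```python
import math

def dct2(block):
    """Naive O(n^4) 2-D DCT-II of a square block with orthonormal scaling.
    Low-frequency (small u, v) coefficients summarize the block's texture."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
            cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
            out[u][v] = cu * cv * s
    return out
```

For a constant block, all energy lands in the DC coefficient `out[0][0]`, which is one way to sanity-check the transform.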
Hiremath et al. [9] proposed an approach to script identification using texture features. The scripts
considered for the work were Bangla, Latin, Devanagari, Kannada, Malayalam, Tamil, Telugu and Urdu. The
texture features are extracted using co-occurrence histograms of wavelet-decomposed images; the
correlation between the sub-bands at the same resolution exhibits a strong relationship and is significant in
characterizing a texture. A KNN classifier is used for script identification. The average classification
accuracy achieved is 97.5% for a single-writer document with full text coverage, which decreases slightly with
increasing angle of orientation and decreases significantly with an increasing number of writers.
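A single level of the 2-D Haar decomposition that produces the sub-bands used above can be sketched as follows; normalization conventions vary between implementations, so the 1/4 scaling here is one common choice:

```python
def haar_level1(img):
    """One level of the 2-D Haar wavelet transform of an even-sized image,
    returning the LL (approximation), LH, HL and HH (detail) sub-bands."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 4.0  # approximation
            lh[i][j] = (a - b + c - d) / 4.0  # horizontal detail
            hl[i][j] = (a + b - c - d) / 4.0  # vertical detail
            hh[i][j] = (a - b - c + d) / 4.0  # diagonal detail
    return ll, lh, hl, hh
```

The co-occurrence histograms in [9] are then computed between sub-bands at the same resolution.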
3.3 Steerable Pyramid based
A method for Arabic and Latin text-block differentiation in both printed and handwritten scripts was
proposed by Benjelil et al. [10]. They designed an accurate system for script identification at the word level
based on the steerable pyramid transform. The Steerable Pyramid (SP) [11] is a linear multi-scale,
multi-orientation image decomposition that provides a useful front end for image processing
and computer vision applications. The SP can capture the variation of a texture in both intensity and orientation.
The overall handwritten Arabic and Latin identification rate obtained is about 97.5%.
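The steerability property the pyramid builds on can be shown with first-derivative filters: the response at any orientation is an exact linear combination of two basis responses. This is a minimal one-scale sketch, not the full multi-scale decomposition of [11]:

```python
import math

def basis_responses(img, i, j):
    """Responses of the two first-derivative basis filters (horizontal and
    vertical central differences) at pixel (i, j)."""
    r0 = (img[i][j + 1] - img[i][j - 1]) / 2.0   # d/dx (0 degrees)
    r90 = (img[i + 1][j] - img[i - 1][j]) / 2.0  # d/dy (90 degrees)
    return r0, r90

def steered(r0, r90, theta):
    # Steerability: the filter rotated by theta responds with exactly
    # cos(theta) * R0 + sin(theta) * R90 -- no new convolution needed.
    return math.cos(theta) * r0 + math.sin(theta) * r90

# Toy image whose intensity grows left to right (a purely horizontal gradient).
ramp = [[x for x in range(5)] for _ in range(5)]
r0, r90 = basis_responses(ramp, 2, 2)
```

For this ramp the steered response peaks at 0 degrees and vanishes at 90 degrees, which is how orientation energy per pixel is read off.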
3.4 Gabor function-based
Although Gabor function-based script recognition schemes have shown good performance, their application has
largely been limited to machine-printed documents. Variations in writing style, character size, and inter-line and
inter-word spacing make the recognition process difficult and unreliable when these techniques are applied
directly to handwritten documents. It is therefore necessary to pre-process the document images prior to the
application of the Gabor filter so as to compensate for these variations. This has been addressed in
the texture-based script identification scheme proposed in [12]. In the pre-processing stage, the algorithm
applies denoising, thinning, pruning, m-connectivity and text-size normalization in sequence. Texture features
are then extracted using a multichannel Gabor filter. Finally, the different scripts are classified using fuzzy
classification. The proposed system achieves an overall accuracy of 91.6% in classifying handwritten
documents written in four different scripts, namely Latin, Devanagari, Bengali and Telugu.
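A single channel of such a Gabor filter bank, a Gaussian envelope times a cosine carrier, can be sketched as follows; the sigma and wavelength values are illustrative, and a real bank combines several orientations and scales:

```python
import math

def gabor(x, y, theta, sigma=2.0, wavelength=4.0):
    """Value of a 2-D (even-symmetric) Gabor filter at (x, y), oriented
    at angle theta: an isotropic Gaussian times a cosine carrier."""
    xr = x * math.cos(theta) + y * math.sin(theta)   # rotate coordinates
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = math.cos(2 * math.pi * xr / wavelength)
    return envelope * carrier

# A 7x7 kernel for one orientation of the filter bank (indexed [y][x]).
kernel = [[gabor(x - 3, y - 3, theta=0.0) for x in range(7)] for y in range(7)]
```

Convolving the pre-processed text block with each kernel and pooling the response energies yields the texture feature vector.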
3.5 Neural Network Based Techniques
Fractal-based features, busy-zone-based features and topological features, along with an ANN
classifier, were used for word-wise identification of Bangla, English and Devanagari scripts by Roy and Majumder
[13]. A Multi-Layer Perceptron (MLP) classifier for script separation was trained with 8 different word-level
features. Two equal-sized datasets, one with Bangla and Roman scripts and the other with Devanagari and Roman
scripts, were prepared for system evaluation, and accuracies of 99.29% and 98.43% were achieved.
Roy and Das [14] proposed a scheme for identifying scripts written in any of six official scripts of
India, namely Bangla, Devanagari, Malayalam, Urdu, Oriya and Roman, using fractal-dimension-based features,
component-based features and topological features; a Neural Network (NN) classifier is used for script
identification. The scheme is independent of text size, and no normalization is needed. The overall accuracy of
the developed system was 89.48% on the test set without rejection. S. M. Obaidullah et al. [15] have proposed a
work on six official languages of India. They used simple and efficient document-level features categorized as
abstract/mathematical features, structure-based features and script-dependent features. A series of classifiers is
used for classification. The overall accuracy of the proposed system is at present 92.8% on the test set without
rejection.
3.6 Support Vector Based Techniques
To identify the script of handwritten postal codes, Basu et al. [16] grouped similar-shaped digit patterns
of Bangla, Urdu, Latin, and Devanagari into 25 clusters. A script-independent unified support vector machine
(SVM) based pattern classifier is then designed to classify the digits of the numeric postal codes into these 25 clusters.
Based on these classification decisions, a rule-based script inference engine is designed to infer the script of
the numeric postal code.
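The rule-based second stage of such a two-step design can be sketched as a vote over the scripts compatible with each digit's shape cluster. The cluster-to-script table below is purely hypothetical (the surveyed work uses 25 clusters learned from Bangla, Urdu, Latin and Devanagari digit shapes); only the voting logic is being illustrated.

```python
from collections import Counter

# Hypothetical cluster-to-script table: each shape cluster lists the scripts
# whose digits can take that shape.  Illustrative only.
CLUSTER_SCRIPTS = {
    0: {"Latin"},                    # shape unique to Latin digits
    1: {"Bangla", "Devanagari"},     # shape shared by two scripts
    2: {"Devanagari"},
    3: {"Urdu"},
    4: {"Latin", "Urdu"},
}

def infer_script(cluster_ids):
    """Vote over the candidate scripts of each digit's cluster and return
    the script(s) consistent with the most digits of the postal code."""
    votes = Counter()
    for c in cluster_ids:
        for script in CLUSTER_SCRIPTS.get(c, set()):
            votes[script] += 1
    if not votes:
        return None
    best = max(votes.values())
    return sorted(s for s, v in votes.items() if v == best)
```

For example, a postal code whose digits fall in clusters 2, 1, 2 is most consistent with Devanagari, since cluster 1 alone cannot decide between Bangla and Devanagari.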
IV. Comparative Analysis
Most of the available works on recognition of Indian scripts are based on small databases collected in
laboratory environments. Since the experiments were conducted independently using different data-sets, they
do not reflect the comparative performance of these methods. Table 1 below summarizes some of
the benchmark work in script recognition for handwritten document images. The various script features and classifiers
used by different researchers are also listed in the table.
4.1 Better Pre-Processing for Higher Accuracy
The pre-processing steps generally comprise binarization, gray-level normalization, foreground and
background noise removal, size normalization, removal of irrelevant information, skew and slant correction,
etc. Handwritten character samples are usually collected from individuals belonging to different age groups,
with different writing styles, professions, states of mind, writing media, etc. Pre-processing of the
character image is therefore an important step before the feature extraction and classification steps. Better pre-processing
leads to higher performance.
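As an example of the binarization step, the sketch below implements Otsu's classic global thresholding in plain numpy on a synthetic grey-level page; real pipelines would add noise removal, size normalization and skew/slant correction on top of this.

```python
import numpy as np

def otsu_threshold(gray):
    """Global binarization threshold by Otsu's method: pick the grey level
    that maximizes the between-class variance of the histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # cumulative class probability
    mu = np.cumsum(prob * np.arange(256))      # cumulative class mean mass
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b[~np.isfinite(sigma_b)] = 0.0       # ends of the histogram
    return int(np.argmax(sigma_b))

# Synthetic grey-level page: light background with a dark "ink" region.
rng = np.random.default_rng(1)
page = rng.normal(200, 10, (64, 64))
page[20:40, 10:50] = rng.normal(50, 10, (20, 40))
page = np.clip(page, 0, 255).astype(np.uint8)

t = otsu_threshold(page)
binary = page < t        # True = foreground ink
```

The threshold lands between the ink and background modes, separating foreground strokes from the page before any structural or texture features are extracted.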
4.2 Structure-based Features
Structure-based features like character height distribution, character bounding-box profiles, horizontal
projections and several other statistical features do not depend on document quality and resolution, but on the
overall size of the connected components [1]. However, these features are not invariant to character size and
font, and offer high performance only in separating distinctly different oriental scripts from others. Several
structural features that directly relate to character shape, such as character geometry, occurrence of certain
stroke structures and structural primitives, stroke orientations, measures of cavity regions, side profiles, etc.,
have also been used for script characterization. One disadvantage of structure-based methods is that
they require complex pre-processing involving connected component extraction. Also, extraction of structural
features is highly susceptible to noise and poor-quality document images. The presence of noise or significant image
degradation adversely affects the location and segmentation of these features, making them difficult or
sometimes impossible to extract. Scripts having similar character shapes may instead be distinguished by their visual
appearance.
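A minimal example of such structure-based features, namely projection profiles and the foreground bounding box of a binarized text block, can be computed as follows; this is a generic sketch, not any single surveyed feature set.

```python
import numpy as np

def projection_features(binary):
    """Structure-based descriptors of a binary text block: horizontal and
    vertical projection profiles plus the foreground bounding box."""
    h_proj = binary.sum(axis=1)          # ink pixels per row
    v_proj = binary.sum(axis=0)          # ink pixels per column
    rows = np.flatnonzero(h_proj)
    cols = np.flatnonzero(v_proj)
    bbox = (rows[0], rows[-1], cols[0], cols[-1]) if rows.size else None
    return h_proj, v_proj, bbox

# Toy "text line": a horizontal bar of ink in an otherwise empty block.
img = np.zeros((20, 40), dtype=bool)
img[8:12, 5:35] = True
h_proj, v_proj, bbox = projection_features(img)
```

Because these statistics count raw foreground pixels, any noise specks or broken strokes feed directly into the profiles, which is exactly the fragility described above.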
4.3 Gabor Filter Based Features
The Gabor filter offers a powerful tool to extract visual attributes from a document, which has motivated
many researchers to employ it for script determination. Since a texture feature captures the general
appearance of a script, it can be derived from any script class of any nature; accordingly, this feature may be
considered a universal one. The discriminating power of a multichannel Gabor filter can be varied by adding
more channels with different radial frequencies and closely spaced orientation angles. Thus, this system is
flexible compared to the other methods and can be used effectively to discriminate scripts that are quite close
in appearance. The main criticism of this approach is that it cannot be applied with confidence to small text
regions, as in word-wise script recognition. Also, Gabor filters are not capable of handling variations in script
size and font, inter-line spacing, etc. [3].
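As a rough sketch of the multichannel idea, the numpy code below builds a small bank of even-symmetric Gabor kernels over two radial frequencies and four orientations and uses the mean absolute filter response of each channel as a texture feature. The kernel size, frequencies and test pattern are illustrative choices, not taken from any surveyed system.

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=3.0, size=15):
    """Even-symmetric (real) Gabor kernel: an isotropic Gaussian envelope
    modulating a cosine carrier of radial frequency `freq` (cycles/pixel)
    at orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) * np.cos(2 * np.pi * freq * xr)

def gabor_energies(img, freqs=(0.1, 0.2), n_orient=4):
    """Texture descriptor: mean absolute response of every (frequency,
    orientation) channel, computed by circular FFT convolution."""
    feats = []
    F = np.fft.fft2(img)
    for f in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(f, k * np.pi / n_orient)
            pad = np.zeros_like(img, dtype=float)
            pad[:kern.shape[0], :kern.shape[1]] = kern
            resp = np.real(np.fft.ifft2(F * np.fft.fft2(pad)))
            feats.append(np.abs(resp).mean())
    return np.array(feats)

# Vertical stripes (variation along x) excite the 0-degree channels far more
# than the 90-degree ones, mimicking a script dominated by vertical strokes.
img = np.zeros((64, 64))
img[:, ::8] = 1.0
feats = gabor_energies(img)
```

Adding more frequencies and orientations to `freqs` and `n_orient` is exactly the channel-tuning flexibility described above.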
4.4 Data collection
One major concern with most of the reported works in script recognition is the lack of any comparative
analysis of the results. The experimental results given for each proposed method have not been compared with
other benchmark works in the field. Moreover, the datasets used in the experiments are all different, mainly
due to the lack of a standard database for script recognition research. Consequently, it is hard to
assess the results reported in the literature. Hence, a standard evaluation test-bed containing documents written
in only one script type, as well as multi-script documents with a mix of different scripts within a document, is
necessary. One important consideration in selecting the data-set for a script class is that it should reflect the
global probability of occurrence of the characters in texts written in that particular script. Another problem of
concern is languages that constantly undergo spelling modifications over the years.
V. Conclusion
This paper presents a comprehensive survey on script and language identification for offline
handwritten document images. The approaches proposed are classified into two categories: structure-based
and visual-appearance-based. Most of the works are on Devanagari and Bangla scripts. To the best of our
knowledge, no works have been reported on identifying south Indian scripts and languages using support vector
machines down to the word level.
Table 1: Script Identification Methods

| Research/Authors | Features Selected | Classifier Used | Scripts Classified | Result Obtained |
|---|---|---|---|---|
| Hochberg et al | Relative Y centroid, relative X centroid, number of white holes, sphericity, and aspect ratio | Linear Discriminant Analysis | Arabic, Chinese, Cyrillic, Devanagari, Japanese, Latin | Accuracy of 88% |
| Hangarge and Dhandra | Vertical, horizontal, right-diagonal and left-diagonal stroke densities | KNN classifier | English, Devanagari, Urdu | Accuracy of 97.83% (English), 93.00% (Devanagari), 95.78% (Urdu) |
| Rajput and Anita | Discrete cosine transform and discrete wavelet transform | KNN classifier | Latin, Devanagari, Gujarati, Gurumukhi, Kannada, Malayalam, Tamil, and Telugu | Accuracy of 98% (KEH), 99.2% (MEH), 93% (PEH), 99.2% (TEH), 90% (GEH), 99% (TeEH) |
| Hiremath et al | Texture features extracted using DWT | KNN classifier | Bangla, Latin, Devanagari, Kannada, Malayalam, Tamil, Telugu, and Urdu | Accuracy of 97.5% |
| Roy and Majumder | Fractal-based, busy-zone-based, and topological features | ANN classifier | Bangla, English, and Devanagari | Accuracy of 99.29% (BR), 98.43% (DR) |
| Roy and Das | Fractal-dimension-based, component-based, and topological features | Neural Network classifier | Bangla, Devanagari, Urdu, Malayalam, Oriya, and Roman | Accuracy of 89.48% |
| Mohamed Benjelil et al | Mean, standard deviation, kurtosis, homogeneity, energy, correlation | Steerable-pyramid based | Arabic and Latin | Accuracy of 97.5% |
| Vivek Singhal et al | Gabor filter-based texture features | Fuzzy classifier | Devanagari, Bengali, Telugu, Latin | Accuracy of 91.6% |
| Sk. Md. Obaidullah et al | Structure-based, mathematical, and script-dependent features | MLP classifier | Bangla, Devanagari, Urdu, Oriya, Malayalam, and Roman | Accuracy of 92.8% |
References
[1]. Debashis Ghosh, Tulika Dube, and Adamane P. Shivaprasad, "Script recognition: a review", IEEE Transactions on Pattern Analysis
and Machine Intelligence, 32 (2010).
[2]. Umapada Pal, Ramachandran Jayadevan, and Nabin Sharma, "Handwriting recognition in Indian regional scripts: a survey of
offline techniques", ACM, March 2012.
[3]. D. Ghosh and A. P. Shivaprasad, "Handwritten script identification using probabilistic approach for cluster analysis", Journal of
the Indian Institute of Science, 80 (2013).
[4]. Judith Hochberg, Kevin Bowers, Michael Cannon, and Patrick Kelly, "Script and language identification for handwritten document
images", International Journal on Document Analysis and Recognition, 2 (1999).
[5]. B. V. Dhandra and Mallikarjun Hangarge, "Global and local features based handwritten text words and numerals script identification",
International Conference on Computational Intelligence and Multimedia Applications, vol. 2, IEEE, 2007.
[6]. Mallikarjun Hangarge, "Gaussian mixture model for handwritten script identification", arXiv preprint arXiv:1303.2751 (2013).
[7]. Mallikarjun Hangarge and B. V. Dhandra, "Offline handwritten script identification in document images", International Journal of
Computer Applications, 4 (2010).
[8]. G. G. Rajput and H. B. Anita, "Handwritten script recognition using DCT and wavelet features at block level", IJCA Special Issue on
Recent Trends in Image Processing and Pattern Recognition, RTIPPR (2010).
[9]. P. S. Hiremath, S. Shivashankar, Jagdeesh D. Pujari, and V. Mouneswara, "Script identification in a handwritten document image using
texture features", 2nd IEEE International Advance Computing Conference (IACC), IEEE, 2010.
[10]. Mohamed Benjelil, Slim Kanoun, Rémy Mullot, and Adel M. Alimi, "Arabic and Latin script identification in printed and handwritten
types based on steerable pyramid features", 10th International Conference on Document Analysis and Recognition (ICDAR'09),
IEEE, 2009.
[11]. Eero P. Simoncelli and William T. Freeman, "The steerable pyramid: a flexible architecture for multi-scale derivative computation",
International Conference on Image Processing, vol. 3, IEEE, 1995.
[12]. Vivek Singhal, Nishant Navin, and D. Ghosh, "Script-based classification of hand-written text documents in a multilingual
environment", 13th International Workshop on Research Issues in Data Engineering: Multi-lingual Information Management
(RIDE-MLIM 2003), IEEE, 2003.
[13]. Kaushik Roy and Kinshuk Majumder, "Trilingual script separation of handwritten postal document", Sixth Indian Conference on
Computer Vision, Graphics & Image Processing (ICVGIP'08), IEEE, 2008.
[14]. K. Roy, S. Kundu Das, and Sk. Md. Obaidullah, "Script identification from handwritten document", Third National Conference on
Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), IEEE, 2011.
[15]. Sk. Md. Obaidullah, Supratik Kundu Das, and Kaushik Roy, "A system for handwritten script identification from Indian document",
Journal of Pattern Recognition Research, 8 (2013).
[16]. Subhadip Basu, Nibaran Das, Ram Sarkar, Mahantapas Kundu, Mita Nasipuri, and Dipak Kumar Basu, "A novel framework for
automatic sorting of postal documents with multi-script address blocks", Pattern Recognition, 43 (2010).
[17]. Yangdong Gao, Xiaoqing Ding, and Changsong Liu, "A multiscale text line segmentation method in freestyle handwritten
documents", International Conference on Document Analysis and Recognition (ICDAR), IEEE, 2011.
[18]. Anguelos Nicolaou and Basilios Gatos, "Handwritten text line segmentation by shredding text into its lines", 10th International
Conference on Document Analysis and Recognition (ICDAR'09), IEEE, 2009.