SlideShare a Scribd company logo
The International Journal Of Engineering And Science (IJES)
|| Volume || 3 || Issue || 5 || Pages || 13-16 || 2014 ||
ISSN (e): 2319 – 1813 ISSN (p): 2319 – 1805
www.theijes.com The IJES Page 13
An Advanced Optimistic Approach for String Transformation
1,
Amith Kumar V Pujari, 2,
Prof. H R Shashidhar
1,
M.Tech Scholar
RNS Institute of Technology Bangalore, Karnataka, India
2,
Asst. Prof. Dept of CSE
RNS Institute of Technology Bangalore, Karnataka, India
-------------------------------------------------------------ABSTRACT---------------------------------------------------
String transformation can be formalized as the part of coding bio-informatics, information retrieval, and in data
mining. However, in part of our paper we focus generating the k most likely output strings corresponding to the
input string. This concept proposes a novel and probabilistic approach to string transformation, which is both
accurate and efficient. The approach mainly focuses on three methods dynamic programming, modeling and
pruning. In the dynamic programming we use the concept of the edit distance which is dynamic programming
tool that estimates the score to each transformation so that the probability in the linear model can be estimated
well. In the linear model, a modeling method for training the model, and an algorithm for generating the top k
candidates, whether there is or is not a predefined dictionary. The linear model is defined as a conditional
probability distribution of an output string and a rule set for the transformation conditioned on an input string.
The learning method employs maximum likelihood estimation for parameter estimation. The string generation
algorithm based on pruning is guaranteed to generate the optimal top k candidates. The proposed method is
applied to correction of spelling errors in queries as well as reformulation of queries on dataset. Experimental
results on large scale data show that the proposed approach is good and efficient improving upon existing
methods in terms of accuracy and efficiency in different settings
KEYWORDS—String Transformation, Edit distance algorithm , Linear Model, Aho-Corasick trees, Spelling
Error Correction, Query Reformulation.
---------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: 19 May 2014 Date of Publication: 30 May 2014
---------------------------------------------------------------------------------------------------------------------------------------
I. INTRODUCTION
The string transformation is an essential problem in many applications. In natural language processing,
pronunciation generation, spelling error correction, word transliteration, and word stemming can all be
formalized as string transformation. String transformation can also be used in query reformulation and query
suggestion in search. In data mining, string transformation can be employed in the mining of synonyms and
database record matching.
As many of the above are online applications, the transformation must be conducted not only
accurately but also efficiently. String transformation can be defined in the following way. Given an input string
and a set of operators, one can transform the input string to the k most likely output strings by applying a
number of operators. Here the strings can be strings of words, characters, or any type of tokens. Each operator is
a transformation rule that defines the replacement of a substring with another substring.
The likelihood of transformation can represent similarity, relevance, and association between two
strings in a specific application. Although certain progress has been made, further investigation of the task is
still necessary, particularly from the viewpoint of enhancing both accuracy and efficiency, which is precisely the
goal of this work.
String transformation can be conducted at two different settings, depending on whether or not a
Dictionary is used. When a dictionary is used, the output strings must exist in the given dictionary, while the
size of the dictionary can be very large. Without loss of generality, one can specifically studies correction of
spelling errors in queries as well as reformulation of queries in web search in this Concept.
An Advanced Optimistic Approach for String Transformation
www.theijes.com The IJES Page 14
In the first task, a string consists of characters. In the second task, a string is comprised of words. The
former needs to exploit a dictionary while the latter does not. Correcting spelling errors in queries usually
consists of two steps: candidate generation and candidate selection. Candidate generation is used to find the
most likely corrections of a misspelled word from the dictionary. In such a case, a string of characters is input
and the operators represent insertion, deletion, and substitution of characters with or without surrounding
characters, for example, “a”!“e” and “lly”!“ly”.
Obviously candidate generation is an example of string transformation. Note that candidate generation
is concerned with a single word; after candidate generation, the words in the context (i.e., in the query) can be
further leveraged to make the final candidate selection. Query reformulation in search is aimed at dealing with
the term mismatch problem. For example, if the query is “NY Times” and the document only contains “New
York Times”, then the query and document do not match well and the document will not be ranked high.
Query reformulation attempts to transform “NY Times” to “New York Times” and thus make a better
matching between the query and document. In the task, given a query (a string of words), one needs to generate
all similar queries from the original query (strings of words).The operators are transformations between words
in queries such as “tx”!“texas” and meaning of ” ! “ definition of.
II. RELATED WORK
Arasu et al. proposed a method which can learn a set of transformation rules that explain most of the
given examples. Increasing the coverage of the rule set was the primary focus. Tejada et al. proposed an active
learning method that can estimate the weights of transformation rules with limited user input. The types of the
transformation rules are predefined such as stemming, prefix, suffix and acronym. Okazaki et al. incorporated
rules into an L1-regularized logistic regression model and utilized the model for string transformation.
Later, Brill and Moore developed a generative model including contextual substitution rules.
Toutanova and Moore further improved the model by adding pronunciation factors into the model. Duan and
Hsu also proposed a generative approach to spelling correction using a noisy channel model. They also
considered efficiently generating candidates by using a tree. However these works on string transformation can
be categorized into two groups.
Some work mainly considered efficient generation of strings. Other work tried to learn the model with
different approaches. However, efficiency is not an important factor taken into consideration in these methods.
The existing work is not focusing on enhancement of both accuracy and efficiency of string transformation.
III. PROPOSED WORK
Input String
Training
(conditional
Probability)
Expanding the
Rules
Generating the
Rules
Edit Distance
Algorithm
Calculate the
Weights by Log
Linear Model
Implement the
Aho-corasick tree
Obtain the relevant
string
Figure 1 : Proposed Architecture
This paper addresses string transformation, which is an essential problem, in many applications. In
natural language processing, pronunciation generation, spelling error correction, word transliteration, and word
stemming can all be formalized as string transformation. String transformation can also be used in query
reformulation and query suggestion in search. String transformation can be defined in the following way.
An Advanced Optimistic Approach for String Transformation
www.theijes.com The IJES Page 15
Given an input string and a set of operators, we are able to transform the input string to the k most
likely output strings by applying a number of operators. Here the strings can be strings of words, characters, or
any type of tokens. Each operator is a transformation rule that defines the replacement of a substring with
another substring.
Initially we align the string by using the number of iterations obtained by using edit distance algorithm
followed by the generation of the rules and expanding its rules without the loss of generality. Then a conditional
probability of identifying the maximum likelihood of the strings generated for the respective strings are obtained
and their weights are calculated by the linear method. Later in the generation phase, aho-corasick tree is
generated arranging the rules and weights.
The linear model is defined as a conditional probability distribution of an output string and a rule set
for the transformation given an input string. The learning method is based on maximum likelihood estimation.
Thus, the model is trained toward the objective of generating strings with the largest likelihood given input
strings.
The generation algorithm efficiently performs the top k candidate’s generation using top k pruning. It is
guaranteed to find the best k candidates without enumerating all the possibilities. An Aho-Corasick tree is
employed to index transformation rules in the model. When a dictionary is used in the transformation, a tree is
used to efficiently retrieve the strings in the dictionary.
The likelihood of transformation can represent similarity, relevance, and association between two
strings in a specific application. Although certain progress has been made, further investigation of the task is
still necessary, particularly from the viewpoint of enhancing both accuracy and efficiency, which is precisely the
goal of this work.
IV. RESULTS AND DISCUSSIONS
Figure 2: Accuracy Comparison of our model with a constant dataset
The above graph shows that there is significant improvement in the accuracy when the concept was
implemented on a constant dataset and hence proves that the concept is very much applicable for a real-time
usage. The date-set size that I have used consists of two thousand words. The above results are based on the
same data-set.
An Advanced Optimistic Approach for String Transformation
www.theijes.com The IJES Page 16
V. CONCLUSION
This paper addresses string transformation, which is an essential problem, in many applications. In
natural language processing, spelling error correction, word transliteration, and word stemming can all be
formalized as string transformation. Our concept in general poses an additional concept of the abbreviation
expansion thus covering all the functional and non-functional requirements. This concept has been implemented
using a friendly interface on a pre-defined dataset and thus results obtained are as expected.
VI. ACKNOWLEDGEMENT
I take this Opportunity to express my profound gratitude and deep regards to my guide Prof. H R
Shashidhar , Assistant Professor , RNS Institute of technology, Bangalore, for his exempla nary guidance, and
constant encouragement throughout.
REFERENCES
[1]. Ziqi Wang, Gu Xu, Hang Li, and Ming Zhang, “A Probabilistic Approach to String Transformation “ , ieee transactions on
knowledge and data engineering vol:pp no:99 year 2013
[2]. M. Li, Y. Zhang, M. Zhu, and M. Zhou, “Exploring distributional similarity based models for query spelling correction,” in
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for
Computational Linguistics, ser. ACL ’06. Morristown, NJ, USA: Association for Computational Linguistics, 2006, pp. 1025–
1032.
[3]. R. Golding and D. Roth, “A window-based approach to context-sensitive spelling correction,” Mach. Learn., vol. 34, pp. 107–130,
February 1999.
[4]. J. Guo, G. Xu, H. Li, and X. Cheng, “A unified and discriminative model for query refinement,” in Proceedings of the 31st annual
international ACM SIGIR conference on Research and development in information retrieval, ser. SIGIR ’08. New York, NY,
USA: ACM, 2008, pp. 379–386.
[5]. Behm, S. Ji, C. Li, and J. Lu, “Space-constrained gram-based indexing for efficient approximate string search,” in Proceedings of
the 2009 IEEE International Conference on Data Engineering, ser. ICDE ’09. Washington, DC, USA: IEEE Computer Society,
2009, pp. 604–615.
[6]. E. Brill and R. C. Moore, “An improved error model for noisy channel spelling correction,” in Proceedings of the 38th Annual
Meeting on Association for Computational Linguistics, ser. ACL ’00. Morristown, NJ, USA: Association for Computational
Linguistics, 2000, pp. 286–293.
[7]. N. Okazaki, Y. Tsuruoka, S. Ananiadou, and J. Tsujii, “A discriminative candidate generator for string transformations,” in
Proceedings of the Conference on Empirical Methods in Natural Language Processing, ser. EMNLP ’08. Morristown, NJ, USA:
Association for Computational Linguistics, 2008, pp. 447–456.
[8]. M. Dreyer, J. R. Smith, and J. Eisner, “Latent-variable modeling of string transductions with finite-state methods,” in Proceedings
of the Conference on Empirical Methods in Natural Language Processing, ser. EMNLP ’08. Stroudsburg, PA, USA: Association
for Computational Linguistics, 2008, pp. 1080–1089.
[9]. Arasu, S. Chaudhuri, and R. Kaushik, “Learning string transformations from examples,” Proc. VLDB Endow., vol. 2, pp. 514–
525, August 2009.S. Tejada, C. A. Knoblock, and S. Minton, “Learning domainindependent string transformation weights for high
accuracy object identification,” in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and
data mining, ser. KDD ’02. New York, NY, USA: ACM, 2002, pp. 350–359.
[10]. A. V. Aho and M. J. Corasick, “Efficient string matching: an aid to bibliographic search,” Commun. ACM, vol. 18, pp. 333–340,
June 1975.
[11]. J. Xu and G. Xu, “Learning similarity function for rare queries,” in Proceedings of the fourth ACM international conference on
Web search and data mining, ser. WSDM ’11. New York, NY, USA: ACM, 2011, pp. 615–624

More Related Content

What's hot

158822 article text-413072-1-10-20170718
158822 article text-413072-1-10-20170718158822 article text-413072-1-10-20170718
158822 article text-413072-1-10-20170718
GetnetAschale1
 
3108 3389-1-pb
3108 3389-1-pb3108 3389-1-pb
3108 3389-1-pb
GetnetAschale1
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine Learning
IRJET Journal
 
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
ijcseit
 
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
Yafi Azhari
 
A novel approach towards developing a statistical dependent and rank
A novel approach towards developing a statistical dependent and rankA novel approach towards developing a statistical dependent and rank
A novel approach towards developing a statistical dependent and rankIAEME Publication
 
IRJET-Semantic Based Document Clustering Using Lexical Chains
IRJET-Semantic Based Document Clustering Using Lexical ChainsIRJET-Semantic Based Document Clustering Using Lexical Chains
IRJET-Semantic Based Document Clustering Using Lexical Chains
IRJET Journal
 
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACHTEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
IJDKP
 
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURESGENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
ijnlc
 
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATIONSEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
ijnlc
 
Cp1 v2-018
Cp1 v2-018Cp1 v2-018
Cp1 v2-018
GetnetAschale1
 
A novel method for arabic multi word term extraction
A novel method for arabic multi word term extractionA novel method for arabic multi word term extraction
A novel method for arabic multi word term extraction
ijdms
 
A885115 030
A885115 030A885115 030
A885115 030
GetnetAschale1
 
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATAEFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
csandit
 
593176
593176593176
Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.
IRJET Journal
 
Semantic extraction of arabic
Semantic extraction of arabicSemantic extraction of arabic
Semantic extraction of arabic
csandit
 
An efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-documentAn efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-document
SaleihGero
 
IRJET- A Survey on MSER Based Scene Text Detection
IRJET-  	  A Survey on MSER Based Scene Text DetectionIRJET-  	  A Survey on MSER Based Scene Text Detection
IRJET- A Survey on MSER Based Scene Text Detection
IRJET Journal
 

What's hot (19)

158822 article text-413072-1-10-20170718
158822 article text-413072-1-10-20170718158822 article text-413072-1-10-20170718
158822 article text-413072-1-10-20170718
 
3108 3389-1-pb
3108 3389-1-pb3108 3389-1-pb
3108 3389-1-pb
 
Text Segmentation for Online Subjective Examination using Machine Learning
Text Segmentation for Online Subjective Examination using Machine   LearningText Segmentation for Online Subjective Examination using Machine   Learning
Text Segmentation for Online Subjective Examination using Machine Learning
 
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
AUTOMATED INFORMATION RETRIEVAL MODEL USING FP GROWTH BASED FUZZY PARTICLE SW...
 
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
Measuring word alignment_quality_for_statistical_machine_translation_tcm17-29663
 
A novel approach towards developing a statistical dependent and rank
A novel approach towards developing a statistical dependent and rankA novel approach towards developing a statistical dependent and rank
A novel approach towards developing a statistical dependent and rank
 
IRJET-Semantic Based Document Clustering Using Lexical Chains
IRJET-Semantic Based Document Clustering Using Lexical ChainsIRJET-Semantic Based Document Clustering Using Lexical Chains
IRJET-Semantic Based Document Clustering Using Lexical Chains
 
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACHTEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
TEXT CLUSTERING USING INCREMENTAL FREQUENT PATTERN MINING APPROACH
 
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURESGENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
GENERATING SUMMARIES USING SENTENCE COMPRESSION AND STATISTICAL MEASURES
 
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATIONSEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
SEMI-AUTOMATIC SIMULTANEOUS INTERPRETING QUALITY EVALUATION
 
Cp1 v2-018
Cp1 v2-018Cp1 v2-018
Cp1 v2-018
 
A novel method for arabic multi word term extraction
A novel method for arabic multi word term extractionA novel method for arabic multi word term extraction
A novel method for arabic multi word term extraction
 
A885115 030
A885115 030A885115 030
A885115 030
 
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATAEFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
EFFICIENTLY PROCESSING OF TOP-K TYPICALITY QUERY FOR STRUCTURED DATA
 
593176
593176593176
593176
 
Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.Text Extraction from Image Using GAMMA Correction Method.
Text Extraction from Image Using GAMMA Correction Method.
 
Semantic extraction of arabic
Semantic extraction of arabicSemantic extraction of arabic
Semantic extraction of arabic
 
An efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-documentAn efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-document
 
IRJET- A Survey on MSER Based Scene Text Detection
IRJET-  	  A Survey on MSER Based Scene Text DetectionIRJET-  	  A Survey on MSER Based Scene Text Detection
IRJET- A Survey on MSER Based Scene Text Detection
 

Viewers also liked

D0371016018
D0371016018D0371016018
D0371016018
theijes
 
I0366059067
I0366059067I0366059067
I0366059067
theijes
 
Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
	Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...	Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
theijes
 
D0381022027
D0381022027D0381022027
D0381022027
theijes
 
A0350300108
A0350300108A0350300108
A0350300108
theijes
 
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
theijes
 
D03503019024
D03503019024D03503019024
D03503019024
theijes
 
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
theijes
 
B037208020
B037208020B037208020
B037208020
theijes
 
G0366046050
G0366046050G0366046050
G0366046050
theijes
 
K03502056061
K03502056061K03502056061
K03502056061
theijes
 
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
theijes
 
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
theijes
 
F03504037046
F03504037046F03504037046
F03504037046
theijes
 
Advanced Embedded Automatic Car Parking System
	Advanced Embedded Automatic Car Parking System	Advanced Embedded Automatic Car Parking System
Advanced Embedded Automatic Car Parking System
theijes
 
H03502040043
H03502040043H03502040043
H03502040043
theijes
 
F0371025033
F0371025033F0371025033
F0371025033
theijes
 
Bail Decision Support System
Bail Decision Support SystemBail Decision Support System
Bail Decision Support System
theijes
 
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
theijes
 

Viewers also liked (19)

D0371016018
D0371016018D0371016018
D0371016018
 
I0366059067
I0366059067I0366059067
I0366059067
 
Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
	Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...	Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
Performance Analysis of Dispersion Compensation in Long Haul Optical Fiber u...
 
D0381022027
D0381022027D0381022027
D0381022027
 
A0350300108
A0350300108A0350300108
A0350300108
 
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
Comparative Analysis of low area and low power D Flip-Flop for Different Logi...
 
D03503019024
D03503019024D03503019024
D03503019024
 
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
An Experimental Study of Low Velocity Impact (Lvi) On Fibre Glass Reinforced ...
 
B037208020
B037208020B037208020
B037208020
 
G0366046050
G0366046050G0366046050
G0366046050
 
K03502056061
K03502056061K03502056061
K03502056061
 
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
The Study of Impact Damage on C-Type and E-Type of Fibreglass Subjected To Lo...
 
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
Characterization of the Mechanical Properties of Aluminium Alloys with SiC Di...
 
F03504037046
F03504037046F03504037046
F03504037046
 
Advanced Embedded Automatic Car Parking System
	Advanced Embedded Automatic Car Parking System	Advanced Embedded Automatic Car Parking System
Advanced Embedded Automatic Car Parking System
 
H03502040043
H03502040043H03502040043
H03502040043
 
F0371025033
F0371025033F0371025033
F0371025033
 
Bail Decision Support System
Bail Decision Support SystemBail Decision Support System
Bail Decision Support System
 
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
Soil fertility improvement by Tithonia diversifolia (Hemsl.) A Gray and its e...
 

Similar to C03504013016

2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
IEEEFINALYEARSTUDENTPROJECT
 
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEEFINALYEARSTUDENTPROJECTS
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
IEEEMEMTECHSTUDENTSPROJECTS
 
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
IEEEGLOBALSOFTTECHNOLOGIES
 
A probabilistic approach to string transformation
A probabilistic approach to string transformationA probabilistic approach to string transformation
A probabilistic approach to string transformation
IEEEFINALYEARPROJECTS
 
Semantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical ChainsSemantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical Chains
IRJET Journal
 
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
IEEEMEMTECHSTUDENTPROJECTS
 
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
IEEEMEMTECHSTUDENTSPROJECTS
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code executionAlexander Decker
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code execution
Alexander Decker
 
An effective adaptive approach for joining data in data
An effective adaptive approach for joining data in dataAn effective adaptive approach for joining data in data
An effective adaptive approach for joining data in data
eSAT Publishing House
 
Automated Essay Scoring Using Efficient Transformer-Based Language Models
Automated Essay Scoring Using Efficient Transformer-Based Language ModelsAutomated Essay Scoring Using Efficient Transformer-Based Language Models
Automated Essay Scoring Using Efficient Transformer-Based Language Models
Nat Rice
 
Regular Expression to Deterministic Finite Automata
Regular Expression to Deterministic Finite AutomataRegular Expression to Deterministic Finite Automata
Regular Expression to Deterministic Finite Automata
IRJET Journal
 
Hybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithmHybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithm
eSAT Publishing House
 
Cohesive Software Design
Cohesive Software DesignCohesive Software Design
Cohesive Software Design
ijtsrd
 
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software TechnologiesIRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET Journal
 
Cg4201552556
Cg4201552556Cg4201552556
Cg4201552556
IJERA Editor
 
E43022023
E43022023E43022023
E43022023
IJERA Editor
 
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyConceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Zac Darcy
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[
Zac Darcy
 

Similar to C03504013016 (20)

2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
 
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
IEEE 2014 JAVA DATA MINING PROJECTS A probabilistic approach to string transf...
 
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
2014 IEEE JAVA DATA MINING PROJECT A probabilistic approach to string transfo...
 
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
JAVA 2013 IEEE DATAMINING PROJECT A probabilistic approach to string transfor...
 
A probabilistic approach to string transformation
A probabilistic approach to string transformationA probabilistic approach to string transformation
A probabilistic approach to string transformation
 
Semantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical ChainsSemantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical Chains
 
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
IEEE 2014 DOTNET DATA MINING PROJECTS A probabilistic approach to string tran...
 
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
2014 IEEE DOTNET DATA MINING PROJECT A probabilistic approach to string trans...
 
11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution11.query optimization to improve performance of the code execution
11.query optimization to improve performance of the code execution
 
Query optimization to improve performance of the code execution
Query optimization to improve performance of the code executionQuery optimization to improve performance of the code execution
Query optimization to improve performance of the code execution
 
An effective adaptive approach for joining data in data
An effective adaptive approach for joining data in dataAn effective adaptive approach for joining data in data
An effective adaptive approach for joining data in data
 
Automated Essay Scoring Using Efficient Transformer-Based Language Models
Automated Essay Scoring Using Efficient Transformer-Based Language ModelsAutomated Essay Scoring Using Efficient Transformer-Based Language Models
Automated Essay Scoring Using Efficient Transformer-Based Language Models
 
Regular Expression to Deterministic Finite Automata
Regular Expression to Deterministic Finite AutomataRegular Expression to Deterministic Finite Automata
Regular Expression to Deterministic Finite Automata
 
Hybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithmHybrid approach for generating non overlapped substring using genetic algorithm
Hybrid approach for generating non overlapped substring using genetic algorithm
 
Cohesive Software Design
Cohesive Software DesignCohesive Software Design
Cohesive Software Design
 
IRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software TechnologiesIRJET- A Novel Approch Automatically Categorizing Software Technologies
IRJET- A Novel Approch Automatically Categorizing Software Technologies
 
Cg4201552556
Cg4201552556Cg4201552556
Cg4201552556
 
E43022023
E43022023E43022023
E43022023
 
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyConceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[
 

Recently uploaded

Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 

Recently uploaded (20)

Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 

C03504013016

  • 1. The International Journal Of Engineering And Science (IJES) || Volume || 3 || Issue || 5 || Pages || 13-16 || 2014 || ISSN (e): 2319 – 1813 ISSN (p): 2319 – 1805 www.theijes.com The IJES Page 13 An Advanced Optimistic Approach for String Transformation 1, Amith Kumar V Pujari, 2, Prof. H R Shashidhar 1, M.Tech Scholar RNS Institute of Technology Bangalore, Karnataka, India 2, Asst. Prof. Dept of CSE RNS Institute of Technology Bangalore, Karnataka, India -------------------------------------------------------------ABSTRACT--------------------------------------------------- String transformation can be formalized as the part of coding bio-informatics, information retrieval, and in data mining. However, in part of our paper we focus generating the k most likely output strings corresponding to the input string. This concept proposes a novel and probabilistic approach to string transformation, which is both accurate and efficient. The approach mainly focuses on three methods dynamic programming, modeling and pruning. In the dynamic programming we use the concept of the edit distance which is dynamic programming tool that estimates the score to each transformation so that the probability in the linear model can be estimated well. In the linear model, a modeling method for training the model, and an algorithm for generating the top k candidates, whether there is or is not a predefined dictionary. The linear model is defined as a conditional probability distribution of an output string and a rule set for the transformation conditioned on an input string. The learning method employs maximum likelihood estimation for parameter estimation. The string generation algorithm based on pruning is guaranteed to generate the optimal top k candidates. The proposed method is applied to correction of spelling errors in queries as well as reformulation of queries on dataset. Experimental results on large scale data show that the proposed approach is good and efficient improving upon existing methods in terms of accuracy and efficiency in different settings KEYWORDS—String Transformation, Edit distance algorithm , Linear Model, Aho-Corasick trees, Spelling Error Correction, Query Reformulation. --------------------------------------------------------------------------------------------------------------------------------------- Date of Submission: 19 May 2014 Date of Publication: 30 May 2014 --------------------------------------------------------------------------------------------------------------------------------------- I. INTRODUCTION The string transformation is an essential problem in many applications. In natural language processing, pronunciation generation, spelling error correction, word transliteration, and word stemming can all be formalized as string transformation. String transformation can also be used in query reformulation and query suggestion in search. In data mining, string transformation can be employed in the mining of synonyms and database record matching. As many of the above are online applications, the transformation must be conducted not only accurately but also efficiently. String transformation can be defined in the following way. Given an input string and a set of operators, one can transform the input string to the k most likely output strings by applying a number of operators. Here the strings can be strings of words, characters, or any type of tokens. Each operator is a transformation rule that defines the replacement of a substring with another substring. The likelihood of transformation can represent similarity, relevance, and association between two strings in a specific application. Although certain progress has been made, further investigation of the task is still necessary, particularly from the viewpoint of enhancing both accuracy and efficiency, which is precisely the goal of this work. String transformation can be conducted at two different settings, depending on whether or not a Dictionary is used. When a dictionary is used, the output strings must exist in the given dictionary, while the size of the dictionary can be very large. Without loss of generality, one can specifically studies correction of spelling errors in queries as well as reformulation of queries in web search in this Concept.
  • 2. An Advanced Optimistic Approach for String Transformation www.theijes.com The IJES Page 14 In the first task, a string consists of characters. In the second task, a string is comprised of words. The former needs to exploit a dictionary while the latter does not. Correcting spelling errors in queries usually consists of two steps: candidate generation and candidate selection. Candidate generation is used to find the most likely corrections of a misspelled word from the dictionary. In such a case, a string of characters is input and the operators represent insertion, deletion, and substitution of characters with or without surrounding characters, for example, “a”!“e” and “lly”!“ly”. Obviously candidate generation is an example of string transformation. Note that candidate generation is concerned with a single word; after candidate generation, the words in the context (i.e., in the query) can be further leveraged to make the final candidate selection. Query reformulation in search is aimed at dealing with the term mismatch problem. For example, if the query is “NY Times” and the document only contains “New York Times”, then the query and document do not match well and the document will not be ranked high. Query reformulation attempts to transform “NY Times” to “New York Times” and thus make a better matching between the query and document. In the task, given a query (a string of words), one needs to generate all similar queries from the original query (strings of words).The operators are transformations between words in queries such as “tx”!“texas” and meaning of ” ! “ definition of. II. RELATED WORK Arasu et al. proposed a method which can learn a set of transformation rules that explain most of the given examples. Increasing the coverage of the rule set was the primary focus. Tejada et al. proposed an active learning method that can estimate the weights of transformation rules with limited user input. The types of the transformation rules are predefined such as stemming, prefix, suffix and acronym. Okazaki et al. incorporated rules into an L1-regularized logistic regression model and utilized the model for string transformation. Later, Brill and Moore developed a generative model including contextual substitution rules. Toutanova and Moore further improved the model by adding pronunciation factors into the model. Duan and Hsu also proposed a generative approach to spelling correction using a noisy channel model. They also considered efficiently generating candidates by using a tree. However these works on string transformation can be categorized into two groups. Some work mainly considered efficient generation of strings. Other work tried to learn the model with different approaches. However, efficiency is not an important factor taken into consideration in these methods. The existing work is not focusing on enhancement of both accuracy and efficiency of string transformation. III. PROPOSED WORK Input String Training (conditional Probability) Expanding the Rules Generating the Rules Edit Distance Algorithm Calculate the Weights by Log Linear Model Implement the Aho-corasick tree Obtain the relevant string Figure 1 : Proposed Architecture This paper addresses string transformation, which is an essential problem, in many applications. In natural language processing, pronunciation generation, spelling error correction, word transliteration, and word stemming can all be formalized as string transformation. String transformation can also be used in query reformulation and query suggestion in search. String transformation can be defined in the following way.
  • 3. An Advanced Optimistic Approach for String Transformation www.theijes.com The IJES Page 15 Given an input string and a set of operators, we are able to transform the input string to the k most likely output strings by applying a number of operators. Here the strings can be strings of words, characters, or any type of tokens. Each operator is a transformation rule that defines the replacement of a substring with another substring. Initially we align the string by using the number of iterations obtained by using edit distance algorithm followed by the generation of the rules and expanding its rules without the loss of generality. Then a conditional probability of identifying the maximum likelihood of the strings generated for the respective strings are obtained and their weights are calculated by the linear method. Later in the generation phase, aho-corasick tree is generated arranging the rules and weights. The linear model is defined as a conditional probability distribution of an output string and a rule set for the transformation given an input string. The learning method is based on maximum likelihood estimation. Thus, the model is trained toward the objective of generating strings with the largest likelihood given input strings. The generation algorithm efficiently performs the top k candidate’s generation using top k pruning. It is guaranteed to find the best k candidates without enumerating all the possibilities. An Aho-Corasick tree is employed to index transformation rules in the model. When a dictionary is used in the transformation, a tree is used to efficiently retrieve the strings in the dictionary. The likelihood of transformation can represent similarity, relevance, and association between two strings in a specific application. Although certain progress has been made, further investigation of the task is still necessary, particularly from the viewpoint of enhancing both accuracy and efficiency, which is precisely the goal of this work. IV. RESULTS AND DISCUSSIONS Figure 2: Accuracy Comparison of our model with a constant dataset The above graph shows that there is significant improvement in the accuracy when the concept was implemented on a constant dataset and hence proves that the concept is very much applicable for a real-time usage. The date-set size that I have used consists of two thousand words. The above results are based on the same data-set.
  • 4. An Advanced Optimistic Approach for String Transformation www.theijes.com The IJES Page 16 V. CONCLUSION This paper addresses string transformation, which is an essential problem, in many applications. In natural language processing, spelling error correction, word transliteration, and word stemming can all be formalized as string transformation. Our concept in general poses an additional concept of the abbreviation expansion thus covering all the functional and non-functional requirements. This concept has been implemented using a friendly interface on a pre-defined dataset and thus results obtained are as expected. VI. ACKNOWLEDGEMENT I take this Opportunity to express my profound gratitude and deep regards to my guide Prof. H R Shashidhar , Assistant Professor , RNS Institute of technology, Bangalore, for his exempla nary guidance, and constant encouragement throughout. REFERENCES [1]. Ziqi Wang, Gu Xu, Hang Li, and Ming Zhang, “A Probabilistic Approach to String Transformation “ , ieee transactions on knowledge and data engineering vol:pp no:99 year 2013 [2]. M. Li, Y. Zhang, M. Zhu, and M. Zhou, “Exploring distributional similarity based models for query spelling correction,” in Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, ser. ACL ’06. Morristown, NJ, USA: Association for Computational Linguistics, 2006, pp. 1025– 1032. [3]. R. Golding and D. Roth, “A window-based approach to context-sensitive spelling correction,” Mach. Learn., vol. 34, pp. 107–130, February 1999. [4]. J. Guo, G. Xu, H. Li, and X. Cheng, “A unified and discriminative model for query refinement,” in Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, ser. SIGIR ’08. New York, NY, USA: ACM, 2008, pp. 379–386. [5]. Behm, S. Ji, C. Li, and J. Lu, “Space-constrained gram-based indexing for efficient approximate string search,” in Proceedings of the 2009 IEEE International Conference on Data Engineering, ser. ICDE ’09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 604–615. [6]. E. Brill and R. C. Moore, “An improved error model for noisy channel spelling correction,” in Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, ser. ACL ’00. Morristown, NJ, USA: Association for Computational Linguistics, 2000, pp. 286–293. [7]. N. Okazaki, Y. Tsuruoka, S. Ananiadou, and J. Tsujii, “A discriminative candidate generator for string transformations,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, ser. EMNLP ’08. Morristown, NJ, USA: Association for Computational Linguistics, 2008, pp. 447–456. [8]. M. Dreyer, J. R. Smith, and J. Eisner, “Latent-variable modeling of string transductions with finite-state methods,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing, ser. EMNLP ’08. Stroudsburg, PA, USA: Association for Computational Linguistics, 2008, pp. 1080–1089. [9]. Arasu, S. Chaudhuri, and R. Kaushik, “Learning string transformations from examples,” Proc. VLDB Endow., vol. 2, pp. 514– 525, August 2009.S. Tejada, C. A. Knoblock, and S. Minton, “Learning domainindependent string transformation weights for high accuracy object identification,” in Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’02. New York, NY, USA: ACM, 2002, pp. 350–359. [10]. A. V. Aho and M. J. Corasick, “Efficient string matching: an aid to bibliographic search,” Commun. ACM, vol. 18, pp. 333–340, June 1975. [11]. J. Xu and G. Xu, “Learning similarity function for rare queries,” in Proceedings of the fourth ACM international conference on Web search and data mining, ser. WSDM ’11. New York, NY, USA: ACM, 2011, pp. 615–624