In the field of proteomics, as more data are added, computational methods need to become more
efficient. The parts of molecular sequences that are functionally more important to the molecule are more
resistant to change. To ensure the reliability of sequence alignment, comparative approaches are used. A
multiple sequence alignment is a proposition about evolutionary history: for each column in the
alignment, an explicit homologous correspondence between the individual sequence positions is established.
Different pairwise sequence alignment methods are elaborated in the present work, but these methods are
suitable only for aligning a limited number of sequences of small length. A new method is therefore
introduced for aligning sequences based on local alignment with consensus sequences. Triticum (wheat)
varieties are loaded from the NCBI databank. Phylogenetic trees are constructed for partitions of the
dataset, and a single new tree is constructed from the previously generated trees using an advanced pruning
technique. The closely related sequences are then extracted by applying threshold conditions, and an
optimal sequence alignment is obtained by using shift operations in both directions.
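A core primitive in any consensus-based alignment method of this kind is deriving a consensus string by majority vote over alignment columns. The following Python sketch is illustrative only (it is not the authors' code, and it assumes equal-length, pre-aligned input):

```python
from collections import Counter

def consensus(aligned_seqs):
    """Majority-vote consensus over the columns of equal-length aligned sequences."""
    return "".join(
        Counter(col).most_common(1)[0][0]   # most frequent symbol in the column
        for col in zip(*aligned_seqs)       # iterate column-wise
    )

print(consensus(["ACGT-A", "ACGTTA", "ACCTTA"]))  # -> "ACGTTA" (ties broken arbitrarily)
```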
Data reduction techniques for high dimensional biological data (eSAT Journals)
Abstract
High dimensional biological datasets have been growing rapidly in recent years. Extracting knowledge from and analyzing
high dimensional biological data is one of the key challenges, in which variety and veracity are the two distinct characteristics. The
questions that arise now are how to perform dimensionality reduction for this heterogeneous data, how to develop a high
performance platform to efficiently analyze high dimensional biological data, and how to find useful information in this data. To
discuss this issue in depth, this paper begins with a brief introduction to the data analytics available for biological data, followed by
a discussion of big data analytics and then a survey of various data reduction methods for biological data. We propose a dense
clustering algorithm for standard high dimensional biological data.
Keywords: Big Data Analytics, Dimensionality Reduction
HMM’S INTERPOLATION OF PROTEINS FOR PROFILE ANALYSIS (ijcseit)
HMMs have found application in almost every field, and applying them to biological sequences has its own
advantages. HMMs, being more systematic and specific, yield better results than consensus techniques.
Profile HMMs use position-specific scoring for the matching and substitution of a residue and for the
opening or extension of a gap. HMMs apply a statistical method to estimate the true frequency of a residue
at a given position in the alignment from its observed frequency, while standard profiles use the observed
frequency itself to assign the score for that residue. This means that a profile HMM derived from only 10 to
20 aligned sequences can be of equivalent quality to a standard profile created from 40 to 50 aligned
sequences.
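The contrast drawn above, estimated versus observed residue frequencies, can be made concrete with pseudocount smoothing. This is a minimal sketch using simple Laplace pseudocounts; real profile HMM software typically uses more elaborate priors such as Dirichlet mixtures:

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def observed_freq(column):
    """Raw observed frequency, as a standard profile would use."""
    counts = Counter(column)
    return {a: counts[a] / len(column) for a in AMINO_ACIDS}

def estimated_freq(column, pseudocount=1.0):
    """Pseudocount-smoothed estimate of the 'true' frequency, in the spirit of profile HMMs."""
    counts = Counter(column)
    total = len(column) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}

column = list("LLLIVL")             # one alignment column from only six sequences
print(observed_freq(column)["M"])   # 0.0 -> an unseen residue scores nothing in a raw profile
print(estimated_freq(column)["M"])  # >0  -> smoothing keeps rare residues plausible
```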
Influence over the Dimensionality Reduction and Clustering for Air Quality Me... (IJAEMSJORNAL)
The current trend in industry is to analyze large data sets and apply data mining and machine learning techniques to identify patterns. But the challenge with huge data sets is the high dimensionality associated with them. In some data analytics applications, large amounts of data produce worse performance, and since most data mining algorithms are implemented column-wise, too many columns restrict performance and make the algorithms slower. Therefore, dimensionality reduction is an important step in data analysis. Dimensionality reduction is a technique that converts high dimensional data into a much lower dimension, such that maximum variance is explained within the first few dimensions. This paper focuses on multivariate statistical and artificial neural network techniques for data reduction; each method has a different rationale for preserving the relationships between input parameters during analysis. Principal Component Analysis, a multivariate technique, and the Self Organising Map, a neural network technique, are presented in this paper. A hierarchical clustering approach has also been applied to the reduced data set. A case study of air quality measurement is considered to evaluate the performance of the proposed techniques.
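As a rough illustration of the pipeline this abstract describes, dimensionality reduction followed by hierarchical clustering, here is a scikit-learn/SciPy sketch on synthetic data. The SOM stage and the real air-quality data are omitted, and the 90% variance threshold and five-cluster cut are arbitrary choices for the example:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for an air-quality matrix: rows = measurements, columns = sensors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))

X_std = StandardScaler().fit_transform(X)        # PCA expects centered/scaled inputs
pca = PCA(n_components=0.90)                     # keep components explaining 90% of variance
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())

Z = linkage(X_reduced, method="ward")            # hierarchical clustering on reduced data
labels = fcluster(Z, t=5, criterion="maxclust")  # cut the dendrogram into 5 clusters
```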
A Magnified Application of Deficient Data Using Bolzano Classifier (journal ijrtem)
Abstract: Deficient and inconsistent knowledge is a pervasive and lasting problem in large information sets. Basic imputation is often used to fill in missing data, whereas multiple imputation generates plausible values as replacements. Consequently, a variety of machine learning (ML) techniques have been developed to reprocess incomplete information. This work evaluates multiple imputation (MI) of missing information in large datasets; the performance of MI is examined with several unsupervised ML statistics such as the mean, median, and standard deviation, and with supervised probabilistic ML techniques such as the NBI classifier. The research is carried out over a comprehensive range of databases: missing cases are first filled in by various sets of plausible values to create multiple completed datasets, then standard complete-data operations are applied to each completed dataset, and finally the multiple sets of results are combined to generate a single inference. The main aim of this report is to offer general guidelines for selecting suitable data imputation algorithms based on the characteristics of the data. The Bolzano-Weierstrass theorem is implemented in the ML techniques to assess that every sequence of rational and irrational numbers has a monotonic subsequence and to specify that every such sequence always has a finite boundary. To estimate the imputation of missing data, a standard machine learning repository dataset has been applied. Experimental results show the proposed approach to have good accuracy, with accuracy measured in percent. Keywords: Bolzano Classifier, Maximum Likelihood, NBI classifier, posterior probability, predictor probability, prior probability, Supervised ML, Unsupervised ML.
ROBUST TEXT DETECTION AND EXTRACTION IN NATURAL SCENE IMAGES USING CONDITIONA... (ijiert bestjournal)
In natural scene images, text detection is an important task used in many content-based image analysis applications. A Maximally Stable Extremal Region (MSER) based method is used for scene text detection. This MSER-based method includes the stages of character candidate extraction, text candidate construction, text candidate elimination, and text candidate classification. The main limitation of this method is detecting highly blurred text in low resolution natural scene images, and the current technology does not focus on any text extraction method. In the proposed system, a Conditional Random Field (CRF) model is used to assign each candidate component to one of two classes (text and non-text) by considering both unary component properties and binary contextual component relationships. For this purpose we use connected component analysis. The proposed system also performs text extraction using OCR.
This presentation summarizes the main content of Farrelly, C. M. (2017), Extensions of Morse-Smale Regression with Application to Actuarial Science, arXiv preprint arXiv:1708.05712.
The paper was accepted in December 2017 by the Casualty Actuarial Society.
Survey on Supervised Method for Face Image Retrieval Based on Euclidean Dist... (Editor IJCATR)
Content-based image retrieval is a technique which uses visual contents to search images from large scale image databases
according to users' interests. Given a query face image, content-based face image retrieval tries to find similar face images in a large
image database. Initially, the face is detected in the query image. After removal of the noise present in the image, it is
separated into patches. For each patch, a Local Binary Pattern (LBP) is extracted, which improves detection performance. LBP is a
type of feature used for classification in computer vision. The LBP operator assigns a label to every pixel of a gray level image; the
label mapped to a pixel is determined by the relationship between that pixel and its eight neighbors. A Support Vector Machine (SVM) is
then used to produce a model (based on the training data) that predicts the target values of the test data given only the test data
attributes. When the feature values are provided to the SVM classifier, it trains on the features and finally classifies the
result. SVM maps input vectors to a higher dimensional vector space where an optimal hyperplane is constructed. Among the
available hyperplanes, there is one hyperplane that maximizes the distance between itself and the nearest data vectors of each
category. The Euclidean distance between the query image and each database image is calculated and the distances are sorted
into an index. The indexing scheme used for this purpose provides an efficient way to search the images, and the corresponding image is
retrieved from the database based on the index. This SVM classifier mainly improves the detection performance and the rate of
accuracy.
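The LBP labeling rule described above, comparing a pixel with its eight neighbours and packing the comparisons into a bit pattern, can be sketched directly. This is a simplified illustration rather than the survey's implementation, and it ignores border pixels:

```python
import numpy as np

def lbp_label(img, r, c):
    """Basic 8-neighbour LBP code for pixel (r, c) of a 2-D gray image."""
    center = img[r, c]
    # Neighbours in a fixed clockwise order, starting top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr, c + dc] >= center:   # neighbour >= centre contributes a 1-bit
            code |= 1 << bit
    return code                             # an integer label in 0..255

img = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]], dtype=np.uint8)
print(lbp_label(img, 1, 1))  # -> 120
```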
A comparative study of clustering and biclustering of microarray data (ijcsit)
There are subsets of genes that have similar behavior under subsets of conditions, so we say that they
coexpress, while behaving independently under other subsets of conditions. Discovering such coexpressions can
help uncover genomic knowledge such as gene networks or gene interactions. That is why it is of
utmost importance to perform a simultaneous clustering of genes and conditions to identify clusters of genes
that are coexpressed under clusters of conditions. This type of clustering is called biclustering.
Biclustering is an NP-hard problem; consequently, heuristic algorithms are typically used to approximate
it by finding suboptimal solutions. In this paper, we present a new survey on clustering and
biclustering of gene expression data, also called microarray data.
A fast clustering based feature subset selection algorithm for high-dimension... (IEEEFINALYEARPROJECTS)
Extending superlearner framework to survival analysis. Includes boosted regression, random forest, decision trees, Bayesian model average, and Morse-Smale regression.
Comparative analysis of dynamic programming algorithms to find similarity in ... (eSAT Journals)
Abstract: There exist many computational methods for finding similarity in gene sequences, and finding a suitable method that gives optimal similarity is a difficult task. The objective of this project is to find an appropriate method to compute similarity in gene/protein sequences, both within families and across families. Many dynamic programming algorithms, such as Levenshtein edit distance, Longest Common Subsequence, and Smith-Waterman, have used the dynamic programming approach to find similarities between two sequences, but none of the methods mentioned above have used real benchmark data sets; they have only applied dynamic programming algorithms to synthetic data. We propose a new method to compute similarity. The performance of the proposed algorithm is evaluated using a number of data sets from various families, and the similarity value is calculated both within a family and across families. A comparative analysis and the time complexity of the proposed method reveal that the Smith-Waterman approach is appropriate when the gene/protein sequences belong to the same family, while Longest Common Subsequence is best suited when the sequences belong to two different families. Keywords - Bioinformatics, Gene, Gene Sequencing, Edit distance, String Similarity.
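For reference, the Levenshtein edit distance mentioned above is a textbook dynamic program; a minimal Python version follows (illustrative, not the paper's evaluated code):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                     # deletions to reach the empty string
    for j in range(n + 1):
        d[0][j] = j                     # insertions from the empty string
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match/substitution
    return d[m][n]

print(levenshtein("ACGTGA", "ACTGA"))  # -> 1 (one deletion)
```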
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto... (IJRTEMJOURNAL)
BLAST is the most popular sequence alignment tool used to align bioinformatics patterns. It uses a local
alignment process in which, instead of comparing the whole query sequence with a database sequence, it breaks
the query sequence into small words, and these words are used to align patterns. It uses a heuristic method which
makes it faster than the earlier Smith-Waterman algorithm. But because small query words are used for alignment,
it may perform poorly on very large databases with complex queries. To remove this drawback, we suggest using
MSA tools that can filter the database by removing unnecessary sequences from the data. This sorted data set is then
passed to BLAST, which can then identify the relationships among the sequences, i.e., homologs, orthologs, and
paralogs. The proposed system can further be used to find the relation between two individuals or to create a
family tree. Orthology is interesting for a wide range of bioinformatics analyses, including functional annotation,
phylogenetic inference, and genome evolution. This system describes and motivates the algorithm for predicting
orthologous relationships among complete genomes. The algorithm takes a pairwise approach, requiring neither
tree reconstruction nor reconciliation.
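The word-based seeding step that the abstract attributes to BLAST can be sketched as follows. This is a simplification: real BLAST also scores approximate "neighbourhood" words against a substitution matrix and then extends each seed into a local alignment, neither of which is shown here:

```python
def seed_words(query, db_seq, w=3):
    """BLAST-style seeding: index the query's w-mers, then scan the database
    sequence for exact word hits that could anchor a local alignment."""
    words = {}
    for i in range(len(query) - w + 1):
        words.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(db_seq) - w + 1):
        for i in words.get(db_seq[j:j + w], []):
            hits.append((i, j))        # (query offset, db offset) seed pair
    return hits

print(seed_words("ACGTAC", "TTACGTT"))  # -> [(3, 1), (0, 2), (1, 3)]
```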
This paper presents a literature survey of research developments made to date. Its significance is to provide a deep-rooted understanding of, and knowledge transfer regarding, existing approaches for gene sequencing and alignment using the Smith-Waterman algorithm, together with their respective strengths and weaknesses. To develop or perform quality research, it is always advisable to conduct a goal-oriented literature survey that facilitates an in-depth understanding of the research area, so that an objective can be formulated from the gaps between present requirements and existing approaches. Gene sequencing is one of the predominant problems driving researchers toward optimized system models that provide efficient processing without introducing memory and time overheads. This research is oriented towards developing such a system using the dynamic programming approach of the Smith-Waterman algorithm in an enhanced form, supported by other optimization techniques. This paper provides a brief introduction to the research domain, the research gap and motivations, the objectives formulated, and the systems proposed to accomplish those objectives.
Genome structure prediction: a review over soft computing techniques (eSAT Journals)
Abstract: Techniques such as spectrometry and crystallography exist for the determination of DNA, RNA, and protein structures. These processes provide very accurate results for structure estimation, but they are very slow and can be applied to only a few special cases. Soft computing techniques promise near-appropriate results in much less time, have very broad applicability, and are much easier to apply. Different soft computing approaches, including nature-inspired computing, have been used for the estimation of genome structures with considerable accuracy. This paper provides a review of the different soft computing techniques that have been applied, along with the application method, for the determination of genome structure. Keywords - DNA, RNA, proteins, structure, soft computing, techniques.
GPCODON ALIGNMENT: A GLOBAL PAIRWISE CODON BASED SEQUENCE ALIGNMENT APPROACH (ijdms)
The alignment of two DNA sequences is a basic step in the analysis of biological data, and aligning long
DNA sequences is one of the most interesting problems in bioinformatics. Several techniques have been
developed to solve this sequence alignment problem, such as dynamic programming and heuristic algorithms.
In this paper, we introduce GPCodon alignment, a pairwise DNA-DNA method for global sequence
alignment that improves the accuracy of pairwise sequence alignment. We use a new scoring matrix, the
empirical codon substitution matrix, to produce the final alignment. Using this matrix in our
technique enabled the discovery of new relationships between sequences that could not be discovered using
traditional matrices. In addition, we present experimental results that show the performance of the
proposed technique over eleven datasets with an average length of 2967 bp. We compared the efficiency and
accuracy of our technique against a comparable tool called “Pairwise Align Codons” [1].
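To make the codon-level idea concrete, here is a sketch of scoring two pre-aligned, in-frame sequences codon by codon. The matrix values are toy stand-ins, not the empirical codon substitution matrix the paper uses, and the full GPCodon method performs dynamic-programming alignment rather than scoring a fixed alignment:

```python
def codons(dna):
    """Split an in-frame DNA string into consecutive codons (triplets)."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]

def codon_score(s1, s2, matrix, gap=-8):
    """Score two pre-aligned, in-frame sequences codon-by-codon using a
    codon substitution matrix; any codon containing '-' counts as a gap."""
    total = 0
    for c1, c2 in zip(codons(s1), codons(s2)):
        if "-" in c1 or "-" in c2:
            total += gap
        else:
            total += matrix.get((c1, c2), matrix.get((c2, c1), 0))
    return total

toy_matrix = {("ATG", "ATG"): 10, ("ATG", "ATA"): 4}  # stand-in for empirical values
print(codon_score("ATGATG", "ATGATA", toy_matrix))     # -> 14
```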
The analysis of proteins and messenger RNA is commonly used to compare gene expression patterns in tissues or cells of different types and under distinct conditions. In gene expression analysis, normalization is a critical step as it guarantees the validity of downstream analyses, and data preprocessing is an indispensable step in the extraction and normalization of microarray gene expression data. A number of normalization methods are employed in high throughput sequencing studies. The preprocessing activity begins with a careful analysis of the gene expression data and usually involves combining many raw signal intensities into one expression value. The Robust Multiarray Average (RMA) is a normalization approach for microarrays that involves background correction, normalization, and summarization of probe-level information without using MM probes (Lim et al., 2007). It is an algorithm commonly used to create an expression matrix for Affymetrix data and is one of the most common preprocessing modes for normalizing gene expression data. Raw intensity values are first background corrected and log2 transformed before being normalized; to generate an expression measure for the probe sets on each array, a linear model is then fitted to the normalized data.
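One ingredient of the RMA pipeline described above, quantile normalization, is compact enough to sketch. The snippet below uses a crude floor-and-subtract background correction purely for illustration; actual RMA fits a normal-exponential convolution model for background and uses median polish for probe-set summarization:

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization: force every array (column) of a
    probe-by-array matrix to share the same intensity distribution."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # per-column ranks
    mean_sorted = np.sort(X, axis=0).mean(axis=1)      # reference distribution
    return mean_sorted[ranks]

# Toy probe-by-array matrix: background-correct (floor at a small positive
# value), log2-transform, then quantile-normalize, mirroring the RMA order.
raw = np.array([[120., 310.], [80., 95.], [400., 260.]])
logged = np.log2(np.maximum(raw - 50.0, 1.0))          # crude background correction
print(quantile_normalize(logged))
```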
Clustering heterogeneous categorical data using enhanced mini batch K-means ... (IJECEIAES)
Clustering methods in data mining aim to group a set of patterns based on their similarity. In a data survey, heterogeneous information is collected on various types of data scales, such as nominal, ordinal, binary, and Likert scales. A lack of treatment for heterogeneous data and information leads to loss of information and poor decision-making. Although many similarity measures have been established, solutions for heterogeneous data in clustering are still lacking. The recent entropy distance measure seems to provide good results for heterogeneous categorical data, but it requires many experiments and evaluations. This article presents a proposed framework for heterogeneous categorical data using mini batch k-means with an entropy measure (MBKEM), to investigate the effectiveness of the similarity measure in clustering heterogeneous categorical data. Secondary data from a public survey were used. The findings demonstrate that the proposed framework improves clustering quality: MBKEM outperformed other clustering algorithms with an accuracy of 0.88, a v-measure (VM) of 0.82, an adjusted Rand index (ARI) of 0.87, and a Fowlkes-Mallows index (FMI) of 0.94. The average minimum elapsed time for cluster generation was observed at 0.26 s. In the future, the proposed solution would be beneficial for improving the quality of clustering for heterogeneous categorical data problems in many domains.
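A rough sense of the clustering setup can be given with scikit-learn's MiniBatchKMeans and the same evaluation metrics the article reports. Note the substitution: scikit-learn's k-means does not accept a custom entropy distance, so this sketch one-hot encodes the categorical answers and uses plain Euclidean distance, unlike the MBKEM method evaluated in the article:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.metrics import adjusted_rand_score, v_measure_score
from sklearn.preprocessing import OneHotEncoder

# Toy categorical survey answers, one-hot encoded so k-means can average them.
answers = np.array([["yes", "high"], ["yes", "high"], ["no", "low"], ["no", "low"]])
X = OneHotEncoder().fit_transform(answers).toarray()

km = MiniBatchKMeans(n_clusters=2, batch_size=2, random_state=0).fit(X)
truth = [0, 0, 1, 1]
print(adjusted_rand_score(truth, km.labels_), v_measure_score(truth, km.labels_))
```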
A SURVEY ON OPTIMIZATION APPROACHES TO TEXT DOCUMENT CLUSTERING (ijcsa)
Text document clustering is one of the fastest growing research areas because of the availability of a huge amount of information in electronic form. Several techniques have been introduced for clustering documents in such a way that documents within a cluster have high intra-similarity and low inter-similarity to other clusters. Many document clustering algorithms provide a localized search for effectively navigating, summarizing, and organizing information; a globally optimal solution can be obtained by applying high-speed, high-quality optimization algorithms that search the entire solution space. In this paper, a brief survey of optimization approaches to text document clustering is presented.
ANALYSIS OF EXISTING TRAILERS’ CONTAINER LOCK SYSTEMS (IJCSEIT Journal)
Trailers carry large containers to various destinations around the world. These containers are manually locked onto
trailers as they travel these long distances. Security mainly refers to the safety of a state,
organization, property, and individuals against criminal activity. This study was made to analyze
existing trailer locks and the insecurity currently being experienced, and it also focused on creating the
background for building an automated lock system for automobiles. Findings showed that there are various
container types, such as General Purpose containers, Hard-Top containers, and Open-Top containers, among
others. The twist locks were the ones reviewed for this study. The study also discussed the
weaknesses of twist locks, most notably the lack of notification for unsecured locks, which leads to
accidents and the loss of lives and property. The study finally proposed an automated lock system to
overcome these weaknesses to a good extent.
A MODEL FOR REMOTE ACCESS AND PROTECTION OF SMARTPHONES USING SHORT MESSAGE S... (IJCSEIT Journal)
Smartphone usage is increasing rapidly, and with this phenomenal growth, smartphone
theft is also increasing. This paper proposes a model to secure smartphones from theft as
well as to provide options to access a smartphone through another smartphone, or a normal mobile phone, via the Short
Message Service. The model provides an option to track and secure the mobile by locking it. It also provides
facilities to forward incoming call and SMS information to the remotely connected device and enables the
remote user to control the mobile through SMS. The proposed model is validated by a prototype
implementation on the Android platform; various tests were conducted on the implementation and the results are
discussed.
BIOMETRIC APPLICATION OF INTELLIGENT AGENTS IN FAKE DOCUMENT DETECTION OF JOB... (IJCSEIT Journal)
The job selection process in today’s globally competitive economy can be a daunting task for prospective
employees, no matter their experience level. Although many years of research have been devoted to job
search and application, resulting in good integration with information technology, including the internet and
intelligent agent-based architectures, there are still many areas that need enhancement. Two such areas
are the quality of the jobs matched to applicants, by profiling the needs of employers
against the needs of prospective employees, and the security and verification schemes integrated to reduce
instances of fraud and identity theft. The integration of mobile, intelligent agent, and cryptography
technologies provides benefits such as improved wireless accessibility, intelligent dynamic profiling, and
increased security. With this in mind, we propose intelligent mobile agents, instead of human agents, to
perform the job search using fuzzy preferences (published elsewhere) and application
operations, incorporating agents with a trust authority to establish employer trust and validate
applicant identity and accuracy. Our proposed system incorporates design methodologies using JADE-LEAP
and Android to provide a robust, secure, user-friendly solution.
FACE RECOGNITION USING DIFFERENT LOCAL FEATURES WITH DIFFERENT DISTANCE TECHN... (IJCSEIT Journal)
A face recognition system using different local features with different distance measures is proposed in this
paper. The proposed method is fast and gives accurate detection. The feature vector is based on the eigenvalues,
eigenvectors, and diagonal vectors of sub-images: images are partitioned into sub-images to detect local
features, the sub-partitions are rearranged into vertical and horizontal matrices, and the eigenvalues,
eigenvectors, and diagonal vectors are computed for these matrices. A global feature vector is then generated
for face recognition. Experiments are performed on the benchmark YALE face database. The results indicate that the
proposed method gives better recognition performance, in terms of average recognition rate and retrieval
time, than the existing methods.
BIOMETRICS AUTHENTICATION TECHNIQUE FOR INTRUSION DETECTION SYSTEMS USING FIN... (IJCSEIT Journal)
Identifying attackers is a major concern for both organizations and governments, and the applications most
used for the prevention or detection of attacks are intrusion detection systems. Biometrics
technology is simply the measurement and use of the unique characteristics of living humans to distinguish
them from one another; it is more useful than passwords and tokens, which can be lost or
stolen, which is why we have chosen biometric authentication. Biometric authentication provides the
ability to require additional instances of authentication in such a quick and easy manner that users are not
bothered by the extra requirements. In this paper, we give a brief introduction to
biometrics, then provide information regarding intrusion detection systems, and finally propose a
method based on fingerprint recognition that allows us to detect abuse of the running computer system
more efficiently.
PERFORMANCE ANALYSIS OF FINGERPRINTING EXTRACTION ALGORITHM IN VIDEO COPY DET... (IJCSEIT Journal)
A video fingerprint is a recognizer that is derived from a piece of video content. The video fingerprinting
methods obtain unique features of a video that differentiates one video clip from another. It aims to identify
whether a query video segment is a copy of video from the video database or not based on the signature of
the video. It is difficult to find whether a video is a copied video or a similar video, since the features of the
content are very similar from one video to the other. The main focus of this paper is to detect that the query
video is present in the video database with robustness depending on the content of video and also by fast
search of fingerprints. The Fingerprint Extraction Algorithm and Fast Search Algorithms are adopted in
this paper to achieve robust, fast, efficient and accurate video copy detection. As a first step, the
Fingerprint Extraction algorithm is employed which extracts a fingerprint through the features from the
image content of video. The images are represented as Temporally Informative Representative Images
(TIRI). Then, the second step is to find the presence of copy of a query video in a video database, in which
a close match of its fingerprint in the corresponding fingerprint database is searched for using an
inverted-file-based method. The proposed system is tested against various attacks like noise, brightness, contrast,
rotation and frame drop. Thus the performance of the proposed system on an average shows high true
positive rate of 98% and low false positive rate of 1.3% for different attacks.
Effect of Interleaved FEC Code on Wavelet Based MC-CDMA System with Alamouti ... (IJCSEIT Journal)
In this paper, the impact of Forward Error Correction (FEC) code namely Trellis code with interleaver on
the performance of wavelet based MC-CDMA wireless communication system with the implementation of
Alamouti antenna diversity scheme has been investigated in terms of Bit Error Rate (BER) as a function of
Signal-to-Noise Ratio (SNR) per bit. Simulation of the system under proposed study has been done in M-ary
modulation schemes (MPSK, MQAM and DPSK) over AWGN and Rayleigh fading channel incorporating
Walsh Hadamard code as orthogonal spreading code to discriminate the message signal for individual
user. It is observed via computer simulation that the proposed interleaved coded
system outperforms the uncoded system in all modulation schemes over the Rayleigh fading
channel.
FUZZY WEIGHTED ASSOCIATIVE CLASSIFIER: A PREDICTIVE TECHNIQUE FOR HEALTH CARE... (IJCSEIT Journal)
In this paper we extend the problem of classification using Fuzzy Association Rule Mining and propose the
concept of a Fuzzy Weighted Associative Classifier (FWAC). Classification based on association rules is
considered effective and advantageous in many cases; associative classifiers are especially suited to
applications where the model may assist domain experts in their decisions. Weighted Associative
Classifiers that take advantage of weighted Association Rule Mining have already been proposed. However,
there is a so-called "sharp boundary" problem in association rule mining with quantitative attribute
domains. This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates
classification rules using a Fuzzy Weighted Support and Confidence framework, where fuzzy logic is used
to partition the domains; this approach can generate strong rules instead of weak, irrelevant ones.
The problem of invalidation of the downward closure property is solved, and the concept of a
Fuzzy Weighted Support and Fuzzy Weighted Confidence framework for Boolean and quantitative items
with weighted settings is generalized. We propose a theoretical model to introduce a new associative classifier
that takes advantage of Fuzzy Weighted Association Rule Mining.
GENDER RECOGNITION SYSTEM USING SPEECH SIGNAL (IJCSEIT Journal)
In this paper, a system developed for speech encoding, analysis, synthesis, and gender identification is
presented. A typical gender recognition system can be divided into a front-end and a back-end system.
The task of the front-end system is to extract the gender-related information from a speech signal and
represent it by a set of vectors called features. Features like power spectral density and the frequency at
maximum power carry speaker information; these features are extracted using the Fast Fourier Transform (FFT)
algorithm. The task of the back-end system (also called the classifier) is to create a gender model to recognize
the gender from a speech signal in the recognition phase. This paper also presents the digital processing
of speech signals (the pronounced sounds “A” and “B”) taken from 10 persons, 5 male and
5 female. The power spectrum estimate of each signal is examined, and the frequency at
maximum power of the English phonemes is extracted from the estimated power spectrum. The system uses a
threshold technique as the identification tool. The recognition accuracy of this system is 80% on average.
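The front-end feature the abstract relies on, the frequency at maximum power of the spectrum, is straightforward to compute with an FFT. In this sketch the 165 Hz decision threshold is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def peak_frequency(signal, sample_rate):
    """Frequency at maximum power, via the FFT power spectrum."""
    spectrum = np.fft.rfft(signal)
    power = np.abs(spectrum) ** 2                  # power spectrum (up to scale)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(power[1:]) + 1]         # skip the DC bin

sr = 8000
t = np.arange(sr) / sr
voice = np.sin(2 * np.pi * 140 * t) + 0.3 * np.sin(2 * np.pi * 420 * t)
f0 = peak_frequency(voice, sr)
print("male" if f0 < 165 else "female", f0)        # crude threshold classifier
```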
DETECTION OF CONCEALED WEAPONS IN X-RAY IMAGES USING FUZZY K-NN (IJCSEIT Journal)
Scanning baggage by x-ray and analysing the resulting images has become an important technique for detecting
illicit materials in baggage at airports. To provide adequate security, a reliable and fast
screening technique is needed for baggage examination. This paper aims at providing an automatic method
for detecting concealed weapons, typically a gun, in baggage by employing an image segmentation method
to extract the objects of interest from the image, followed by feature extraction methods, namely the
shape context descriptor and Zernike moments. Finally, the objects are classified as illicit or non-illicit
using fuzzy k-NN.
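For orientation, a minimal fuzzy k-NN in the style of Keller et al. is sketched below: the class memberships of the k nearest neighbours are combined with inverse-distance weights. The toy 2-D features stand in for the shape context and Zernike moment descriptors the paper actually extracts:

```python
import numpy as np

def fuzzy_knn(X_train, memberships, x, k=3, m=2):
    """Fuzzy k-NN: neighbours' class memberships are weighted by
    inverse distance raised to the power 2/(m-1)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(d)[:k]
    w = 1.0 / np.maximum(d[nn], 1e-12) ** (2.0 / (m - 1))
    u = (memberships[nn] * w[:, None]).sum(axis=0) / w.sum()
    return u                                  # membership in each class, sums to 1

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
M = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # crisp labels: illicit / non-illicit
print(fuzzy_knn(X, M, np.array([0.2, 0.1])))  # high membership in class 0
```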
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL (IJCSEIT Journal)
The proposed approach avoids the semantic gap in image retrieval by combining automatic relevance
feedback and a modified stochastic algorithm. A visual feature database is constructed from the image
database, using combined feature vector. Very few fast-computable features are included in this step. The
user selects the query image, and based on that, the system ranks the whole dataset. The nearest images are
retrieved and the first automatic relevance feedback is generated. The combined similarity of textual and
visual feature space using Latent Semantic Indexing is evaluated and the images are labelled as relevant or
irrelevant. The feedback drives a feature re-weighting process and is routed to the particle swarm
optimizer. Instead of the classical swarm update approach, the swarm is split so that each sub-swarm performs the
search in parallel, thereby increasing the performance of the system. It provides a powerful optimization
tool and an effective space exploration mechanism. The proposed approach aims to achieve the following
goals without any human interaction - to cluster relevant images using meta-heuristics and to dynamically
modify the feature space by feeding automatic relevance feedback.
ERROR PERFORMANCE ANALYSIS USING COOPERATIVE CONTENTION-BASED ROUTING IN WIRE... (IJCSEIT Journal)
In wireless ad hoc networks, cooperation of nodes can be achieved through more interaction at the higher protocol
layers; in particular, the MAC (Medium Access Control) and network layers play a vital role. The MAC facilitates
a routing protocol based on the position of nodes at the network layer, known as Beacon-Less
Geographic Routing (BLGR), using a contention-based selection process. This paper proposes a two-level
cross-layer framework: a MAC-network cross-layer design for forwarder selection (routing) and a
MAC-PHY design for relay selection. Wireless networks suffer when a huge number of simultaneous communications
increases collisions and energy consumption; hence we focus on a new contention access method
that dynamically changes the channel access probability, which can reduce the number of contention
rounds and collisions. Simulation results demonstrate the best relay selection, compare direct
mode with the cooperative network, and evaluate the performance of the contention
probability with collision avoidance.
M-FISH KARYOTYPING - A NEW APPROACH BASED ON WATERSHED TRANSFORM (IJCSEIT Journal)
Karyotyping is a process in which the chromosomes in a dividing cell are properly stained, identified, and
displayed in a standard format, which helps geneticists study and diagnose the genetic factors behind various
genetic diseases and study cancer. M-FISH (Multiplex Fluorescent In-Situ Hybridization) provides
color karyotyping. In this paper, an automated method for M-FISH chromosome segmentation based on the
watershed transform, followed by naive Bayes classification of each region using the mean and
standard deviation features, is presented. A post-processing step is added to re-classify small chromosome
segments into the neighboring larger segment, reducing the chances of misclassification. The approach
provided improved accuracy compared to the pixel-by-pixel approach; tested on
40 images from the dataset, it achieved an accuracy of 84.21%.
Steganography is the technique of hiding a confidential message in an ordinary message and extracting
that secret message at its destination. Different carrier file formats can be used in steganography;
among them, digital images are the most popular, and they are used in this work. Here, steganography is
performed on the skin portion of an image: first the skin region of the image is detected, then
random pixels are selected from that detected region using a pseudo-random number generator, and
the bits of the secret message are embedded in the LSBs of these random pixels. An analysis is done to
check the efficiency and robustness of the proposed method. The aim of this work is to show that
steganography using random pixel selection is less prone to outside attacks.
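The embedding scheme described here, a shared PRNG seed selecting the pixels and message bits written to their LSBs, can be sketched as follows; the skin-detection step is omitted, so this snippet draws pixels from the whole image rather than from a detected skin region:

```python
import numpy as np

def embed_lsb(pixels, message_bits, seed=42):
    """Embed bits into the LSBs of pseudo-randomly selected pixels.
    Both sides share the seed, so the extractor revisits the same pixels."""
    flat = pixels.flatten()
    rng = np.random.default_rng(seed)
    idx = rng.choice(flat.size, size=len(message_bits), replace=False)
    flat[idx] = (flat[idx] & 0xFE) | np.asarray(message_bits, dtype=flat.dtype)
    return flat.reshape(pixels.shape)

def extract_lsb(pixels, n_bits, seed=42):
    rng = np.random.default_rng(seed)
    idx = rng.choice(pixels.size, size=n_bits, replace=False)
    return pixels.flatten()[idx] & 1

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
stego = embed_lsb(img.copy(), [1, 0, 1, 1])
print(extract_lsb(stego, 4))  # -> [1 0 1 1]
```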
A NOVEL WINDOW FUNCTION YIELDING SUPPRESSED MAINLOBE WIDTH AND MINIMUM SIDELO... (IJCSEIT Journal)
Many applications, such as FIR filters, the FFT, signal processing, and measurement, require side lobe
amplitudes of approximately -45 dB or less. The problem with the usual window-based FIR filter design lies in
side lobe amplitudes that are higher than the application requires. We propose a window function
with better performance, namely a narrower main lobe width and lower peak side lobe level, compared to
several commonly used windows. The proposed window has a slightly larger main lobe width than the
commonly used Hamming window while featuring a 6.2 to 22.62 dB smaller side lobe peak: it maintains a
maximum side lobe peak of about -58.4 to -52.6 dB, compared to -35.8 to -38.8 dB for the Hamming window,
for M = 10 to 14, while offering roughly equal main lobe width. Our simulation results also
show significant performance improvements for the proposed window compared to the Kaiser, Gaussian, and
Lanczos windows, and it also performs better than the Dolph-Chebyshev window.
Finally, an example low pass FIR filter design confirms the efficiency of the proposed window.
CSHURI – Modified HURI algorithm for Customer Segmentation and Transaction Pr... (IJCSEIT Journal)
Association rule mining (ARM) is the process of generating rules based on the correlation between the sets of items that customers purchase. Of late, data mining researchers have improved the quality of association rule mining for business development by incorporating factors such as value (utility), quantity of items sold (weight), and profit. Rules mined without considering utility values (profit margin) lead to a probable loss of profitable rules.
The wealth of information on customers' needs, together with the mined rules, aids the retailer in designing the store layout [9]. We propose CSHURI, Customer Segmentation using HURI, a modified version of HURI [6], which finds customers who purchase highly profitable rare items and classifies them according to given criteria; for example, a retail business may need to identify valuable customers who are major contributors to the company's overall profit. For a potential customer arriving in the store, determining which customer group they belong to, which functional features or products they focus on, and what kind of offers will satisfy them is the key to targeting customers to improve sales [9], which forms the basis for customer utility mining.
AN EFFICIENT IMPLEMENTATION OF TRACKING USING KALMAN FILTER FOR UNDERWATER RO... (IJCSEIT Journal)
The exploration of oceans and sea beds is being made increasingly possible through the development of Autonomous Underwater Vehicles (AUVs). This activity concerns the marine community and must confront notable challenges. An automatic detection and tracking system is the first and foremost element of an AUV or an aqueous surveillance network. In this paper, a Kalman filter method is presented to solve the problem of tracking objects in sonar images. The object region is extracted by threshold segmentation and morphological processing, and invariant-moment and area features are analysed. Results show that the method has good robustness, high accuracy, and real-time performance; it is efficient for underwater target tracking based on sonar images and is also suited to obstacle avoidance for an AUV operating in a constrained underwater environment.
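For context, the predict-update cycle at the core of such a tracker can be sketched in a few lines; the scalar model and noise values below are illustrative assumptions, not the paper's parameters.

```python
# Minimal one-dimensional Kalman filter sketch (illustrative only).
# q: process noise, r: measurement noise, x/p: state estimate and variance.
def kalman_1d(measurements, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    x, p, estimates = x0, p0, []
    for z in measurements:
        p += q                   # predict: variance grows by process noise
        k = p / (p + r)          # Kalman gain
        x += k * (z - x)         # update with the measurement residual
        p *= (1.0 - k)           # posterior variance
        estimates.append(x)
    return estimates

print(kalman_1d([1.2, 0.9, 1.1, 1.0]))
```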
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE (IJCSEIT Journal)
Breast cancer is one of the leading cancers among women in developed countries, including India, and is the second most common cause of cancer death in women. The incidence of breast cancer in women has increased significantly in recent years. In this paper we discuss various data mining approaches that have been utilized for breast cancer diagnosis and prognosis. Breast cancer diagnosis distinguishes benign from malignant breast lumps, while breast cancer prognosis predicts when breast cancer is likely to recur in patients who have had their cancers excised. This study summarizes various review and technical articles on breast cancer diagnosis and prognosis, and we also focus on current research using data mining techniques to enhance breast cancer diagnosis and prognosis.
FACTORS AFFECTING ACCEPTANCE OF WEB-BASED TRAINING SYSTEM: USING EXTENDED UTA... (IJCSEIT Journal)
Advancements in information systems lead organizations to apply e-learning systems to train their employees in order to enhance performance. In this respect, applying web-based training enables an organization to train its employees quickly, efficiently, and effectively anywhere, at any time. This research aims to extend the Unified Theory of Acceptance and Use of Technology (UTAUT) with factors such as the flexibility of the web-based training system, system interactivity, and system enjoyment, in order to explain employees' intention to use a web-based training system. A total of 290 employees participated in this study. The findings revealed that performance expectancy, facilitating conditions, social influence, and system flexibility have a direct effect on employees' intention to use the web-based training system, while effort expectancy, system enjoyment, and system interactivity have an indirect effect.
PROBABILISTIC INTERPRETATION OF COMPLEX FUZZY SET (IJCSEIT Journal)
The innovative concept of the Complex Fuzzy Set is introduced. The objective of this paper is to investigate the Complex Fuzzy Set in contrast to the traditional Fuzzy Set, whose membership function ranges over [0, 1]; in the Complex Fuzzy Set, the membership function is extended to the unit circle in the complex plane, taking the form of a complex number. A comprehensive study of the mathematical operations on Complex Fuzzy Sets is presented, and the basic operations of addition, subtraction, multiplication, and division are described. The novel idea of this paper is to measure the similarity between two fuzzy relations by evaluating δ-equality. We also introduce a probabilistic interpretation of the Complex Fuzzy Set, in which we attempt to clarify the distinction between fuzzy logic and probability.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment; compatible with the IDM8000 CCR. Backplane-mounted serial and TCP/Ethernet communication module for CCR remote access; IDM8000 CCR remote control over serial and TCP protocols.
• Remote control: parallel or serial interface.
• Compatible with the MAFI CCR system.
• Compatible with the IDM8000 CCR.
• Compatible with backplane-mount serial communication.
• Compatible with commercial and defence aviation CCR systems.
• Remote control system for accessing the CCR and allied systems over serial or TCP.
• Indigenized local support/presence in India.
• Easy configuration using DIP switches.
Hybrid optimization of pumped hydro system and solar - Engr. Abdul-Azeez.pdf (fxintegritypublishin)
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Immunizing Image Classifiers Against Localized Adversary Attacks (gerogepatton)
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks (CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations. When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves the immunity of models against localized universal attacks, by up to 40%. We evaluate our proposed approach using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10 and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing accuracy improvements over previous techniques. The results indicate that the combination of volumetric input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating adversarial training.
Final project report on grocery store management system.pdf (Kamal Acharya)
In today’s fast-changing business environment, it is extremely important to be able to respond to client needs in the most effective and timely manner, especially when your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website which retails various grocery products. The project allows viewing the various products available, enables registered users to purchase desired products instantly using the Paytm and UPI payment processors (Instant Pay), and also allows placing orders using the Cash on Delivery (Pay Later) option. It provides easy access for administrators and managers to view orders placed using the Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of technologies must be studied and understood. These include multi-tiered architecture, server- and client-side scripting techniques, implementation technologies, programming languages (such as PHP, HTML, CSS, JavaScript), and MySQL relational databases. The objective of this project is to develop a basic shopping-cart website for the consumer and to learn about the technologies used to build such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Welcome to WIPAC Monthly, the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with the industry news celebrating the 13 years since the group was created, we have articles including:
A case study of the use of Advanced Process Control at the wastewater treatment works at Lleida in Spain
A look back at an article on smart wastewater networks, to see how the industry has measured up in the interim on the adoption of digital transformation in the water industry.
Water Industry Process Automation and Control Monthly - May 2024.pdf
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT)
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol. 3, No. 1, February 2013
DOI: 10.5121/ijcseit.2013.3101
IMPLEMENTING HIERARCHICAL CLUSTERING
METHOD FOR MULTIPLE SEQUENCE
ALIGNMENT AND PHYLOGENETIC TREE
CONSTRUCTION
Harmandeep Singh, Er. Rajbir Singh (Associate Prof.), Navjot Kaur
Lala Lajpat Rai Institute of Engineering and Tech., Moga, Punjab, INDIA
har_pannu@yahoo.co.in
ABSTRACT
In the field of proteomics, as more data is added, computational methods need to become more efficient. The parts of molecular sequences that are functionally more important to the molecule are more resistant to change. Comparative approaches are used to ensure the reliability of sequence alignment. The problem of multiple sequence alignment is a proposition of evolutionary history: for each column in the alignment, the explicit homologous correspondence of each individual sequence position is established. Different pair-wise sequence alignment methods are elaborated in the present work, but these methods are only suitable for aligning a limited number of sequences of small length. A new method is introduced for aligning sequences based on local alignment with consensus sequences. Triticum wheat varieties are loaded from the NCBI databank, and phylogenetic trees are constructed for divided parts of the dataset. A single new tree is constructed from the previously generated trees using an advanced pruning technique. The closely related sequences are then extracted by applying threshold conditions, and an optimal sequence alignment is obtained using shift operations in both directions.
General Terms
Bioinformatics, Sequence Alignment
KEYWORDS
Local Alignment, Multiple Sequence Alignment, Phylogenetic Tree, NCBI Data Bank
1. INTRODUCTION
Bioinformatics is the application of computer technology to the management of biological information: the analysis of biological information using computers and statistical techniques, and the science of developing and utilizing computer databases and algorithms to accelerate and enhance biological research. Bioinformatics is more of a tool than a discipline, providing the tools for the analysis of biological data.
In bioinformatics, a multiple sequence alignment is a way of arranging the primary sequences of
DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional,
structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide
or amino acid residues are typically represented as rows within a matrix. Gaps are inserted
between the residues so that residues with identical or similar characters are aligned in successive
columns. If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as indels, i.e., insertion or deletion mutations introduced in one or both lineages. In a sequence alignment, the degree of similarity between the residues occupying a particular position can be taken as a rough measure of how conserved a particular region is among lineages. The substitution of residues whose side chains have similar biochemical properties suggests that the region has structural or functional importance.
2. DATA MINING
Data mining is the process of extracting patterns from data and is becoming an increasingly important tool for transforming raw data into information. As part of the larger process
known as knowledge discovery, data mining is the process of extracting information from large
volumes of data. This is achieved through the identification and analysis of relationships and
trends within commercial databases. Data mining is used in areas as diverse as space exploration
and medical research. Data mining tools predict future trends and behaviors, allowing businesses
to make proactive, knowledge-driven decisions. The knowledge discovery process includes:
• Data cleaning
• Data integration
• Data selection
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
2.1 Data Mining Techniques
To design the data mining model the choice has to be made from various data mining techniques
[6][8], which are as follows:
a. Cluster Analysis
• Hierarchical Analysis
• Non-Hierarchical Analysis
b. Outlier Analysis
c. Induction
d. Online Analytical Processing (OLAP)
e. Neural Networks
f. Genetic Algorithms
g. Deviation Detection
h. Support Vector Machines
i. Data Visualization
j. Sequence Mining
In the present work we have adopted hierarchical cluster analysis as the data mining approach, as it is the most suitable for working with a common group of protein sequences.
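As a small illustration of this choice (our own sketch, not the paper's code), hierarchical clustering can be run on a precomputed pairwise distance matrix with SciPy and the resulting tree cut at a chosen threshold:

```python
# Hierarchical clustering over a toy precomputed distance matrix,
# as one might do for pairwise sequence distances (illustrative).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

dist = np.array([[0.0, 0.3, 0.8],
                 [0.3, 0.0, 0.7],
                 [0.8, 0.7, 0.0]])                    # toy pairwise distances
tree = linkage(squareform(dist), method="average")    # UPGMA-style merging
labels = fcluster(tree, t=0.5, criterion="distance")  # cut the tree at 0.5
print(labels)                                         # -> e.g. [1 1 2]
```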
2.2 Data Mining Methods
Prediction and description are two important goals of data mining. Prediction uses the current
database to predict unknown or probable values of other variables of interest, and description
extracts the human-interpretable patterns describing the data.
These goals can be achieved using the following data-mining methods:
a. Classification
b. Regression
c. Clustering
d. Summarization
e. Dependency Modeling
f. Change and Deviation Detection
In the present work, we have picked the classification method for the sequence alignment problem.
3. ALIGNMENT METHODS
Sequences that are short or similar can be aligned manually. However, lengthy, highly variable, or extremely numerous sequences cannot be aligned solely by human effort, and computational approaches are used for their alignment. Computational approaches are divided into two categories:
• Global
• Local
3.1 Global and Local Alignments
Global alignments, which attempt to align every residue in every sequence, are useful when the sequences in the query set are similar and of roughly equal size. (This does not mean global alignments cannot end in gaps.) The Needleman-Wunsch algorithm is a global alignment technique based on dynamic programming. Local alignments are more useful for dissimilar sequences that are suspected to contain regions of similarity within their larger sequence context; a general local alignment technique is the Smith-Waterman algorithm, which is also based on dynamic programming. With sufficiently similar sequences, there is no difference between local and global alignments.
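To make the dynamic-programming idea concrete, here is a minimal, illustrative Python sketch of Needleman-Wunsch global alignment scoring; the match/mismatch/gap parameters are arbitrary assumptions for the example, not values from this paper.

```python
# Minimal Needleman-Wunsch global alignment score (sketch).
# Scoring parameters (match=+1, mismatch=-1, gap=-2) are illustrative.
def nw_score(a: str, b: str, match=1, mismatch=-1, gap=-2) -> int:
    m, n = len(a), len(b)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = i * gap                       # leading gaps in b
    for j in range(1, n + 1):
        F[0][j] = j * gap                       # leading gaps in a
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # substitution
                          F[i - 1][j] + gap,    # gap in b
                          F[i][j - 1] + gap)    # gap in a
    return F[m][n]

print(nw_score("ACTGCC", "ACTGGC"))  # -> 4 (five matches, one mismatch)
```

The Smith-Waterman local variant differs mainly in clamping each cell at zero and taking the maximum over the whole matrix rather than the final cell.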
3.2 Pair-wise Alignment
Pair-wise sequence alignment methods are used to find the best-matching local or global alignments of two query sequences. Only two sequences can be aligned at a time, but pair-wise methods are efficient to calculate and are often used where exhaustive precision is not required, such as searching a database for sequences similar to a query. There are basically three methods for producing pair-wise alignments:
• dot-matrix methods
• dynamic programming
• word methods
However, multiple sequence alignment techniques can also be used to align pairs of sequences. Although each individual method has its own pros and cons, all pair-wise methods have difficulty with highly repetitive sequences of low information content, especially where the number of repetitions differs between the two sequences to be aligned. One way of quantifying the utility of a given pair-wise alignment is the 'maximum unique match' (MUM), the longest subsequence that occurs in both query sequences; longer MUM sequences typically reflect closer relatedness.
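As a rough illustration of the MUM idea, the following Python sketch finds the longest substring common to two sequences (uniqueness of the match is not enforced here, so it is only a simplified stand-in).

```python
# Sketch: longest substring occurring in both sequences, a simplified
# proxy for the 'maximum unique match' (uniqueness is not checked).
def longest_common_substring(a: str, b: str) -> str:
    best = ""
    L = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1   # extend the current run
                if L[i][j] > len(best):
                    best = a[i - L[i][j]:i]
    return best

print(longest_common_substring("ACTGCCTTT", "GGCTGCCTA"))  # -> "CTGCCT"
```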
3.3 Dot Matrix Method
The dot-matrix approach is qualitative and simple, producing a family of alignments for individual sequence regions, but it is time-consuming to analyze on a large scale. From a dot-matrix plot, certain sequence features (such as insertions, deletions, repeats, or inverted repeats) can easily be identified. The dot-matrix plot is created by designating one sequence to be the subject, placed on the horizontal axis, and the second sequence to be the query, placed on the vertical axis of the matrix. Some implementations vary the size or intensity of the dot depending on the degree of similarity between the two characters, to accommodate conservative substitutions. The dot plot of two very closely related sequences will appear as a single diagonal line across the matrix. Dot plots can also be used to assess repetitiveness in a single sequence: a sequence can be plotted against itself, and regions that share significant similarities will appear as diagonal lines off the main diagonal. This effect can occur when a DNA or RNA sequence consists of multiple similar structural domains. A dot-matrix plot is thus a method of aligning two sequences that provides a picture of the homology between them; the main diagonal represents the sequence's alignment with itself, while lines off the main diagonal represent similar or repetitive patterns within the sequence.
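A dot-matrix plot is simple to produce; the toy Python sketch below (our illustration, not code from the paper) prints a '*' wherever the subject and query characters match, so conserved runs show up as diagonals.

```python
# Toy dot-matrix comparison: subject along the horizontal axis,
# query along the vertical axis; '*' marks matching characters.
def dot_matrix(subject: str, query: str) -> None:
    for q in query:
        print("".join("*" if q == s else "." for s in subject))

dot_matrix("ACCTGAG", "ACTGAG")  # diagonals indicate shared runs
```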
3.4 Progressive Method
Progressive methods, also known as hierarchical or tree methods, generate a multiple sequence alignment by first aligning the most similar sequences and then successively adding less related sequences or groups to the alignment until the entire query set has been incorporated into the solution. Sequence relatedness is described by an initial tree based on pair-wise alignments, which may include heuristic pair-wise alignment methods. The results of progressive alignment depend on the choice of the most related sequences and can thus be sensitive to inaccuracies in the initial pair-wise alignments. In most progressive multiple sequence alignment methods, the sequences in the query set are weighted according to their relatedness, which reduces the likelihood of making a poor choice of initial sequences and thus improves alignment accuracy.
4. INTRODUCTION TO PHYLOGENETIC ANALYSIS
Phylogenetics is the study of evolutionary relationships, and phylogenetic analysis is useful for inferring or estimating these relationships. A phylogenetic analysis is usually depicted as a branching, tree-like diagram representing an estimated pedigree of the inherited relationships among gene trees, organisms, or both. Phylogenetics is sometimes called cladistics, because the word clade, meaning a set of descendants from a single ancestor, is derived from the Greek word for branch. The cladistic method is used for hypothesizing about evolutionary relationships.
In cladistics, members of a group or clade share a common evolutionary history and are more related to each other than to members of another group; a particular group is recognized by sharing unique features that were not present in distant ancestors. These shared, derived characteristics can be anything that can be observed and described in two organisms. Usually, cladistic analysis is performed by comparing multiple characteristics at once, either multiple phenotypic characters or multiple base pairs or nucleotides in a sequence.
There are three basic assumptions in cladistics:
1) Organisms of any group are related by descent from a common ancestor.
2) There is a bifurcating pattern of cladogenesis. This assumption is controversial.
3) A phylogenetic tree represents the resulting relationship from cladistic analysis.
5. METHODOLOGY
The methodology for this work uses cluster analysis techniques to compute the alignment scores between the multiple sequences. Based on the alignment, a phylogenetic tree is constructed signifying the relationship between the different entered sequences. The data is taken from the NCBI databank.
5.1 SNAP (Synonymous Non-synonymous Analysis Program)
SNAP calculates synonymous and non-synonymous substitution rates from a set of codon-aligned nucleotide sequences. The sequences should be provided in table format:
Seq1 ACTGCCTTTGGC...
Seq2 ACTGCCTATGGG...
The first field is the sequence name and the second field is the sequence, followed by a new line for the next sequence. A single insertion can be treated in a way that keeps the codons ACT and GGC intact:
ACTTGCC ==> ACTT--GCC
ACTGGC ==> ACT---GCC
The letter N should be used to represent ambiguous bases, and a dash '-' for insertions; only A, C, G, T, N, and '-' are allowed.
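A minimal reader for this two-field table format might look as follows; this is an illustrative Python sketch under the character rules just stated, not SNAP's own code.

```python
# Sketch: read and validate the 'name sequence' table format above.
ALLOWED = set("ACGTN-")   # only A, C, G, T, N and '-' are allowed

def read_snap_table(lines):
    seqs = {}
    for line in lines:
        name, seq = line.split(None, 1)   # first field: name; rest: sequence
        seq = seq.strip().upper()
        assert set(seq) <= ALLOWED, f"bad character in {name}"
        seqs[name] = seq
    return seqs

print(read_snap_table(["Seq1 ACTGCCTTTGGC", "Seq2 ACTGCCTATGGG"]))
```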
5.2 Jukes-Cantor Method
The Jukes and Cantor model computes the probability of substitution from one state to another, and from it a formula for the distance between two sequences can be derived. The model was formulated for nucleotides, but these can easily be replaced by codons or amino acids. In this model the probability of changing from one state to any different state is always equal, and the different sites are independent. The evolutionary distance between two species is given by

$d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}\frac{N_d}{N}\right)$

where $N_d$ is the number of mutations (differing nucleotides) between the two sequences and $N$ is the nucleotide length.
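The distance is easy to compute directly; the following Python sketch implements the formula above for two equal-length aligned sequences (illustrative code, not SNAP's or the paper's implementation).

```python
# Jukes-Cantor distance: d = -(3/4) ln(1 - (4/3) p), where p = Nd / N.
import math

def jukes_cantor(s1: str, s2: str) -> float:
    assert len(s1) == len(s2), "sequences must be aligned to equal length"
    p = sum(a != b for a, b in zip(s1, s2)) / len(s1)  # fraction of mismatches
    # the correction diverges as p approaches 3/4 (random similarity)
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

print(jukes_cantor("ACTGCCTTTGGC", "ACTGCCTATGGG"))  # ~0.19
```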
5.3 Algorithm
The algorithm finds the optimal alignment of the most closely related species from the set of species. The work includes the construction of phylogenetic trees for the triticum wheat varieties, whose sequences are loaded from the NCBI database. The phylogenetic distances are calculated using the Jukes-Cantor method, the trees are constructed using the nearest-neighbor method, and the final tree is obtained by tree pruning. The closely related species are selected based on the threshold condition. Since the sequences are very lengthy, their alignment is tedious; to obtain the multiple sequence alignment, the consensus sequence (a fixed sequence for eukaryotes) is aligned with each available sequence, which helps in locating positions near the optimal alignment. The sequences are aligned based on local alignment, and shift-right and shift-left operations are performed five times to obtain the optimal multiple sequence alignment. The algorithm steps are written below, followed by a short code sketch of the final stage:
• Load m wheat sequences from the NCBI database.
• Calculate the distances using the Jukes-Cantor method.
• Create matrix 1 for the different species based on the JC distances.
• Find the smallest distance by examining the original matrix. Half that number is the branch length to each of those two species from the first node.
• Create a new, reduced matrix with entries for the new node and the unpicked species. The distance from each unpicked species to the node will be the average of its distances to each of the two species in node 1.
• Find the smallest distance by examining the reduced matrix; if it is between two unpicked species, treat them in the same way. Continue these distance-calculation and matrix-reduction steps until all species have been picked.
• Construct tree 1.
• Load n more wheat sequences from the NCBI database.
• Calculate the distances using the Jukes-Cantor method.
• Create matrix 2 for the different species based on the JC distances.
• Repeat steps iv to vi.
• Construct tree 2.
• Apply tree pruning to join tree 1 and tree 2.
• Consider the p species above the threshold value of 0.7.
• Consider the consensus sequence (TATA box) for the wheat varieties.
• Load the p sequences (S1, S2, …, Sp).
• Align the TATA consensus sequence with S1 to Sp using local alignment.
• Calculate the alignment with five shift-right and five shift-left operations.
• Consider the alignment with the minimum score.
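The final selection-and-shift stage can be condensed into the following illustrative Python sketch; the toy mismatch score and the helper names are our assumptions, standing in for the paper's local-alignment scoring.

```python
# Condensed sketch of the last steps above: keep species whose score
# passes the threshold, then try five right and five left shifts
# against the consensus and keep the minimum-score alignment.
THRESHOLD = 0.7
MAX_SHIFT = 5

def select_species(scores: dict) -> list:
    """Keep species above the threshold value of 0.7."""
    return [name for name, s in scores.items() if s > THRESHOLD]

def mismatch_score(a: str, b: str) -> int:
    """Toy scoring: count mismatching positions (stand-in for local alignment)."""
    return sum(x != y for x, y in zip(a, b))

def shift(seq: str, k: int) -> str:
    """Shift right (k > 0) or left (k < 0), padding with gaps, keeping length."""
    return "-" * k + seq[:len(seq) - k] if k >= 0 else seq[-k:] + "-" * (-k)

def best_shift_alignment(seq: str, consensus: str) -> str:
    """Try shifts in [-5, +5] and keep the alignment with the minimum score."""
    return min((shift(seq, k) for k in range(-MAX_SHIFT, MAX_SHIFT + 1)),
               key=lambda s: mismatch_score(s, consensus))

print(best_shift_alignment("TATAAAGGC", "--TATAAAG"))  # -> "--TATAAAG"
```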
6. RESULTS AND DISCUSSIONS
The triticum wheat varieties shown in Table 1 are used as input for the present research work. The data is loaded from the National Center for Biotechnology Information (ncbi.nlm.nih.gov). First, five wheat varieties are chosen; the evolutionary distance is calculated with the help of the Jukes-Cantor method, and the phylogenetic tree is created using the nearest-neighbor technique. The tree obtained is shown in Fig. 2. Then seven different varieties are chosen; a tree is constructed by the same method and is combined with the first tree using tree-pruning techniques. The final tree for the twelve wheat varieties is shown in Fig. 3. The twelve varieties (Table 1) and the Jukes-Cantor values reported for them are listed below.
Table 1. TRITICUM WHEAT VARIETIES
• Compactum Cultivar (GU259621)
• Compactum Isolate (GU048562)
• Compactum Glutenin (DQ233207)
• Macha Floral (AY714342)
• Macha Isolate (GU048564)
• Macha Glutenin (GQ870249)
• Sphaerococcum (EU670731)
• Tibeticum Isolate (GU048531)
• Yunnanense Isolate (GU048553)
• Vavilovii Isolate (GU048554)
• Vavilovii Caroxylase (DQ419977)
• Gamma Gladin (AJ389678)
Jukes-Cantor values:
Columns 1 through 15
34.2415 2.2956 2.2840 2.3334 3.5762 4.5120 34.2415 1.3255 2.2956 0.7981
Columns 16 through 30
0.9966 1.7686 1.0740 0.0161 1.0029 0.7850 0.7981 0.7830 0.5973 0.8810 1.3949
Columns 31 through 45
1.0590 0.8596 0.6849 0.8790 2.2364 1.0842 0.0183 0.0522 0.9209 0.9863 1.7589
Columns 46 through 60
0.9966 1.7686 1.0740 0.0161 0.9092 0.9751 1.7275 1.0985 0.0500 0.8219 1.1671
Columns 61 through 66
1.9545 0.9863 3.3140 1.7589
The threshold condition is applied to obtain the most closely related varieties from the set of twelve. These varieties, shown in Fig. 4, are:
• yunnanense_isolate
• vavilovii_isolate
• compactum_isolate
• macha_isolate
• tibeticum_isolate
The multiple sequence alignment for these varieties is then obtained; the MSA is shown in Fig. 5.
7. CONCLUSION AND FUTURE SCOPE
7.1 Conclusion
Several different heuristics have been employed to simplify the complexity of the multiple sequence alignment problem. Pair-wise progressive dynamic programming restricts the solution to the neighborhood of only two sequences at a time: all sequences are compared pair-wise, each is then aligned to its most similar partner or group of partners, and each group of partners is aligned in turn to finish the complete multiple sequence alignment. This is a quite tedious job, suited only to a limited number of sequences of small length, and it does not guarantee an optimal alignment. Complexity is reduced by restricting the search space to only the most conserved local portions of all the sequences involved. The model is constructed for aligning the DNA sequences of different wheat varieties. Of the different sequence formats available, the plain-text format is utilized. Two phylogenetic trees are constructed for the different datasets.
The closely related sequences are extracted based on the threshold condition. The consensus sequence is then used to obtain a local alignment with each sequence. Based on the local alignment, the sequences are arranged at the corresponding positions detected by the consensus sequence, and shift-right and shift-left operations are performed on each sequence to obtain the optimal alignment.
Figure 1. Evolutionary distance for five wheat varieties
Figure 2. Evolutionary distance for 12 wheat varieties
Figure 3. Phylogenetic tree for five closely related varieties
Figure 4. Multiple sequence alignment for five closely related species
Figure 5. Phylogenetic tree for twelve closely related varieties
Figure 6. Multiple sequence alignment for twelve closely related species
Figure 7. Contd. Multiple sequence alignment for twelve closely related species
Figure 8. Contd. Multiple sequence alignment for twelve closely related species
Figure 9. Contd. Multiple sequence alignment for twelve closely related species
7.2 Future Scope
The following improvements can be made to the developed bioinformatics model:
• The model can be extended for protein sequence alignment.
• Various Scoring matrices like PAM and BLOSUM can be incorporated for computing the
evolutionary distances.
8. REFERENCES
[1] B. Bergeron, Bioinformatics Computing, Pearson Education, 2003, pp. 110-160.
[2] A. Clare, Machine Learning and Data Mining for Yeast Functional Genomics, Ph.D. thesis, University of Wales, 2003.
[3] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2004, pp. 19-25.
[4] D. Jiang, C. Tang and A. Zhang, "Cluster Analysis for Gene Expression Data", IEEE Transactions on Knowledge and Data Engineering, vol. 11, 2004, pp. 1370-1386.
[5] L. Kai and C. Li-Juan, "Study of Clustering Algorithm Based on Model Data", International Conference on Machine Learning and Cybernetics, Hong Kong, 2007, vol. 7, no. 2, pp. 3961-3964.
[6] M. Kantardzic, Data Mining: Concepts, Models, Methods, and Algorithms, John Wiley & Sons, 2000, pp. 112-129.
[7] A.D. Baxevanis and B. Ouellette (2001), "Sequence Alignment and Database Searching", in Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, John Wiley & Sons, New York, pp. 187-212.
[8] Y.H. Tzeng et al. (2004), "Comparison of three methods for estimating rates of synonymous and nonsynonymous nucleotide substitutions", Molecular Biology and Evolution, pp. 2290-2298.
AUTHOR
Harmandeep Singh received his B.Tech. degree from Punjab Technical University, Jalandhar, in 2009 and his M.Tech. degree from Punjab Technical University, Jalandhar, in 2012.
Er. Rajbir Singh Cheema is working as Head of Department at Lala Lajpat Rai Institute of Engineering & Technology, Moga, Punjab. His research interests are in the fields of data mining, open reading frames in bioinformatics, Gene Expression Omnibus, cloud computing, routing algorithms, routing protocols, load balancing, and network security. He has published many national and international papers.
Navjot Kaur received her B.Tech. degree from Punjab Technical University, Jalandhar, in 2010 and is pursuing her M.Tech. degree from Punjab Technical University, Jalandhar. She is working as an Assistant Professor at Lala Lajpat Rai Institute of Engineering & Technology, Moga, Punjab.