SlideShare a Scribd company logo
International Journal of Engineering Research and Development
e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com
Volume 9, Issue 10 (January 2014), PP. 01-07
1
Determining the Promoter Region in the DNA using ke-Rem
(ke-Rule Extraction Method)
Günay Karlı1
1
International Burch University, Faculty of Engineering and IT, IT Department, Sarajevo,
Bosnia and Herzegovina.
Abstract:- Biologists endeavor to reveal the secrets of life by looking into gene sequences. Concordantly,
determining the promoter region in the DNA is an important step in the process of detecting genes. However the
gene sequence data grow too huge recently. Conventional methods remain incapable to predict promoter so it
becomes increasingly important to automate the identification of functional elements, such as coding region,
genomic region or promoters. Thus many computer scientists are interested in the biological technology, and
improve some data mining methods which take advantages of computer power to see into gene sequences. In this
study, we employ ke-REM (ke-Rule Extraction Method) classifier to predict promoters of DNA sequences, and
evaluate their performances. The obtained results show that the classifier competes the existing techniques for
identifying promoter regions.
Keywords:- Promoter prediction, inductive learning, data mining, bioinformatics, DNA.
I. INTRODUCTION
Genome coding segments for transfer ribonucleic acids (tRNAs), messenger ribonucleic acids (mRNAs)
and ribosomal ribonucleic acids (rRNAs) are known as genes [1]. Proteins have amino acids whose sequence is
determined by mRNAs. Prokaryotic cells have a simple mechanism since all the genes tend to be converted into the
corresponding mRNA and then finally into proteins [2]. Gene finding or genome analysis generally relates to that
part of computational biology that involves identification of stretches of sequence algorithmically, and it is basically
the genomic DNA which is functional biologically. This particularly incorporates protein-coding genes as well as
various functional elements like regulatory regions and RNA genes. To clearly understand a species‟ genome, one of
the most crucial and significant step is to understand gene finding.
For prokaryotes, it is relatively simple to predict the Computational Gene since all the genes are basically
converted into the corresponding mRNA and finally into proteins. However, for eukaryotic cells, the process
becomes more complicated since there is interruption of coding of the DNA sequence by random sequences
commonly known as introns. There are a number of questions which biologists have currently attempted to answer
and they include [3]:
 What section of a DNA sequence codes for a protein and what section is considered as junk DNA?
 How can a junk DNA be classified as intron, transposes, untranslated region, regulatory elements, dead
genes, etc.?
 Dividing a genome that has been newly sequenced into the coding genes and non-coding regions.
In a DNA, determination of the promoter region is a crucial step in the process of detecting genes and this
implies that the problem of determining a promoter is of major significance in biology [4] [5]. Biologists have tried
to analyze the secrets of life by investigating the gene sequence. Nevertheless, the growth of data on the gene
sequence has been growing rapidly. Therefore, most of computer scientists use biological technology to come up
with methods generated by the power of a computer to see gene sequences [6].
Despite the fact that approaches for determining regions of coding in genome DNA sequences
have been in existence since the nineteenth century, programs designed for combining coding sequences into mRNA
sequences that could be translated were invented in early 20th century [7]. Biologists have since then had various
programs at their disposal including GenViewer [8], GeneID [7], GenLang [9], GeneParser [10], FGENEH [11],
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
2
SORFIND [12], Xpound [13], GRAIL [14], VEIL [15], GenScan [16], etc. Two tools, which include GRAIL and
GenScan, are the most widely used tools both in the industry and in academics [17].
Most of the above approaches are founded on motifs searching in regard to DNA sequence to establish
whether it forms a promoter or not [18]. In this context, the search is done through the help of Markov models and
position weight matrices [19][20]. Artificial intelligence has also been used to supplement statistical intelligence.
More specifically artificial neural networks have produced acceptable values with the only disadvantage being high
positive rates that are false [21][22]. However, this has not curtailed their application to solve other bioinformatics
problems.
Computational method for predicting accurate promoter is still to be wholly developed. Currently,
biologists use ke-REM. The purpose of this paper is to explicate ke-REM can be successful be used for promoter
sequences.
A. What is promoter and importance of promoter prediction?
A promoter by definition is a DNA non-coding region responsible for initiating transcriptions a specific
gene. They are usually located on the upstream and on the same strand of the DNA-towards the anti-sense strand‟s
3‟ region also referred to as the template strand. A promoter may be characterized by a nucleotide sequence of
between 100 to 1000 base pair long [23]. A special enzyme, RNA polymerase, is needed for mRNA transcription.
This enzyme needs to attach itself to the DNA near a gene for it to qualify to be called a promoter sequence.
Sequences comprises of specific response and DNA sequences responsible for providing a fully secure primary
binding site for the enzyme as well as for proteins referred to as transcription factors.
Eukaryotic and Prokaryotic promoters vary from each other. In prokaryotic organism ς70 sigma factor is
able to identify specific promoter sequences, which in this case are 5‟TATAAT3‟ and 5‟TTGACA3‟ (-10 and -35
respectively) through the help of ς70 subunit of the polymerase enzyme[24]. Eukaryotic organism on the other hand
is more complex requiring at least 7 different factors for the polymerase II enzyme to bind to the promoter.
The promoter intensity correlates with identity degree to the sequence but separated by the spacer length.
Dense promoters are however founded closer to gene [25][26]. It has for long been thought that for transcription
activity to be optimal, various promoter elements‟ combinations including -35 and -10 hexamers, must be in
existence coupled with downstream and upstream regions [25]. According to this school of thought, RNAP works
both regions of the two hexamers in sequence and promoters of A+ T-rich sequences upstream of the −35 hexamer
[26][27] in several E. coli or Bacillus subtilis were identified as facilitating increased transcription in vitro when
accessory proteins were absent [28]. Different upstream sequences show different effects on transcription increasing
it from a mere 1.5 to 90 fold [29]. Those promoter sequences that are characterized by powerful binding affinity
have a direct effect on mRNA transcription.
Regardless of whether a transcribed DNA sequence can be identified through biological testing or not,
experiments are known to be time consuming and costly. The promoter prediction approach can however narrow
promoter regions amongst huge DNA sequences. A subsequent experiment can be established and tested thus saving
time and money [6].
II. METARIAL AND METHODS
There are two core classes of the promoter prediction, namely „+‟ and „-‟. These classes will denote the
existence of promoter prediction in the DNA sequence, having the „+‟ denoting for a positive indication of promoter
location in the DNA sequence and the „-‟ denoting the absence of promoter locations in the DNA sequence. This
research paper proposes to deal with a supervised learning technique in the prediction of promoter regions in the
DNA sequence.
A. Data set
The research sought to incorporate the E. Cole promoter gene arrays of DNA in the testing the proficiency
of ANN. Such data were collected from the UCI Repository [30]. This contains a set of 106 promoter and non-
promoter instances. The research paper notes that such data is viable in the comparisons of ANN with the models
existing in the literature; additionally such information involving the use of the data set is publicly available [5].
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
3
The 106 DNA arrays are composed of 57 nucleotides each. 53 of the DNA sequences in the data set had a
„+‟ denoting, indicating the presence of promoter location in the DNA array. The research then sought to align the
(+) parameter instances separately allowing for transcription. The following data characterize the (+) instances as
observed from the experiment. One is that for every occurrence the (+) represents for the promoter positive
presence, a name was also given in each instance and a classification of the DNA array was made composing of A,
T, G and C stand for Adenine, Thymine, Guanine, Cytosine [30].
B. ke-REM (ke-Rule Extraction Method)
This section introduces the novel development referred to as ke-REM (ke-Rule Extraction Method) and
addresses its ability in utilizing DNA promoter region predictions. As provided for above, an e.coli dataset consists
of a total of 106 DNA sequences, each containing a length of 57 nucleotides. The computer science perspective
expresses the dataset for e-coli as consisting of 106 instances containing 57 attributes bearing four values. The
attributes for these instances can be expressed as nucleotides locations for the 57-element sequence. Each attribute
accommodates 4 values, namely T-Thymine, A-Adenine, C-Cytosine and G-Guanine.
ke-REM constructs a rule-base by applying the data set attribute-value pairs. In an effort to generate a
robust rule-base, attribute-value pairs with significant importance are used. The significant question at this point
queries, “How are pairs with significant informational value determined?” the new ke-REM upgrade uses a “gain
function” in computing the informational value for the set‟s pair. ke-REM considers the higher gain value as a
higher informational value indicator. Therefore, the attribute-value with a higher value has a greater priority in the
processing of rule-base for the prediction system. keREM (ke-Rule Extraction Method) was upgraded to have the
ability to obtain IF-THEN rules from a given set of examples. It proactively discards encountered pitfalls commonly
present in inductive learning algorithms. keREM applies the gain function value, to determine which attributes are
of significant importance and are thus given a higher priority and as such, serve to further provide rules that are
more commonly acceptable.
The following is a summary of the algorithm:
Step 1: In a particular training set, a person computes class distribution and probability distribution rate of
every attribute-value.
Step 2: For every attribute in the data set, you compute the power of classification.
Step 3: for every attribute-value pairs, its Class-based Gain is calculated with the use of computed
probability distributions, power of classification and class distribution rate.
Step 4: One rule of selection is that you can select any value whose probability distributions one for n=1.
The next step is to convert the attribute-values into rules and then you mark the classified examples.
Step 5: Move to step 8.
Step 6: Starting from the first example that is unclassified, you form combinations with the n values by
using the attribute-values that has a bigger gain.
Step 7: You apply each combination in all examples. Using the values that are made up of n combinations,
those that match only with on class are converted into a rule. You mark the classified examples.
Step 8: In the training set, when all examples are classified, you move to step 11.
Step 9: perform the expression n=n+1
Step 10: go to step 6 if n<N
Step 11: Select the most general rule if there is over one rule that represents the same examples.
Step 12: End.
III. RESULTS AND DISCUSSION
The aim of this section is to experimentally analyze our approach for promoter sequences recognition with
the use of ke-REM and compare it with the existing approaches. Ke-REM has an important feature which allows it
to compute class-based gain for every attribute-value in a particular training set. First, in this context, the class
distribution and probability distribution rate of every nucleotide that forms DNA sequence is computed on the basis
of promoter and non-promoter classes. For every attribute in the data set, you contribute power of classification.
However, there is no class information in the results for the attribute-value pairs. Therefore, using class distribution
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
4
rate, probability distribution and the power of classification, you compute class-based gain for every value in the
DNA sequence data set. Using this method, rules that the algorithm produces were formed by attribute-value which
has a maximum information value.
There are two phases which fulfill the experiments of promoter prediction and they reflect the principles
behind a supervised learning algorithm which include testing and training. Classification model is built during
training, and during testing, the model is applied for classification of unseen examples that are DNA sequence and
they consist of promoter and non-promoter region. You evaluate the performance of ke-REM with the use of a
standard 5-fold cross-validation. Dataset is partitioned randomly in the 5-fold cross-validation into five subsets.
Every subset contains an equal ratio of both the promoter and non-promoter region. For five times, ke-REM is
trained using 4 subsets for each time for training and remaining the 5thh subset for testing. 5 models are generated in
this way during cross-validation. You obtain the final prediction performance by averaging the results that are
achieved from every model.
We determined prediction performance of the algorithm by measuring the threshold-dependent parameters
sensitivity (SE), specificity (SP), accuracy (ACC) and Matthew‟s Correlation coefficient (MCC). The following
equations were used to calculate ACC, SE, SP and MCC.
SE=TP/(TP+FN) (1)
SP=TN/(TN+FP) (2)
ACC= (TP+TN) / (TP+TN+FP+FN) (3)
MCC= ((TP*TN)-(FN*FP))/SQRT((TP+FN)*(TN+FP)*(TP+FP)*(TN+FN)) (4)
TP is true positive (promoter predicted as promoter)
FN is false negative (promoter predicted as non- promoter)
TN is true negative (non- promoter predicted as non- promoter)
FP is false positive (non- promoter predicted promoter).
The detailed performance of module in term of SE, SP, ACC and MCC is shown in Table 1.
Table 1: Performance of ke-REM in term of SE, SP, ACC and MCC
Threshold-
Dependent
Parameters
1. Model 2. Model 3. Model 4. Model 5. Model Average
ACC 0.75 0.90 0.80 0.90 0.69 0.8085
SE 0.80 0.90 0.80 1.00 0.54 0.8077
SP 0.70 0.90 0.80 0.80 0.85 0.8092
MCC 0.50 0.80 0.60 0.82 0.40 0.6246
In the literature, the way learning algorithms perform can be evaluated by applying cross-validation with
the use of a “leave-one-out” methodology. Leave-one-out cross validation (LOOCV) is considered as a special type
of k-fold cross-validation and k represents the number of instance in the data. This implies that in all iteration, close
to every data apart from the single observation are utilized for training and then you test the model on the single
observation. An estimate that is accurate that is obtained with the use LOOCV is considered as almost unbiased
[31]. This is commonly used when the data that is available is rare, particularly in bioinformatics when there is only
dozens of data sample available.
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
5
Table 2: The errors of some machine learning algorithms on promoter data
set.
System Errors Classifier
REX-1 0/106 Inductive L.A
ke-REM 3/106 Class-based gain
KBANN 4/106 A hybrid ML system
BP 8/106 Standard backpropagation with one layer
O'Neill 12/106 Ad hoc tech. from the bio. lit.
Near-
Neigh
13/106 A nearest neighbours algorithm
ID3 19/106 Quinlan's decision builder
Unlike the classifiers that have been applied here for promoter prediction (Table 2), ke-REM that has been
introduced in this document tends to outperform the present classifier for promoter prediction. When the
classification error is considered, it becomes better compared to ID3, KB, NN, O‟Neil and BP.
IV. CONCLUSIONS
Within the bioinformatics field, the promoter prediction is considered as a crucial problem from a
computational perspective. Newly developed ke-REM (ke-Rule Extraction Method) in this document has been
proposed as effective in tackling the problem. For rule-base with efficient rule to be generated, you employ attribute-
value pairs with higher importance. To calculate information value of the pair in the set, ke-REM uses its own “gain
function” to calculate information value. Because value of the higher gain tends to indicate higher information
value, a greater priority is given to attribute-value with higher value when it comes to production of rule-base of the
predicting system. According the results given above, it is possible to conclude that when ke-REM for promoter
prediction is employed it leads to results that are promising. Moreover, additional improvement increases the
accuracy of the results that have been obtained.
REFERENCES
[1]. D. Mount, Bioinformatics: Sequence and Genome Analysis, New York: Cold Spring Lab Press, 2013.
[2]. K. Davies, The Sequence, US: Weidenfeld & Nicholson, 2001.
[3]. P. Jayaram and K. Bhushan, Bioinformatics For Better Tomorrow, New Delhi: Indian Institute of
Technology, 2000.
[4]. B. Óscar and B. Santiago, “Cnn-promoter, new consensus promoter prediction program,” Revista EIA, no.
15, pp. 153-164, 2011.
[5]. C. Gabriela and M.-I. Bocicor, “Promoter Sequences Prediction Using Relational Association Rule
Mining,” Evolutionary Bioinformatics, vol. 8, pp. 181-196, 2012.
[6]. J.-W. Huang, Promoter Prediction in DNA Sequences, Kaohsiung,: National Sun Yat-sen University, 2003.
[7]. R. Guigo, S. Knudsen, N. Drake and T. Smith, “Prediction of gene structure,” Journal of Molecular
Biology, no. 226, pp. 141-157, 1995.
[8]. L. Milanesi, N. Kolchanov and I. Rogozin, “GenViewer: A computing tool for protein coding regions
prediction in nucleotide sequences,” in the 2nd International Congress on Bioinformatics, Supercomputing
and Complex Genome Analysis,, 573-587, 1993.
[9]. S. Dong and G. Stormo, “Gene structure prediction by linguistic methods,” Genomics, no. 23, pp. 540-551,
1994.
[10]. E. Stormo and E. Snyder, “Identification of coding regions in genomic DNA sequences: An application of
dynamic programming and neural networks,” Nucleic Acids Research, no. 21, pp. 607-613, 1993.
[11]. V. Solovyev, A. Salamov and C. Lawrence, “Prediction of internal exons by oligonucleotide composition
and discriminant analysis of spliceable open reading frames,” Nucleic Acids Research, no. 22, pp. 5156-
5163, 1994.
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
6
[12]. G. Hayden and M. Hutchinson, “The prediction of exons through an analysis of spliceable open reading
frames,” Nucleic Acids Research, no. 20, pp. 3453-3462, 1992.
[13]. A. Skolnick and M. Thomas, “A probabilistic model for detecting coding regions in DNA sequences,” IMA
J. Math. Appl. Med. Biol., no. 11, pp. 149-160, 1992.
[14]. Y. Xu, R. Mural and E. Uberbacher, “Constructing gene models from accurately predicted exons: An
application of dynamic programming,” Comput. Appl. Biosci, no. 10, pp. 613-623, 1994.
[15]. J. Henderson, S. Salzberg and K. Fasman, “Finding genes in DNA with a hidden Markov model,” Journal
of Computational Biology, vol. 2, no. 4, pp. 127-141, 1997.
[16]. C. Karlin and C. Burge, “Prediction of complete gene structures in human genomic DNA,” J. Mol. Biol, no.
268:, pp. 78-94, 1997.
[17]. M. Hrishikesh, S. Nitya and M. Krishna, “An ANN-GA model based promoter prediction in Arabidopsis
thaliana using tilling microarray data,” Bioinformation, vol. 6, no. 6, p. 240–243, 2011.
[18]. J. Gordon, M. Towsey and J. Hogan, “Improved prediction of bacterial transcription start sites,”
Bioinformatics, vol. 22, no. 2, pp. 142-148, 2006.
[19]. R. Liu and D. States, “Consensus promoter identification in the human genome utilizing expressed gene
markers and gene modelling,” Genome Research, no. 12, pp. 462-469, 2002.
[20]. C. Premalatha and C. a. K. K. Aravindan, “On improving the performance of promoter prediction classifier
for eukaryotes using fuzzy based distribution balanced stratified method.,” in Proceedings of the
International Conference on Advance in Computing, Control, and Telecommunication Technologies IEEE,,
ACT, 2009.
[21]. T. Abeel, Y. Saeys, E. Bonnet and P. Rouzé, “Generic eukaryotic core promoter prediction using structural
features of DNA,” Genome Research, vol. 18, no. 2, pp. 310-323, 2008.
[22]. Y.-J. Zhang, “A novel promoter prediction method inspiring by biological immune principles.,” Global
Congress on Intelligent Systems, no. 569-573, pp. 569-573, 2009.
[23]. S. S. Roded, S. Karni and Y. Felder, Promoter Sequence Analysis, Lecture 11,, 2007.
[24]. A. Dombroski, W. Walter and M. Record, “Polypeptides containing highly conserved regions of
transcription initiation factors 70 exhibit specificity of binding to promoter DNA,” Cell, p. 501–512, 1992 .
[25]. H. Bujard, M. Brenner and W. Kammerer, Structure-function relationship of Escherichia coli promoters,
New York: Elsevier, 1997.
[26]. D. Grana, T. Gardella and M. M. Susskind, “The effects of mutations in the ant promoter of phage P22,”
Genetics, p. 319–327, 1998.
[27]. M. Craig, M. L. Suh and T. Record, “HOz and DNase I probing of Es70RNA polymerase-l PR promoter
open complexes: Mg21binding and its structural consequences at the transcription start site,” Bio-
chemistry, p. 15624–15632, 1995.
[28]. D. Frisby and P. Zuber, “Analysis of the upstream activating sequence and site of carbon and nitrogen
source repression in the promoter of anearly-induced sporulation gene of Bacillus subtilis.,” Bacteriol, p.
7557–7564, 1991.
[29]. R. Wilma, A. Sarah and J. Salomon, “Escherichia Colipromoters With UP Elements Of Different Strengths:
Modular Structure Of Bacterial Promoters,” Journal Of Bacteriology, p. 5375–5383, 1998.
[30]. A. Frank and A. Asuncion, “UCI machine learning repository,” 2010. [Online]. Available:
http://archive.ics.uci.edu/ml/.
[31]. B. Efron, “Estimating the error rate of a prediction rule: improvement on cross-validation,” J. Am. Stat.
Assoc., no. 78, p. 316–331, 1983.
[32]. Essentials of Cell Biology, Nature Education, 2010.
[33]. Q. Luo and W. a. L. P. Yang, “Promoter recognition based on the interpolated Markov chains optimized via
simulated annealing and genetic algorithm,” Recognition Letters Pattern, vol. 9, no. 27, pp. 1031-1036,
2006.
[34]. S. Clancy, “Nature Education,” DNA transcription, vol. 1, no. 1, p. 41, 2008.
Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method)
7
[35]. M. Guigo and R. Burset, “Evaluation of gene structure prediction programs,” Genomics, vol. 3, no. 34, pp.
353-367, 1996.
[36]. M. Wang, M. Yin and T. Jason, “GeneScout: a data mining system for predicting vertebrate genes in
genomic DNA sequences,” Information Sciences, vol. 163, no. Special issue, pp. 201-218, 2013.
[37]. T. S. Y. B. E. R. P. a. V. d. P. Y. Abeel, “Generic eukaryotic core promoter prediction using structural
features of DNA,” Genome Research, vol. 18, no. 2, pp. 310-323, 2008.

More Related Content

What's hot

Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Afra Fathima
 
Functional genomics, a conceptual approach
Functional genomics, a conceptual approachFunctional genomics, a conceptual approach
Functional genomics, a conceptual approach
KAUSHAL SAHU
 
11.06.13 - 2013 NIH Research Festival Poster
11.06.13 - 2013 NIH Research Festival Poster11.06.13 - 2013 NIH Research Festival Poster
11.06.13 - 2013 NIH Research Festival Poster
Farhoud Faraji
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programs
MugdhaSharma11
 
Tyler functional annotation thurs 1120
Tyler functional annotation thurs 1120Tyler functional annotation thurs 1120
Tyler functional annotation thurs 1120
Sucheta Tripathy
 
20140613 Analysis of High Throughput DNA Methylation Profiling
20140613 Analysis of High Throughput DNA Methylation Profiling20140613 Analysis of High Throughput DNA Methylation Profiling
20140613 Analysis of High Throughput DNA Methylation Profiling
Yi-Feng Chang
 
BIOL335: How to annotate a genome
BIOL335: How to annotate a genomeBIOL335: How to annotate a genome
BIOL335: How to annotate a genome
Paul Gardner
 
2011-NAR
2011-NAR2011-NAR
2011-NAR
Wei-Chih Tsai
 
2015 bioinformatics wim_vancriekinge
2015 bioinformatics wim_vancriekinge2015 bioinformatics wim_vancriekinge
2015 bioinformatics wim_vancriekinge
Prof. Wim Van Criekinge
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
Jajati Keshari Nayak
 
Prediction of mi-RNA related to late blight disease of potato
Prediction of mi-RNA related to late blight disease of potatoPrediction of mi-RNA related to late blight disease of potato
Prediction of mi-RNA related to late blight disease of potato
Animesh Kumar
 
HHMI Poster
HHMI PosterHHMI Poster
HHMI Poster
Mitchell Go
 
Improved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay DesignsImproved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay Designs
Thermo Fisher Scientific
 
Austin Neurology & Neurosciences
Austin Neurology & NeurosciencesAustin Neurology & Neurosciences
Austin Neurology & Neurosciences
Austin Publishing Group
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
Ashfaq Ahmad
 
Genomics experimental-methods
Genomics experimental-methodsGenomics experimental-methods
Genomics experimental-methods
Prof. Wim Van Criekinge
 
Examining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingExamining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencing
Stephen Turner
 
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
TELKOMNIKA JOURNAL
 
Human genome project
Human genome projectHuman genome project
Human genome project
Shital Pal
 
Gene identification and discovery
Gene identification and discoveryGene identification and discovery
Gene identification and discovery
Amit Ruchi Yadav
 

What's hot (20)

Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Functional genomics, a conceptual approach
Functional genomics, a conceptual approachFunctional genomics, a conceptual approach
Functional genomics, a conceptual approach
 
11.06.13 - 2013 NIH Research Festival Poster
11.06.13 - 2013 NIH Research Festival Poster11.06.13 - 2013 NIH Research Festival Poster
11.06.13 - 2013 NIH Research Festival Poster
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programs
 
Tyler functional annotation thurs 1120
Tyler functional annotation thurs 1120Tyler functional annotation thurs 1120
Tyler functional annotation thurs 1120
 
20140613 Analysis of High Throughput DNA Methylation Profiling
20140613 Analysis of High Throughput DNA Methylation Profiling20140613 Analysis of High Throughput DNA Methylation Profiling
20140613 Analysis of High Throughput DNA Methylation Profiling
 
BIOL335: How to annotate a genome
BIOL335: How to annotate a genomeBIOL335: How to annotate a genome
BIOL335: How to annotate a genome
 
2011-NAR
2011-NAR2011-NAR
2011-NAR
 
2015 bioinformatics wim_vancriekinge
2015 bioinformatics wim_vancriekinge2015 bioinformatics wim_vancriekinge
2015 bioinformatics wim_vancriekinge
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Prediction of mi-RNA related to late blight disease of potato
Prediction of mi-RNA related to late blight disease of potatoPrediction of mi-RNA related to late blight disease of potato
Prediction of mi-RNA related to late blight disease of potato
 
HHMI Poster
HHMI PosterHHMI Poster
HHMI Poster
 
Improved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay DesignsImproved Algorithm for Amplicon Sequencing Assay Designs
Improved Algorithm for Amplicon Sequencing Assay Designs
 
Austin Neurology & Neurosciences
Austin Neurology & NeurosciencesAustin Neurology & Neurosciences
Austin Neurology & Neurosciences
 
Structural genomics
Structural genomicsStructural genomics
Structural genomics
 
Genomics experimental-methods
Genomics experimental-methodsGenomics experimental-methods
Genomics experimental-methods
 
Examining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingExamining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencing
 
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi...
 
Human genome project
Human genome projectHuman genome project
Human genome project
 
Gene identification and discovery
Gene identification and discoveryGene identification and discovery
Gene identification and discovery
 

Viewers also liked

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
Elliptic curve scalar multiplier using karatsuba
Elliptic curve scalar multiplier using karatsubaElliptic curve scalar multiplier using karatsuba
Elliptic curve scalar multiplier using karatsuba
IAEME Publication
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
IJERD Editor
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
Efficient implementation of bit parallel finite
Efficient implementation of bit parallel finite Efficient implementation of bit parallel finite
Efficient implementation of bit parallel finite
eSAT Journals
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 

Viewers also liked (9)

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Elliptic curve scalar multiplier using karatsuba
Elliptic curve scalar multiplier using karatsubaElliptic curve scalar multiplier using karatsuba
Elliptic curve scalar multiplier using karatsuba
 
www.ijerd.com
www.ijerd.comwww.ijerd.com
www.ijerd.com
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Efficient implementation of bit parallel finite
Efficient implementation of bit parallel finite Efficient implementation of bit parallel finite
Efficient implementation of bit parallel finite
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 

Similar to International Journal of Engineering Research and Development

Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEG
Long Pei
 
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptxBTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
ChijiokeNsofor
 
Gene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptxGene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptx
University of Petroleum and Energy studies
 
Particle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster IdentificationParticle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster Identification
Editor IJCATR
 
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
International Journal of Engineering Inventions www.ijeijournal.com
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
saswat tripathy
 
Structural annotation................pptx
Structural annotation................pptxStructural annotation................pptx
Structural annotation................pptx
Cherry
 
Analysis of Genomic and Proteomic Sequence Using Fir Filter
Analysis of Genomic and Proteomic Sequence Using Fir FilterAnalysis of Genomic and Proteomic Sequence Using Fir Filter
Analysis of Genomic and Proteomic Sequence Using Fir Filter
IJMER
 
HGP, the human genome project
HGP, the human genome projectHGP, the human genome project
HGP, the human genome project
Bahauddin Zakariya University lahore
 
functional genomics.ppt
functional genomics.pptfunctional genomics.ppt
functional genomics.ppt
Ahmed al jammal
 
project
projectproject
project
Ashish Tomar
 
27 20 dec16 13794 28120-1-sm(edit)genap
27 20 dec16 13794 28120-1-sm(edit)genap27 20 dec16 13794 28120-1-sm(edit)genap
27 20 dec16 13794 28120-1-sm(edit)genap
nooriasukmaningtyas
 
SAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene ExpressionSAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene Expression
Aashish Patel
 
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA SequenceEfficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
IJSTA
 
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse GenomeA Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
Daniel Wachtel
 
RT-PCR and DNA microarray measurement of mRNA cell proliferation
RT-PCR and DNA microarray measurement of mRNA cell proliferationRT-PCR and DNA microarray measurement of mRNA cell proliferation
RT-PCR and DNA microarray measurement of mRNA cell proliferation
IJAEMSJORNAL
 
Genome annotation
Genome annotationGenome annotation
Genome annotation
Shifa Ansari
 
genomeannotation-160822182432.pdf
genomeannotation-160822182432.pdfgenomeannotation-160822182432.pdf
genomeannotation-160822182432.pdf
VidyasriDharmalingam1
 
The research and application progress of transcriptome sequencing technology (i)
The research and application progress of transcriptome sequencing technology (i)The research and application progress of transcriptome sequencing technology (i)
The research and application progress of transcriptome sequencing technology (i)
creativebiolabs11
 
An26247254
An26247254An26247254
An26247254
IJERA Editor
 

Similar to International Journal of Engineering Research and Development (20)

Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEG
 
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptxBTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
BTC 506 Gene Identification using Bioinformatic Tools-230302130331.pptx
 
Gene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptxGene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptx
 
Particle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster IdentificationParticle Swarm Optimization for Gene cluster Identification
Particle Swarm Optimization for Gene cluster Identification
 
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...Predicting Functional Regions in Genomic DNA Sequences Using  Artificial Neur...
Predicting Functional Regions in Genomic DNA Sequences Using Artificial Neur...
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
Structural annotation................pptx
Structural annotation................pptxStructural annotation................pptx
Structural annotation................pptx
 
Analysis of Genomic and Proteomic Sequence Using Fir Filter
Analysis of Genomic and Proteomic Sequence Using Fir FilterAnalysis of Genomic and Proteomic Sequence Using Fir Filter
Analysis of Genomic and Proteomic Sequence Using Fir Filter
 
HGP, the human genome project
HGP, the human genome projectHGP, the human genome project
HGP, the human genome project
 
functional genomics.ppt
functional genomics.pptfunctional genomics.ppt
functional genomics.ppt
 
project
projectproject
project
 
27 20 dec16 13794 28120-1-sm(edit)genap
27 20 dec16 13794 28120-1-sm(edit)genap27 20 dec16 13794 28120-1-sm(edit)genap
27 20 dec16 13794 28120-1-sm(edit)genap
 
SAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene ExpressionSAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene Expression
 
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA SequenceEfficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
 
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse GenomeA Comparative Encyclopedia Of DNA Elements In The Mouse Genome
A Comparative Encyclopedia Of DNA Elements In The Mouse Genome
 
RT-PCR and DNA microarray measurement of mRNA cell proliferation
RT-PCR and DNA microarray measurement of mRNA cell proliferationRT-PCR and DNA microarray measurement of mRNA cell proliferation
RT-PCR and DNA microarray measurement of mRNA cell proliferation
 
Genome annotation
Genome annotationGenome annotation
Genome annotation
 
genomeannotation-160822182432.pdf
genomeannotation-160822182432.pdfgenomeannotation-160822182432.pdf
genomeannotation-160822182432.pdf
 
The research and application progress of transcriptome sequencing technology (i)
The research and application progress of transcriptome sequencing technology (i)The research and application progress of transcriptome sequencing technology (i)
The research and application progress of transcriptome sequencing technology (i)
 
An26247254
An26247254An26247254
An26247254
 

More from IJERD Editor

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
IJERD Editor
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACE
IJERD Editor
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
IJERD Editor
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’
IJERD Editor
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding Design
IJERD Editor
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and Verification
IJERD Editor
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
IJERD Editor
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
IJERD Editor
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive Manufacturing
IJERD Editor
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string value
IJERD Editor
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
IJERD Editor
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
IJERD Editor
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
IJERD Editor
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
IJERD Editor
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
IJERD Editor
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
IJERD Editor
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF Dxing
IJERD Editor
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
IJERD Editor
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart Grid
IJERD Editor
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powder
IJERD Editor
 

More from IJERD Editor (20)

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACE
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding Design
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and Verification
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive Manufacturing
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string value
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF Dxing
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart Grid
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powder
 

Recently uploaded

GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 

Recently uploaded (20)

GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 

International Journal of Engineering Research and Development

  • 1. International Journal of Engineering Research and Development e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com Volume 9, Issue 10 (January 2014), PP. 01-07 1 Determining the Promoter Region in the DNA using ke-Rem (ke-Rule Extraction Method) Günay Karlı1 1 International Burch University, Faculty of Engineering and IT, IT Department, Sarajevo, Bosnia and Herzegovina. Abstract:- Biologists endeavor to reveal the secrets of life by looking into gene sequences. Concordantly, determining the promoter region in the DNA is an important step in the process of detecting genes. However the gene sequence data grow too huge recently. Conventional methods remain incapable to predict promoter so it becomes increasingly important to automate the identification of functional elements, such as coding region, genomic region or promoters. Thus many computer scientists are interested in the biological technology, and improve some data mining methods which take advantages of computer power to see into gene sequences. In this study, we employ ke-REM (ke-Rule Extraction Method) classifier to predict promoters of DNA sequences, and evaluate their performances. The obtained results show that the classifier competes the existing techniques for identifying promoter regions. Keywords:- Promoter prediction, inductive learning, data mining, bioinformatics, DNA. I. INTRODUCTION Genome coding segments for transfer ribonucleic acids (tRNAs), messenger ribonucleic acids (mRNAs) and ribosomal ribonucleic acids (rRNAs) are known as genes [1]. Proteins have amino acids whose sequence is determined by mRNAs. Prokaryotic cells have a simple mechanism since all the genes tend to be converted into the corresponding mRNA and then finally into proteins [2]. Gene finding or genome analysis generally relates to that part of computational biology that involves identification of stretches of sequence algorithmically, and it is basically the genomic DNA which is functional biologically. This particularly incorporates protein-coding genes as well as various functional elements like regulatory regions and RNA genes. To clearly understand a species‟ genome, one of the most crucial and significant step is to understand gene finding. For prokaryotes, it is relatively simple to predict the Computational Gene since all the genes are basically converted into the corresponding mRNA and finally into proteins. However, for eukaryotic cells, the process becomes more complicated since there is interruption of coding of the DNA sequence by random sequences commonly known as introns. There are a number of questions which biologists have currently attempted to answer and they include [3]:  What section of a DNA sequence codes for a protein and what section is considered as junk DNA?  How can a junk DNA be classified as intron, transposes, untranslated region, regulatory elements, dead genes, etc.?  Dividing a genome that has been newly sequenced into the coding genes and non-coding regions. In a DNA, determination of the promoter region is a crucial step in the process of detecting genes and this implies that the problem of determining a promoter is of major significance in biology [4] [5]. Biologists have tried to analyze the secrets of life by investigating the gene sequence. Nevertheless, the growth of data on the gene sequence has been growing rapidly. Therefore, most of computer scientists use biological technology to come up with methods generated by the power of a computer to see gene sequences [6]. Despite the fact that approaches for determining regions of coding in genome DNA sequences have been in existence since the nineteenth century, programs designed for combining coding sequences into mRNA sequences that could be translated were invented in early 20th century [7]. Biologists have since then had various programs at their disposal including GenViewer [8], GeneID [7], GenLang [9], GeneParser [10], FGENEH [11],
  • 2. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 2 SORFIND [12], Xpound [13], GRAIL [14], VEIL [15], GenScan [16], etc. Two tools, which include GRAIL and GenScan, are the most widely used tools both in the industry and in academics [17]. Most of the above approaches are founded on motifs searching in regard to DNA sequence to establish whether it forms a promoter or not [18]. In this context, the search is done through the help of Markov models and position weight matrices [19][20]. Artificial intelligence has also been used to supplement statistical intelligence. More specifically artificial neural networks have produced acceptable values with the only disadvantage being high positive rates that are false [21][22]. However, this has not curtailed their application to solve other bioinformatics problems. Computational method for predicting accurate promoter is still to be wholly developed. Currently, biologists use ke-REM. The purpose of this paper is to explicate ke-REM can be successful be used for promoter sequences. A. What is promoter and importance of promoter prediction? A promoter by definition is a DNA non-coding region responsible for initiating transcriptions a specific gene. They are usually located on the upstream and on the same strand of the DNA-towards the anti-sense strand‟s 3‟ region also referred to as the template strand. A promoter may be characterized by a nucleotide sequence of between 100 to 1000 base pair long [23]. A special enzyme, RNA polymerase, is needed for mRNA transcription. This enzyme needs to attach itself to the DNA near a gene for it to qualify to be called a promoter sequence. Sequences comprises of specific response and DNA sequences responsible for providing a fully secure primary binding site for the enzyme as well as for proteins referred to as transcription factors. Eukaryotic and Prokaryotic promoters vary from each other. In prokaryotic organism ς70 sigma factor is able to identify specific promoter sequences, which in this case are 5‟TATAAT3‟ and 5‟TTGACA3‟ (-10 and -35 respectively) through the help of ς70 subunit of the polymerase enzyme[24]. Eukaryotic organism on the other hand is more complex requiring at least 7 different factors for the polymerase II enzyme to bind to the promoter. The promoter intensity correlates with identity degree to the sequence but separated by the spacer length. Dense promoters are however founded closer to gene [25][26]. It has for long been thought that for transcription activity to be optimal, various promoter elements‟ combinations including -35 and -10 hexamers, must be in existence coupled with downstream and upstream regions [25]. According to this school of thought, RNAP works both regions of the two hexamers in sequence and promoters of A+ T-rich sequences upstream of the −35 hexamer [26][27] in several E. coli or Bacillus subtilis were identified as facilitating increased transcription in vitro when accessory proteins were absent [28]. Different upstream sequences show different effects on transcription increasing it from a mere 1.5 to 90 fold [29]. Those promoter sequences that are characterized by powerful binding affinity have a direct effect on mRNA transcription. Regardless of whether a transcribed DNA sequence can be identified through biological testing or not, experiments are known to be time consuming and costly. The promoter prediction approach can however narrow promoter regions amongst huge DNA sequences. A subsequent experiment can be established and tested thus saving time and money [6]. II. METARIAL AND METHODS There are two core classes of the promoter prediction, namely „+‟ and „-‟. These classes will denote the existence of promoter prediction in the DNA sequence, having the „+‟ denoting for a positive indication of promoter location in the DNA sequence and the „-‟ denoting the absence of promoter locations in the DNA sequence. This research paper proposes to deal with a supervised learning technique in the prediction of promoter regions in the DNA sequence. A. Data set The research sought to incorporate the E. Cole promoter gene arrays of DNA in the testing the proficiency of ANN. Such data were collected from the UCI Repository [30]. This contains a set of 106 promoter and non- promoter instances. The research paper notes that such data is viable in the comparisons of ANN with the models existing in the literature; additionally such information involving the use of the data set is publicly available [5].
  • 3. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 3 The 106 DNA arrays are composed of 57 nucleotides each. 53 of the DNA sequences in the data set had a „+‟ denoting, indicating the presence of promoter location in the DNA array. The research then sought to align the (+) parameter instances separately allowing for transcription. The following data characterize the (+) instances as observed from the experiment. One is that for every occurrence the (+) represents for the promoter positive presence, a name was also given in each instance and a classification of the DNA array was made composing of A, T, G and C stand for Adenine, Thymine, Guanine, Cytosine [30]. B. ke-REM (ke-Rule Extraction Method) This section introduces the novel development referred to as ke-REM (ke-Rule Extraction Method) and addresses its ability in utilizing DNA promoter region predictions. As provided for above, an e.coli dataset consists of a total of 106 DNA sequences, each containing a length of 57 nucleotides. The computer science perspective expresses the dataset for e-coli as consisting of 106 instances containing 57 attributes bearing four values. The attributes for these instances can be expressed as nucleotides locations for the 57-element sequence. Each attribute accommodates 4 values, namely T-Thymine, A-Adenine, C-Cytosine and G-Guanine. ke-REM constructs a rule-base by applying the data set attribute-value pairs. In an effort to generate a robust rule-base, attribute-value pairs with significant importance are used. The significant question at this point queries, “How are pairs with significant informational value determined?” the new ke-REM upgrade uses a “gain function” in computing the informational value for the set‟s pair. ke-REM considers the higher gain value as a higher informational value indicator. Therefore, the attribute-value with a higher value has a greater priority in the processing of rule-base for the prediction system. keREM (ke-Rule Extraction Method) was upgraded to have the ability to obtain IF-THEN rules from a given set of examples. It proactively discards encountered pitfalls commonly present in inductive learning algorithms. keREM applies the gain function value, to determine which attributes are of significant importance and are thus given a higher priority and as such, serve to further provide rules that are more commonly acceptable. The following is a summary of the algorithm: Step 1: In a particular training set, a person computes class distribution and probability distribution rate of every attribute-value. Step 2: For every attribute in the data set, you compute the power of classification. Step 3: for every attribute-value pairs, its Class-based Gain is calculated with the use of computed probability distributions, power of classification and class distribution rate. Step 4: One rule of selection is that you can select any value whose probability distributions one for n=1. The next step is to convert the attribute-values into rules and then you mark the classified examples. Step 5: Move to step 8. Step 6: Starting from the first example that is unclassified, you form combinations with the n values by using the attribute-values that has a bigger gain. Step 7: You apply each combination in all examples. Using the values that are made up of n combinations, those that match only with on class are converted into a rule. You mark the classified examples. Step 8: In the training set, when all examples are classified, you move to step 11. Step 9: perform the expression n=n+1 Step 10: go to step 6 if n<N Step 11: Select the most general rule if there is over one rule that represents the same examples. Step 12: End. III. RESULTS AND DISCUSSION The aim of this section is to experimentally analyze our approach for promoter sequences recognition with the use of ke-REM and compare it with the existing approaches. Ke-REM has an important feature which allows it to compute class-based gain for every attribute-value in a particular training set. First, in this context, the class distribution and probability distribution rate of every nucleotide that forms DNA sequence is computed on the basis of promoter and non-promoter classes. For every attribute in the data set, you contribute power of classification. However, there is no class information in the results for the attribute-value pairs. Therefore, using class distribution
  • 4. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 4 rate, probability distribution and the power of classification, you compute class-based gain for every value in the DNA sequence data set. Using this method, rules that the algorithm produces were formed by attribute-value which has a maximum information value. There are two phases which fulfill the experiments of promoter prediction and they reflect the principles behind a supervised learning algorithm which include testing and training. Classification model is built during training, and during testing, the model is applied for classification of unseen examples that are DNA sequence and they consist of promoter and non-promoter region. You evaluate the performance of ke-REM with the use of a standard 5-fold cross-validation. Dataset is partitioned randomly in the 5-fold cross-validation into five subsets. Every subset contains an equal ratio of both the promoter and non-promoter region. For five times, ke-REM is trained using 4 subsets for each time for training and remaining the 5thh subset for testing. 5 models are generated in this way during cross-validation. You obtain the final prediction performance by averaging the results that are achieved from every model. We determined prediction performance of the algorithm by measuring the threshold-dependent parameters sensitivity (SE), specificity (SP), accuracy (ACC) and Matthew‟s Correlation coefficient (MCC). The following equations were used to calculate ACC, SE, SP and MCC. SE=TP/(TP+FN) (1) SP=TN/(TN+FP) (2) ACC= (TP+TN) / (TP+TN+FP+FN) (3) MCC= ((TP*TN)-(FN*FP))/SQRT((TP+FN)*(TN+FP)*(TP+FP)*(TN+FN)) (4) TP is true positive (promoter predicted as promoter) FN is false negative (promoter predicted as non- promoter) TN is true negative (non- promoter predicted as non- promoter) FP is false positive (non- promoter predicted promoter). The detailed performance of module in term of SE, SP, ACC and MCC is shown in Table 1. Table 1: Performance of ke-REM in term of SE, SP, ACC and MCC Threshold- Dependent Parameters 1. Model 2. Model 3. Model 4. Model 5. Model Average ACC 0.75 0.90 0.80 0.90 0.69 0.8085 SE 0.80 0.90 0.80 1.00 0.54 0.8077 SP 0.70 0.90 0.80 0.80 0.85 0.8092 MCC 0.50 0.80 0.60 0.82 0.40 0.6246 In the literature, the way learning algorithms perform can be evaluated by applying cross-validation with the use of a “leave-one-out” methodology. Leave-one-out cross validation (LOOCV) is considered as a special type of k-fold cross-validation and k represents the number of instance in the data. This implies that in all iteration, close to every data apart from the single observation are utilized for training and then you test the model on the single observation. An estimate that is accurate that is obtained with the use LOOCV is considered as almost unbiased [31]. This is commonly used when the data that is available is rare, particularly in bioinformatics when there is only dozens of data sample available.
  • 5. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 5 Table 2: The errors of some machine learning algorithms on promoter data set. System Errors Classifier REX-1 0/106 Inductive L.A ke-REM 3/106 Class-based gain KBANN 4/106 A hybrid ML system BP 8/106 Standard backpropagation with one layer O'Neill 12/106 Ad hoc tech. from the bio. lit. Near- Neigh 13/106 A nearest neighbours algorithm ID3 19/106 Quinlan's decision builder Unlike the classifiers that have been applied here for promoter prediction (Table 2), ke-REM that has been introduced in this document tends to outperform the present classifier for promoter prediction. When the classification error is considered, it becomes better compared to ID3, KB, NN, O‟Neil and BP. IV. CONCLUSIONS Within the bioinformatics field, the promoter prediction is considered as a crucial problem from a computational perspective. Newly developed ke-REM (ke-Rule Extraction Method) in this document has been proposed as effective in tackling the problem. For rule-base with efficient rule to be generated, you employ attribute- value pairs with higher importance. To calculate information value of the pair in the set, ke-REM uses its own “gain function” to calculate information value. Because value of the higher gain tends to indicate higher information value, a greater priority is given to attribute-value with higher value when it comes to production of rule-base of the predicting system. According the results given above, it is possible to conclude that when ke-REM for promoter prediction is employed it leads to results that are promising. Moreover, additional improvement increases the accuracy of the results that have been obtained. REFERENCES [1]. D. Mount, Bioinformatics: Sequence and Genome Analysis, New York: Cold Spring Lab Press, 2013. [2]. K. Davies, The Sequence, US: Weidenfeld & Nicholson, 2001. [3]. P. Jayaram and K. Bhushan, Bioinformatics For Better Tomorrow, New Delhi: Indian Institute of Technology, 2000. [4]. B. Óscar and B. Santiago, “Cnn-promoter, new consensus promoter prediction program,” Revista EIA, no. 15, pp. 153-164, 2011. [5]. C. Gabriela and M.-I. Bocicor, “Promoter Sequences Prediction Using Relational Association Rule Mining,” Evolutionary Bioinformatics, vol. 8, pp. 181-196, 2012. [6]. J.-W. Huang, Promoter Prediction in DNA Sequences, Kaohsiung,: National Sun Yat-sen University, 2003. [7]. R. Guigo, S. Knudsen, N. Drake and T. Smith, “Prediction of gene structure,” Journal of Molecular Biology, no. 226, pp. 141-157, 1995. [8]. L. Milanesi, N. Kolchanov and I. Rogozin, “GenViewer: A computing tool for protein coding regions prediction in nucleotide sequences,” in the 2nd International Congress on Bioinformatics, Supercomputing and Complex Genome Analysis,, 573-587, 1993. [9]. S. Dong and G. Stormo, “Gene structure prediction by linguistic methods,” Genomics, no. 23, pp. 540-551, 1994. [10]. E. Stormo and E. Snyder, “Identification of coding regions in genomic DNA sequences: An application of dynamic programming and neural networks,” Nucleic Acids Research, no. 21, pp. 607-613, 1993. [11]. V. Solovyev, A. Salamov and C. Lawrence, “Prediction of internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames,” Nucleic Acids Research, no. 22, pp. 5156- 5163, 1994.
  • 6. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 6 [12]. G. Hayden and M. Hutchinson, “The prediction of exons through an analysis of spliceable open reading frames,” Nucleic Acids Research, no. 20, pp. 3453-3462, 1992. [13]. A. Skolnick and M. Thomas, “A probabilistic model for detecting coding regions in DNA sequences,” IMA J. Math. Appl. Med. Biol., no. 11, pp. 149-160, 1992. [14]. Y. Xu, R. Mural and E. Uberbacher, “Constructing gene models from accurately predicted exons: An application of dynamic programming,” Comput. Appl. Biosci, no. 10, pp. 613-623, 1994. [15]. J. Henderson, S. Salzberg and K. Fasman, “Finding genes in DNA with a hidden Markov model,” Journal of Computational Biology, vol. 2, no. 4, pp. 127-141, 1997. [16]. C. Karlin and C. Burge, “Prediction of complete gene structures in human genomic DNA,” J. Mol. Biol, no. 268:, pp. 78-94, 1997. [17]. M. Hrishikesh, S. Nitya and M. Krishna, “An ANN-GA model based promoter prediction in Arabidopsis thaliana using tilling microarray data,” Bioinformation, vol. 6, no. 6, p. 240–243, 2011. [18]. J. Gordon, M. Towsey and J. Hogan, “Improved prediction of bacterial transcription start sites,” Bioinformatics, vol. 22, no. 2, pp. 142-148, 2006. [19]. R. Liu and D. States, “Consensus promoter identification in the human genome utilizing expressed gene markers and gene modelling,” Genome Research, no. 12, pp. 462-469, 2002. [20]. C. Premalatha and C. a. K. K. Aravindan, “On improving the performance of promoter prediction classifier for eukaryotes using fuzzy based distribution balanced stratified method.,” in Proceedings of the International Conference on Advance in Computing, Control, and Telecommunication Technologies IEEE,, ACT, 2009. [21]. T. Abeel, Y. Saeys, E. Bonnet and P. Rouzé, “Generic eukaryotic core promoter prediction using structural features of DNA,” Genome Research, vol. 18, no. 2, pp. 310-323, 2008. [22]. Y.-J. Zhang, “A novel promoter prediction method inspiring by biological immune principles.,” Global Congress on Intelligent Systems, no. 569-573, pp. 569-573, 2009. [23]. S. S. Roded, S. Karni and Y. Felder, Promoter Sequence Analysis, Lecture 11,, 2007. [24]. A. Dombroski, W. Walter and M. Record, “Polypeptides containing highly conserved regions of transcription initiation factors 70 exhibit specificity of binding to promoter DNA,” Cell, p. 501–512, 1992 . [25]. H. Bujard, M. Brenner and W. Kammerer, Structure-function relationship of Escherichia coli promoters, New York: Elsevier, 1997. [26]. D. Grana, T. Gardella and M. M. Susskind, “The effects of mutations in the ant promoter of phage P22,” Genetics, p. 319–327, 1998. [27]. M. Craig, M. L. Suh and T. Record, “HOz and DNase I probing of Es70RNA polymerase-l PR promoter open complexes: Mg21binding and its structural consequences at the transcription start site,” Bio- chemistry, p. 15624–15632, 1995. [28]. D. Frisby and P. Zuber, “Analysis of the upstream activating sequence and site of carbon and nitrogen source repression in the promoter of anearly-induced sporulation gene of Bacillus subtilis.,” Bacteriol, p. 7557–7564, 1991. [29]. R. Wilma, A. Sarah and J. Salomon, “Escherichia Colipromoters With UP Elements Of Different Strengths: Modular Structure Of Bacterial Promoters,” Journal Of Bacteriology, p. 5375–5383, 1998. [30]. A. Frank and A. Asuncion, “UCI machine learning repository,” 2010. [Online]. Available: http://archive.ics.uci.edu/ml/. [31]. B. Efron, “Estimating the error rate of a prediction rule: improvement on cross-validation,” J. Am. Stat. Assoc., no. 78, p. 316–331, 1983. [32]. Essentials of Cell Biology, Nature Education, 2010. [33]. Q. Luo and W. a. L. P. Yang, “Promoter recognition based on the interpolated Markov chains optimized via simulated annealing and genetic algorithm,” Recognition Letters Pattern, vol. 9, no. 27, pp. 1031-1036, 2006. [34]. S. Clancy, “Nature Education,” DNA transcription, vol. 1, no. 1, p. 41, 2008.
  • 7. Determining the Promoter Region in the DNAusing ke-Rem (ke-Rule Extraction Method) 7 [35]. M. Guigo and R. Burset, “Evaluation of gene structure prediction programs,” Genomics, vol. 3, no. 34, pp. 353-367, 1996. [36]. M. Wang, M. Yin and T. Jason, “GeneScout: a data mining system for predicting vertebrate genes in genomic DNA sequences,” Information Sciences, vol. 163, no. Special issue, pp. 201-218, 2013. [37]. T. S. Y. B. E. R. P. a. V. d. P. Y. Abeel, “Generic eukaryotic core promoter prediction using structural features of DNA,” Genome Research, vol. 18, no. 2, pp. 310-323, 2008.