Influenza A virus has repeatedly proven lethal throughout history. Each season the circulating virus usually
arises from a new combination of hemagglutinin and neuraminidase subtypes. It is not possible to determine in
advance in which combination an outbreak will occur, which makes it challenging to develop an efficient,
broadly effective drug or vaccine. In this study, the variation pattern followed by the neuraminidase enzyme of
the pathogen has been derived using the concept of substitution mutation. A transition score matrix has been
calculated, using multiple sequence alignment and un-gapped block identification, to derive the most preferred
substitution for each amino acid. This score matrix has been used to predict the most probable mutations in the
present subtype of neuraminidase and to propose the next subtype in line. The upcoming subtype has been
predicted with an average accuracy of more than 60%, which can be improved further, and the same methodology
can be applied to other highly variable pathogenic viral proteins.
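The idea of deriving preferred substitutions from un-gapped alignment columns can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy alignment, the assumption that sequences are listed in chronological order, and the use of raw substitution counts in place of the paper's transition scores are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

def ungapped_columns(alignment):
    """Return indices of columns that contain no gap character in any sequence."""
    length = len(alignment[0])
    return [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]

def substitution_counts(alignment):
    """Count residue changes between consecutive sequences over un-gapped columns.

    Assumption: the sequences are ordered from oldest to newest, so a change
    from one row to the next is read as a substitution a -> b.
    """
    counts = defaultdict(Counter)
    columns = ungapped_columns(alignment)
    for earlier, later in zip(alignment, alignment[1:]):
        for i in columns:
            a, b = earlier[i], later[i]
            if a != b:
                counts[a][b] += 1
    return counts

def most_preferred(counts):
    """For each residue, pick the replacement residue observed most often."""
    return {a: targets.most_common(1)[0][0] for a, targets in counts.items()}

# Hypothetical toy alignment; real input would come from a multiple sequence alignment tool.
alignment = [
    "MNPNQ-KII",
    "MNPNQ-KIT",
    "MNSNQAKIT",
    "MNSNQAKVT",
]
counts = substitution_counts(alignment)
preferred = most_preferred(counts)
print(dict(counts))
print(preferred)
```

Columns containing a gap are skipped entirely, so only positions aligned across every sequence contribute to the counts. A real score matrix would normally normalize these counts, for example by residue frequency.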
Classification of Gene Expression Data by Gene Combination using Fuzzy Logic (IJARIIE JOURNAL)
The goal of microarray experiments is to identify genes that are differentially transcribed with respect to different
biological conditions of cell cultures and samples. Among the large number of genes present in gene expression
data, only a small fraction is effective for performing a certain diagnostic test. Hence, one of the major tasks
with gene expression data is to find groups of co-regulated genes whose collective expression is strongly
associated with the sample categories or response variables. In this report, a framework is modified to
find informative gene combinations and to classify gene combinations into their relevant subtypes using
fuzzy logic. The genes are ranked by their statistical scores and the highly informative genes are filtered. These
genes are fuzzified to identify 2-gene and 3-gene combinations, and an intermediate value is calculated for each
gene to select the top gene combinations, which are then used to classify lymphoma subtypes with fuzzy rules. Finally,
the accuracy of the top gene combinations is compared with clustering results. Classification is performed using the gene
combinations, and the results are analyzed to assess their accuracy. The work is implemented in Java.
Gene Action for Yield and its Attributes by Generation Mean Analysis in Brinj... (AI Publications)
Genetic studies assist the breeder in understanding the inheritance mechanism and enhance the efficiency of a breeding programme. Knowledge of gene action and its relative contribution to the expression of a character is of great importance. Eggplant yield depends on two components, viz. fruit weight and number of fruits per plant. These traits are quantitative and therefore influenced by multiple genes. The objective of this study was to estimate the main gene effects (additive, dominance and digenic epistasis) and to determine the mode of inheritance for fruit yield and its components. Generation mean analysis was employed in three crosses, viz. Ac-2 x Annamalai, EP-45 x Annamalai and EP-89 x Annamalai, to partition the genetic variance. Among the three crosses studied, the cross Ac-2 x Annamalai had a complementary type of epistasis along with significant additive gene effects and additive x additive interaction effects for all three traits. Considering fruit yield per plant and its attributes, this cross was judged the best for a further selection programme.
Longevity is a highly desirable trait that considerably affects overall profitability. With increased longevity, the mean production of the herd increases because a greater proportion of the culling decisions are based on production. Longevity has not received adequate attention in breeding programs because genetic evaluation for this trait is generally difficult, as some animals are still alive at the time of evaluation. Three basic strategies have therefore been suggested to evaluate longevity in cows. Firstly, cow survival to a specific age, which can be analyzed as a binary trait by either linear or threshold models. Secondly, estimating the life expectancy of live cows and including these records in a linear model analysis. Thirdly, survival analysis, a method that combines the information of dead (uncensored) and alive (censored) cows in the same analysis. This review attempts to shed light on the different strategies for genetic evaluation of longevity in dairy cattle in most developed countries.
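The third strategy, combining uncensored and censored records in one analysis, can be illustrated with a small Python sketch. The review does not name a specific estimator, so the Kaplan-Meier product-limit estimator is used here purely as a familiar example, and the herd records (ages in months) are invented for illustration.

```python
def kaplan_meier(records):
    """Kaplan-Meier survival curve from (time, event) pairs.

    event is True for a dead or culled cow (uncensored record) and False for a
    cow still alive at evaluation time (censored record).
    Returns a list of (time, estimated survival probability) at each event time.
    """
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = sum(1 for time, event in records if time == t and event)
        leaving = sum(1 for time, _ in records if time == t)
        if deaths:
            # Censored cows stay in the risk set until their own time is reached.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

# Hypothetical records: (age in months at death or at evaluation, died?)
cows = [(24, True), (36, False), (36, True), (48, True), (60, False)]
curve = kaplan_meier(cows)
for t, s in curve:
    print(t, round(s, 3))
```

Censored cows never trigger a drop in the curve, but they do enlarge the risk set up to the point where they were last observed, which is how their partial information enters the estimate.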
Gene therapy is a contemporary therapeutic intervention with recent positive results and regulatory approvals either completed or expected in the next several years for various conditions. The evolving view is that gene therapy will ultimately offer hope across a range of otherwise debilitating or difficult-to-treat conditions. The renaissance in gene therapy has seen major development of both non-viral and viral vectors and accelerated preclinical studies and clinical trials. It is therefore timely to address the progress in gene therapy through a special issue presenting reviews on non-viral and viral vectors, including relevant updates on applications of herpes simplex virus (HSV) and adeno-associated virus (AAV) vectors. The purpose of this review is to summarize the general concepts of gene therapy, with a specific focus on monogenic rare diseases in hematology and central nervous system disorders, where burgeoning therapies are currently entering clinical investigation and approaching regulatory approval. Ms. Snehal D. Jadhav | Mr. Jeevan R. Rajguru | Ms. Hina U. Momin | Dr. Mrunal K. Shirsat, "Gene Therapy- Challenges & Success", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-3, April 2020, URL: https://www.ijtsrd.com/papers/ijtsrd30612.pdf Paper URL: https://www.ijtsrd.com/medicine/other/30612/gene-therapy-challenges-and-success/ms-snehal-d-jadhav
Combining ability analysis and nature of gene action for grain yield in Maize... (Agriculture Journal IJOEAR)
Abstract: In the present investigation, combining ability and the nature of gene action were studied for twenty lines, four testers and eighty hybrids obtained from a line x tester biparental crossing scheme. Twelve characters were studied for winter maize in this experiment. Parental variance, line variance, and line x tester variance revealed significant differences in all the characters, whereas tester variance was non-significant for three characters, namely days to 50% anthesis, days to maturity and cob length. The nature and magnitude of gene action showed that dominance variance was the major contributor to hybrid performance for all characters, which means that non-additive gene action is important for hybrid performance. The most promising crosses for higher yield per ha were L8 x T1 (27.63), L9 x T4 (23.44), L3 x T3 (23.41), L16 x T2 (23.03), L3 x T3 (22.81), L1 x T3 (22.51), L20 x T2 (19.48), L13 x T4 (19.47), L7 x T1 (18.22) and L17 x T4 (17.58). These crosses showed high SCA effects for grain yield and, with high parental GCA effects, can be exploited for the development of SCHs because of non-additive gene action.
Stochastic modeling of biological systems is crucial to developing treatments for medical conditions effectively and efficiently. Challenge tests designed to evaluate serotoninergic pathways have widely used intravenous citalopram. Oral citalopram has also been used, but unsatisfactory results were obtained with a dose of 20 mg. We evaluated cortisol, growth hormone and prolactin levels to determine whether a higher oral dose would reproduce results similar to those described for intravenous administration. The threshold level of cortisol is assumed to be a random variable that follows the exponentiated modified Weibull distribution, and under this assumption the survival function of cortisol and its p.d.f. are derived.
Genetic Variability and Morphological Diversity among Open-Pollinated Maize (... (Premier Publishers)
A study to characterize and determine the magnitude of genetic variation among 60 open-pollinated maize varieties was conducted at two contrasting locations in Sierra Leone during the 2015 wet cropping season. Traits such as grain moisture content, anthesis-silking interval, plant and ear heights, number of ears harvested, field weight and grain yield showed moderate to high values of the components of genetic variation, while days to 50% anthesis and silking showed low values. The first two PCA axes explained 54% of the total variation, with the first principal component (PC1) accounting for 35% and PC2 for 19%. The cluster diagram grouped the genotypes into seven main clusters. The results suggest that crosses involving clusters I and V with any other cluster would produce segregants with low grain yields, while crosses between clusters IV, VI and VII would be expected to manifest higher heterosis and could result in segregants with higher grain yields. Significant genetic variability was observed among the genotypes evaluated, which suggests scope for trait improvement through direct selection and hybridization.
Factor and Principal Component Analyses of Component of Yield and Morphologic... (Premier Publishers)
The research was conducted to evaluate the yield performance, genetic variation and diversity of rice genotypes for breeding purposes. Genetic variability and diversity for components of yield and morphological traits were assessed among sixteen lowland rice genotypes at three locations, namely Akungba, Akure and Okitipupa, during the rainy seasons of 2013, 2014 and 2015. The experiment was conducted in a randomized complete block design (RCBD) replicated three times. A plot size of 3 m x 3 m and a spacing of 20 cm x 20 cm were adopted, giving a total plant density of 250,000 stands/ha. Cultural operations such as weeding and fertilizer and pesticide applications were carried out as appropriate. Data were collected on plant height, number of tillers per hill, effective tillers, tillers without panicle, flag leaf length, panicle length, panicle weight, number of grains per panicle, number of spikelets per panicle, one thousand grain weight, grain length, grain width, number of days to panicle initiation, number of days to maturity and grain yield per hill. Factor analysis indicated that the first five factors accounted for 79.3% of the phenotypic variability. Number of tillers, effective tillers with panicle, number of days to flowering and number of days to maturity exhibited a communality of 1.00. The first eight principal components had a cumulative variance of 93.1%, whereas PCs 1 and 2 had eigenvalues greater than 2.0. Factor and principal component analyses therefore identified similar characters as the most important for classifying the variation among rice genotypes. These include grain yield, panicle weight, panicle length, one thousand grain weight and number of effective tillers per hill.
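The stated plant density follows directly from the spacing, and the arithmetic can be checked with a short Python sketch. The function name is hypothetical, and the calculation assumes one stand per 20 cm x 20 cm grid cell with no allowance for borders or alleys.

```python
def stands_in_area(side_a_cm, side_b_cm, row_spacing_cm, plant_spacing_cm):
    """Number of stands that fit in a rectangular area at a regular grid spacing.

    Works in whole square centimetres to avoid floating-point rounding.
    """
    area_cm2 = side_a_cm * side_b_cm
    cell_cm2 = row_spacing_cm * plant_spacing_cm
    return area_cm2 // cell_cm2

# One hectare is 100 m x 100 m, i.e. 10,000 cm x 10,000 cm.
per_hectare = stands_in_area(10_000, 10_000, 20, 20)
# One experimental plot is 3 m x 3 m, i.e. 300 cm x 300 cm.
per_plot = stands_in_area(300, 300, 20, 20)
print(per_hectare, per_plot)
```

Each stand occupies 400 cm², so a hectare of 100,000,000 cm² holds 250,000 stands, which agrees with the density reported in the abstract.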
Integrative bioinformatics analysis of Parkinson's disease related omics data (Enrico Glaab)
Presentation on statistical meta-analysis of omics data from Parkinson's disease case-control studies. The results are used for a comparative analysis against aging-related omics alterations in the brain and a prioritization of new candidate disease genes using the phenologs approach.
RT-PCR and DNA microarray measurement of mRNA cell proliferation (IJAEMSJORNAL)
For mRNA quantification, reverse transcription-polymerase chain reaction (RT-PCR) and DNA microarrays have been
compared in only a few studies. Healing callus of adult and juvenile rats after femur injury was found to be rich in mRNA at
various stages of the healing process. We used both methods to examine ten samples and a total of 26 genes.
Internal DNA probes tagged with 32P were employed in RT-PCR to identify genes.
Ten Affymetrix® Rat U34A cRNA microarrays were hybridized with
biotin-labeled cRNA generated from mRNA. There was a wide range of correlation coefficients (r) between
RT-PCR and microarray data for each gene. Meaning became genetically unique because of this diversity.
Relatively lowly expressed genes had the highest r values. The distance between PCR primers and
microarray probes was found to be higher than previously assumed, leading to a drop in agreement between
microarray calls and PCR outcomes. Microarray research showed that RT-PCR expression levels for two
genes had a "floor effect." As a result, PCR primers and microarray probes that overlap in mRNA expression
levels can provide good agreement between these two techniques.
Study of genetic variability in germplasm of common bread wheat
Similar to Development Of Method To Derive Variation Pattern In Neuraminidase Enzyme Of Influenza-A Virus And Predict The Most Probable Upcoming Subtype.
Innovative Technique for Gene Selection in Microarray Based on Recursive Clus... (AM Publications)
Gene selection is usually the crucial step in microarray data analysis. A great deal of recent research has focused on the
challenging task of selecting differentially expressed genes from microarray data (‘gene selection’). Numerous gene selection
algorithms have been proposed in the literature, but it is often unclear exactly how these algorithms respond to conditions like
small sample sizes or differing variances. Choosing an appropriate algorithm can therefore be difficult in many cases. This paper
presents an innovative method for gene selection that combines Analysis of Variance (ANOVA), Principal Component Analysis
(PCA), KNN classification and Recursive Cluster Elimination (RCE). It reduces the gene expression data to a
minimal gene subset. At each step, redundant and irrelevant features are
eliminated. Classification accuracy reaches up to 99.10%, and classification takes less time than with other conventional techniques.
Hiv Replication Model for The Succeeding Period Of Viral Dynamic Studies In A... (inventionjournals)
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews within the whole field of Mathematics and Statistics, new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Comparing Genetic Evolutionary Algorithms on Three Enzymes of HIV-1: Integras... (CSCJournals)
In this work, we utilized Quantitative Structure-Activity Relationship (QSAR) techniques to develop predictive models for inhibitors of the HIV-1 enzymes Integrase, HIV-Protease, and Reverse Transcriptase. Each predictive model was composed of quantitative drug characteristics that were selected by genetic evolutionary algorithms, such as Genetic Algorithm (GE), Differential Evolutionary Algorithm (DE), Binary Particle Swarm Optimization (BPSO), and Differential Evolution with Binary Particle Swarm Optimization (DE-BPSO). After characteristic selection, each model was tested with machine-learning algorithms such as Multiple Linear Regression (MLR), Support Vector Machine (SVM), and Multi-Layer Perceptron neural networks (MLP/ANN). We found that DE-BPSO combined with Multi-Layer Perceptron produced the most accurate predictive models as measured by R2, the statistical measure of proportion of variance in prediction values, and root-mean-square error (RMSE) of prediction values compared to observed values. As for the models themselves: the best predictors for Integrase inhibitors included mass-weighted centred Broto-Moreau autocorrelation values, Moran autocorrelations, and eigenvalues of Burden matrices weighted by I-states; the best predictors for HIV-Protease inhibitors included the second Zagreb index value, the normalized spectral positive sum from Laplace matrix, and the connectivity-like index of order 0 from edge adjacency mat; and the best predictors for Reverse Transcriptase inhibitors included the number of hydrogen atoms, the molecular path count of order 7, the centred Broto-Moreau autocorrelation of lag 2 weighted by Sanderson electronegativity, the P_VSA-like on ionization potential, and the frequency of C – N bonds at topological distance 3.
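The two accuracy measures named above, R2 and RMSE, can be computed as in the following Python sketch. It uses the common definition of R2 as one minus the ratio of residual to total sum of squares, which is an assumption about the authors' exact formula, and the observed and predicted values are invented for illustration.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root-mean-square error of predictions against observed values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical activity values for four inhibitors.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

A higher R2 and a lower RMSE both indicate predictions closer to the observed values, which is why the two are usually reported together when comparing models.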
Receptor binding and antigenic site analysis of hemagglutinin gene fragments ... (UniversitasGadjahMada)
We report a retrospective study on hemagglutinin (HA) gene fragments of Avian Influenza (AI) viruses recovered between 2010 and 2012, using reverse transcriptase polymerase chain reaction (RT-PCR) followed by sequencing. The results provide information about the receptor binding sites (RBS) and antigenic site characteristics of the HA gene of AI viruses in Indonesia. Viral RNA was extracted from the allantoic fluid of specific-pathogen-free (SPF) embryonated chicken eggs inoculated with AI-suspected samples. Amplification was performed using H5-specific primers to produce an amplification target of 544 bp. The resulting sequences were analyzed with MEGA-5, including multiple alignment, deduced amino acid prediction, and phylogenetic tree analysis. Of the 12 samples amplified using RT-PCR, only 7 were detected as avian influenza serotype H5 viruses. Sequence analysis of the AIV H5 positive samples showed a binding preference towards avian-type receptors. Antigenic site analysis was consistent with the previous report; however, the residue at position 189 of antigenic site B had mutated from arginine to methionine. Phylogenetic tree analysis showed that these viruses clustered into clade 2.1.3. Our report supports the importance of the previous study of RBS and antigenic properties of HPAI H5N1 in Indonesia.
The Convergence Speed of Single- And Multi-Objective Immune Algorithm Based O... (CSCJournals)
Despite the considerable amount of research related to immune algorithms and their applications in numerical optimization, digital filter design, and data mining, there is still little work on issues as important as sensitivity analysis [1]-[4]. Other aspects, such as convergence speed and parameter adaptation, have been practically disregarded in the current specialized literature [7]-[8]. The convergence speed of the immune algorithm depends heavily on its main control parameters: population size, replication rate, mutation rate, clonal rate and hyper-mutation rate. In this paper we investigate the effect of varying the control parameters on the convergence speed for single- and multi-objective optimization problems. Three examples are devoted to this purpose, namely the design of a 2-D recursive digital filter, minimization of a simple function, and the banana function. The effect of each parameter on the convergence speed of the IA is studied by holding the other parameters at fixed values and taking the average of 100 independent runs. The resulting rules are then applied to some examples introduced in [2] and [3]. Computational results show how to select the immune algorithm parameters to speed up convergence and to obtain the optimal solution.
A Comparative Analysis of Feature Selection Methods for Clustering DNA Sequences (CSCJournals)
Large-scale analysis of genome sequences is in progress around the world, the major application of which is to establish the evolutionary relationships among species using phylogenetic trees. Hierarchical agglomerative algorithms can be used to generate such phylogenetic trees given a distance matrix representing the dissimilarity among the species. ClustalW and Muscle are two general-purpose programs that generate a distance matrix from input DNA or protein sequences. The limitation of these programs is that they are based on the Smith-Waterman algorithm, which uses dynamic programming for pair-wise alignment. This is an extremely time-consuming process, and the existing systems may even fail to work for larger input data sets. To overcome this limitation, we have used the frequency of codon usage as an approximation to find the dissimilarity among species. The proposed technique further reduces the complexity by extracting only the significant features of the species from the mtDNA sequences, using techniques such as frequent codons, codons with maximum range value, or PCA. We have observed that the proposed system produces nearly accurate results in a significantly reduced running time.
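The alignment-free idea described above can be sketched in a few lines of Python. This is only an illustration of codon-usage profiles as a dissimilarity measure: the reading frame starting at position 0, the Euclidean distance between frequency vectors, and the toy sequences are all assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def codon_frequencies(dna):
    """Relative frequency of each codon, reading in frame from position 0."""
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}

def codon_distance(seq_a, seq_b):
    """Euclidean distance between two codon-usage profiles (an assumed metric)."""
    fa, fb = codon_frequencies(seq_a), codon_frequencies(seq_b)
    all_codons = set(fa) | set(fb)
    return math.sqrt(sum((fa.get(c, 0.0) - fb.get(c, 0.0)) ** 2 for c in all_codons))

def distance_matrix(sequences):
    """Symmetric matrix of pairwise codon-usage distances."""
    return [[codon_distance(a, b) for b in sequences] for a in sequences]

# Hypothetical short sequences standing in for mtDNA from three species.
species = ["ATGAAAATGAAA", "ATGATGATGAAA", "CCCGGGCCCGGG"]
matrix = distance_matrix(species)
for row in matrix:
    print([round(d, 3) for d in row])
```

Each profile is built in a single pass over the sequence, so no pair-wise alignment is needed. The resulting matrix can then be passed to a hierarchical agglomerative clustering routine to build the tree.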
Modeling and Analysis of Influenza A H1N1 Outbreaks in IndiaYogeshIJTSRD
Influenza A H1N1 incidence in India, which is a highly contagious acute respiratory disease in humans caused by type A influenza virus. This study employs retrospective comparative study of the data from National Centre for Disease Control CDC yearly reports from April 2010 to April 2019. The case fatality rates of Influenza A H1N1 incidence in India was forecasted using autoregressive integrated moving average ARIMA models in order to build a predictive tool for Influenza A H1N1 surveillance. Clearly from the study, lack of rainfall spread the virus more efficiently and Maharashtra stood first in total number of cases and deaths of Influenza A H1N1 whereas Lakshadweep had no signs of disease. Further, number of Cases were reported in the year 2015 i.e. 28 and 25 of cases have been reported in 2017 when compared to ten years data. ARIMA 2, 1, 3 model was selected for its minimum value of normalized BIC, MAPE and good R Square among all other models. The model forecasted the decrease in case fatality rate of Influenza A H1N1 for next 10years. Thus, results indicate that, ARIMA models provide a means to better understand Influenza A H1N1 incidence yielding forecasts that can be used for public health planning at the national level. Stavelin Abhinandithe K | Sathya Velu R | Dr. Madhu B | Sahana S | Sowmyavalli R |Bibin John | Dr. Balasubramanian S "Modeling and Analysis of Influenza A H1N1 Outbreaks in India" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-3 , April 2021, URL: https://www.ijtsrd.com/papers/ijtsrd39862.pdf Paper URL: https://www.ijtsrd.com/other-scientific-research-area/applied-mathamatics/39862/modeling-and-analysis-of-influenza-a-h1n1-outbreaks-in-india/stavelin-abhinandithe-k
ASHG 2015 - Redundant Annotations in Tertiary AnalysisJames Warren
After obtaining genetic variants from next generation sequencing data, a precursory step in tertiary analysis is to annotate each variant with available relevant information. There is no standardized compendium for this purpose; researchers instead are required to compile data from a motley of annotation tools and public datasets. These sources for annotation are independently maintained, and accordingly there is limited concordance between their reported contents. The choice of annotation datasets thus has a direct and significant impact on the results of the analysis.
QSAR Modeling of Bisbenzofuran Compounds using 2D-Descriptors as Antimalarial...ijtsrd
In the present study we have performed Quantitative structure activity relationship (QSAR) analysis for 43bisbenzofuran derivatives to estimate the antimalarial activity using some 2D descriptors. Several significant QSAR models has been calculated for predicting the antimalarial activity (“logIC50) of these molecules by using the multiple linear regression (MLR) technique. Among the obtained QSAR models, a four parametric model was most significant having R2=0.9502. An external set was used for confirming the predictive power of the models. High correlation between experimental and predicted antimalarial activity values, was obtained in the validation approach that displayed the good modality of the derived QSAR models. Tripti Kaushal | Anita K | Bashirulla Shaik | Vijay K. Agrawal"QSAR Modeling of Bisbenzofuran Compounds using 2D-Descriptors as Antimalarial Agents" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-2 , February 2018, URL: http://www.ijtsrd.com/papers/ijtsrd9497.pdf http://www.ijtsrd.com/chemistry/other/9497/qsar-modeling-of-bisbenzofuran-compounds-using-2d-descriptors-as-antimalarial-agents/tripti-kaushal
PREDICTING MORE INFECTIOUS VIRUS VARIANTS FOR PANDEMIC PREVENTION THROUGH DEE...gerogepatton
More infectious virus variants can arise from rapid mutations in their proteins, creating new infection
waves. These variants can evade one’s immune system and infect vaccinated individuals, lowering vaccine
efficacy. Hence, to improve vaccine design, this project proposes Optimus PPIme – a deep learning
approach to predict future, more infectious variants from an existing virus (exemplified by SARS-CoV-2).
The approach comprises an algorithm which acts as a “virus” attacking a host cell. To increase infectivity,
the “virus” mutates to bind better to the host’s receptor. 2 algorithms were attempted – greedy search and
beam search. The strength of this variant-host binding was then assessed by a transformer network we
developed, with a high accuracy of 90%. With both components, beam search eventually proposed more
infectious variants. Therefore, this approach can potentially enable researchers to develop vaccines that
provide protection against future infectious variants before they emerge, pre-empting outbreaks and saving
lives.
Bacterial virulence proteins, which have been classified on structure of virulence, causes
several diseases. For instance, Adhesins play an important role in the host cells. They are
inserted DNA sequences for a variety of virulence properties. Several important methods
conducted for the prediction of bacterial virulence proteins for finding new drugs or vaccines.
In this study, we propose a method for feature selection about classification of bacterial
virulence protein. The features are constituted directly from the amino acid sequence of a given
protein. Amino acids form proteins, which are critical to life, and have many important
functions in living cells. They occurring with different physicochemical properties by a vector of
20 numerical values, and collected in AAIndex databases of known 544 indices.
For all that, this approach have two steps. Firstly, the amino acid sequence of a given protein
analysed with Lyapunov Exponents that they have a chaotic structure in accordance with the
chaos theory. After that, if the results show characterization over the complete distribution in
the phase space from the point of deterministic system, it means related protein will show a
chaotic structure.
Empirical results revealed that generated feature vectors give the best performance with chaotic
structure of physicochemical features of amino acids with Adhesins and non-Adhesins data sets.
Development Of Method To Derive Variation Pattern In Neuraminidase Enzyme Of Influenza-A Virus And Predict The Most Probable Upcoming Subtype.
1. Karishma Agarwal.et al. Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 6, Issue 5, (Part - 4) May 2016, pp.67-72
www.ijera.com 67 | P a g e
Development Of Method To Derive Variation Pattern In
Neuraminidase Enzyme Of Influenza-A Virus And Predict The
Most Probable Upcoming Subtype.
Karishma Agarwal1, Arun Malik1, Nishtha Pandey2, Ravi Kant Pathak2*
1 (Department of Computer Science, Lovely Professional University, Phagwara, India)
2 (Department of Biotechnology, Lovely Professional University, Phagwara, India)
ABSTRACT
The influenza A virus has repeatedly proven lethal throughout recorded history. Each season the circulating
virus typically carries a new combination of hemagglutinin and neuraminidase subtypes. Because the combination
in which an outbreak will occur cannot be determined in advance, developing an efficient, broadly effective
drug or vaccine remains a challenge. In this study, the variation pattern followed by the neuraminidase enzyme
of the pathogen has been derived using the concept of substitution mutation. A transition score matrix has been
calculated, using multiple sequence alignment and ungapped block identification, to derive the substitution
most preferred by each amino acid. This score matrix has been used to predict the most probable mutations in
the present subtype of neuraminidase and to propose the next-in-line subtype. The upcoming subtype has been
predicted with an average accuracy of more than 60%, which can be improved further, and the same methodology
can be applied to other highly variable pathogenic viral proteins.
Keywords - Neuraminidase, Influenza A virus, Transition score, CD-HIT, sequence alignment, variation
pattern.
I. INTRODUCTION
Influenza has been recognized as one of the
deadliest infectious diseases of recent times. It
has affected as much as 40% of the population in
some countries. Avian flu and swine flu are
examples of the pandemics that have occurred. The
Influenza A virus is responsible for causing the flu
pandemics. It can cross the species barrier and can
affect humans as well as animals (Bao et al., 2008).
The seasonal pathogenic strains exhibit different
subtypes depending on the proteins that are expressed
on the surface of the influenza virus. Neuraminidase
(NA) and Hemagglutinin (HA) are the two large
glycoprotein molecules that lie on the surface of the
influenza virus (Ruigrok et al., 1998). The envelope
glycoprotein NA has enzymatic activity. It helps
release newly formed virus particles by cleaving the
attachment of the pathogen to the surface of infected
cells (Hirst, 1942). Because of its pivotal role in the
spread of the infection, NA has been used as a
potential target for antiviral drugs.
Several strategies targeting NA have been
developed to date; however, the subtype of NA
changes with each infection season, which makes
it difficult to devise a specific vaccine. Hence the
vaccine is updated every year (Colacino et al.,
1999). Similarly, the drugs that are used to target
NA, such as oseltamivir (Tamiflu) and zanamivir
(Relenza) (Palese et al., 1976), have also proven
to be somewhat ineffective due to emerging
drug resistance (Russell et al., 2006). Therefore
there has always been a pressing need to engineer
new treatment strategies for the influenza virus
(Barik, 2012). To address this challenge it is
important to understand the pattern of variation
(if any) followed by the antigenic protein (NA). In
this work, it is shown that an amino acid bias is
followed during the transition from one subtype to
another through substitution mutation. A method
has thus been designed to predict the upcoming
subtype from the previous outbreak, based on a
transition score matrix derived through sequence
analysis.
II. MATERIAL AND METHODS
2.1 Data Collection
To build the data set, protein sequences of
different subtypes of Neuraminidase were collected
from the RCSB Protein Data Bank (Berman et al.,
2000). The query used the keyword Neuraminidase
and was further refined with taxonomy set to
Influenza A Virus, experimental method set to
X-Ray, and date of release from 01-01-2010 to
31-07-2015.
2.2 Redundancy Check
It is critical that the collected data be
accurate, random and non-redundant, so that bias
towards sequences that are present in higher
number is eliminated. To check the redundancy
of the data, a cluster analysis was performed
using the tool CD-HIT (Li and Godzik, 2006), and
the repetitions were eliminated to make sure
that the data is accurate and non-redundant.
A representative sequence was derived for each
cluster.
2.3 Multiple Sequence Alignment
MSA was performed with the intent of
determining an ungapped block of sequences. The
alignment of the conserved regions in the input
sequences was visualized using the tool Jalview
(Waterhouse et al., 2009). A consensus sequence was
also obtained from the multiple sequence alignment
of the representative protein sequences. The
underlying idea is that if a change (mutation) occurs
at a particular position in the consensus sequence,
then the effects of this mutation can be mapped to
all the representative sequences that were used to
obtain the consensus sequence (Schneider, 2002).
2.4 Threshold Value
Each position in the consensus sequence is
associated with a value called Percent Identity.
A threshold value of 30% was set because
protein sequences are considered homologous if the
percent identity in the consensus sequence is
greater than or equal to 30% (Pearson, 2013). Only
those positions of the consensus sequence with
a percent identity equal to or higher than 30% were
selected.
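The thresholding step can be illustrated with a minimal Python sketch. It assumes that the percent identity of a position is the share of aligned sequences that carry the most common residue in that column; the paper reads this value from Jalview and does not spell out the formula, so this definition, the function names and the toy block are illustrative assumptions only.

```python
from collections import Counter


def consensus_with_identity(aligned):
    """Return (residue, percent_identity) for every column of an ungapped block.

    Assumption: percent identity = share of sequences carrying the most
    common residue in the column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all sequences in the block must have the same length")
    result = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, 100.0 * count / len(column)))
    return result


def conserved_positions(aligned, threshold=30.0):
    """0-based column indices whose percent identity meets the threshold."""
    scored = consensus_with_identity(aligned)
    return [i for i, (_, pid) in enumerate(scored) if pid >= threshold]


# Toy block (hypothetical data, not taken from the paper).
toy_block = ["GSPK", "GSAK", "GTCK", "ATDK"]
selected = conserved_positions(toy_block)
print(selected)
```

In the toy block the third column holds four different residues (25% each), so it falls below the 30% threshold and is dropped, while the other three columns are kept.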
2.5 Phylogenetic Analysis
A phylogenetic tree was calculated using
the representative sequences obtained from CD-HIT
as input. The tree was calculated with the
neighbor joining method using the BLOSUM62
distance matrix (Saitou and Nei, 1987). Based on the
phylogenetic tree derived from Jalview, an
evolutionary path of NA was derived in the form of
clusters of sequences. These clusters are termed
sister sequences (Martin et al., 2005). Each sister
consists of a set of NA sequences, which signifies
that the sequences included in a particular sister
occurred in the same time period in the evolution of
the virus. A representative sequence was derived
for each sister by selecting a representative amino
acid for each position. The representative amino
acid was chosen based on the occurrence of amino
acids across all the NA protein sequences of a
particular sister: the amino acid with the maximum
occurrence within the sister at a position was
selected as the representative amino acid for that
position.
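The selection of a representative amino acid per position amounts to a column-wise majority vote, which can be sketched as follows. The function name is hypothetical, and the tie-breaking rule (the residue seen first wins) is an assumption, since the paper does not state how ties are resolved.

```python
from collections import Counter


def sister_representative(sequences):
    """Build a representative sequence for one sister by majority vote.

    Assumes the member sequences are already aligned to equal length.
    Ties are resolved in favour of the residue encountered first
    (an assumption; the paper does not specify tie handling).
    """
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("member sequences must be aligned to equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


# Toy sister with three members (hypothetical data).
representative = sister_representative(["ACDE", "ACDF", "GCDE"])
print(representative)
```

A sister with a single member simply returns that member unchanged.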
2.6 Mutational Analysis
All the positions in the consensus that
satisfied the threshold value of 30% identity were
extracted along with the corresponding positions of
all the sisters.
Based on the observed statistical data, a
20x20 transition matrix was calculated. Every cell
of this transition matrix stores a score value,
which is calculated on the basis of relative pair
change frequency. Every score value can be
written as A(i,j), where A is the transition matrix
and A(i,j) is the score of the transition from the
amino acid with index 'i' to the amino acid with
index 'j'. Here, 'i' indexes the rows of the matrix
and 'j' indexes the columns. Every time such a
transition is met, the score value is incremented
by 1. The transition matrix therefore consists of
transition scores, and it is used while making the
prediction.
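A minimal sketch of how such a count matrix could be accumulated is given below. It rests on two assumptions that the text does not confirm: transitions are counted between consecutive sister representatives in chronological order, and only positions where the residue actually changes are counted. The residue ordering in AMINO_ACIDS and the function name are likewise illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # arbitrary but fixed ordering (assumption)
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def build_transition_matrix(sister_representatives, positions):
    """Count substitutions between consecutive sister representatives.

    matrix[i][j] is incremented by 1 each time residue i in the earlier
    sister is replaced by residue j in the later sister at one of the
    selected positions. Unchanged positions are skipped (assumption).
    """
    matrix = [[0] * 20 for _ in range(20)]
    pairs = zip(sister_representatives, sister_representatives[1:])
    for earlier, later in pairs:
        for pos in positions:
            a, b = earlier[pos], later[pos]
            if a != b and a in INDEX and b in INDEX:
                matrix[INDEX[a]][INDEX[b]] += 1
    return matrix


# Three toy representatives in chronological order (hypothetical data).
toy_sisters = ["AIT", "AVT", "AVS"]
A = build_transition_matrix(toy_sisters, positions=[0, 1, 2])
print(A[INDEX["I"]][INDEX["V"]], A[INDEX["T"]][INDEX["S"]])
```

In the toy data the only substitutions are I to V (between the first and second sister) and T to S (between the second and third), so exactly two cells of the matrix are non-zero.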
2.7 Determining the position where prediction is to be made
Pairwise sequence alignment of the input
sequence with the consensus sequence is performed
using EMBOSS-NEEDLE (Needleman and Wunsch,
1970). The amino acids in the input sequence that
are aligned with the consensus sequence are
identified and are considered to be the critical
positions in terms of structure and function.
2.8 Prediction
Each of these critical positions is filtered
based on the threshold PID of 30% and above. The
prediction process is then performed on the resulting
amino acids. The predicted amino acids are then
stored at the same positions of the input sequence.
2.9 Transition Matrix Lookup
The transition matrix is looked up in the
following manner:
1. The result returned by the pairwise alignment of
the consensus and input sequence, i.e. the aligned
amino acids and their respective positions, is stored
in the database.
2. For every aligned amino acid: the corresponding
index i of the amino acid is identified. The scores in
row i of the transition matrix are looked up to
find an index j such that A(i,j) has the maximum
transition value. The amino acid indexed with j is
the predicted amino acid for the specific position.
3. The amino acids other than the critical amino
acids do not undergo any change.
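The three lookup steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool used in the study: the function name, the residue ordering, the rule that a row with no recorded transitions leaves the residue unchanged, and the first-maximum tie-break are all assumptions, and the database storage of step 1 is replaced by a plain list of positions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # arbitrary but fixed ordering (assumption)
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def predict_sequence(sequence, critical_positions, matrix):
    """Replace each critical residue by its highest-scoring successor.

    For residue i at a critical position, the column j maximising
    matrix[i][j] gives the predicted residue. Rows without any recorded
    transition leave the residue unchanged (assumption). All other
    positions are copied as they are.
    """
    predicted = list(sequence)
    for pos in critical_positions:
        i = INDEX.get(sequence[pos])
        if i is None:
            continue  # non-standard residue: keep it
        row = matrix[i]
        j = max(range(20), key=row.__getitem__)  # first maximum wins
        if row[j] > 0:
            predicted[pos] = AMINO_ACIDS[j]
    return "".join(predicted)


# Toy transition scores (hypothetical values).
toy_matrix = [[0] * 20 for _ in range(20)]
toy_matrix[INDEX["I"]][INDEX["V"]] = 3
toy_matrix[INDEX["I"]][INDEX["L"]] = 1
toy_matrix[INDEX["T"]][INDEX["S"]] = 2

prediction = predict_sequence("KITA", critical_positions=[1, 2], matrix=toy_matrix)
print(prediction)
```

Because the lookup depends only on the residue and not on its position, every critical occurrence of a given amino acid receives the same predicted replacement.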
III. RESULTS AND DISCUSSION
3.1 Collection of data
The search in PDB using the keyword
"Neuraminidase" resulted in 338 hits, which when
refined with organism name as "Influenza A Virus"
gave 159 hits. Further refinement with experimental
method as "X-Ray" resulted in 159 hits. Final
refinement by selecting the Date of Release in the
range of 01-01-2010 to 31-07-2015 returned 49 hits.
3.2 Redundancy check
For the redundancy check using CD-HIT,
the value of the parameter "Sequence Identity
cut-off" was set to 1 to ensure the complete
removal of any redundant sequence. The 49
sequences were clustered into 20 unique and
non-redundant clusters. For each of the 20 clusters,
one representative sequence was assigned. The 20
representative sequences are used in the further
processing of the data.
3.3 Multiple sequence alignment
An ungapped block spanning positions 1 to 369
was observed after MSA of the 20
representative sequences. It is shown in Figure 1.
3.4 Consensus sequence
After performing multiple sequence
alignment on the protein sequences, the following
consensus sequence was obtained:
>Consensus/1-466 Percentage Identity Consensus
GSPSNLPKPLCTIPGCSIFGKDNAIRLGSSGDVLVTRE
PYSSCDPDSCDFFACGQGALLRGKHSNGTIKDRTPY
RALISWPLGSPPLLGNSKVECIAVSSSSSHDGKGLGS
ACISGNDNDAAAVIYYGRRALTIIKDSAAIILTTQSSE
CCCICTCCSVVVTDGPAAGSADTRIYIIEGGIIHKKK
EKTSTGIGEEEECSYCYCIVRCCCCRDNNKGNNRPV
RIIDEDANIETGYVCSGIVTDTPRPDDPSTNDKCNNP
NEGGGNGGVGGGGDKGGANTWGGRTISSESSSGY
EIYKVEGAKTKPNSKKLENKQIIVNNDWSGYSGSSG
DYSIESCCCRCCFIEEIGIGGGDVDKEWTSNSIVSFSG
TSNEGGSGGWGDGSNIDGMPLADMDADMALGVM
VSMKEPGWYSFGFEIKDKECDVPCIGIEMVHDGGK
ETWHSAATAIYCLMGSGQLLWDTVTGVDMAL
A threshold of 30% was applied to the
consensus sequence such that all the amino acids
whose score is below 30% in the consensus
sequence are filtered out.
3.5 Phylogenetic analysis
The phylogenetic tree was used to derive various
groups/sisters of sequences, which signify major
chronological mutations. The sequences in each
sister are taken to have occurred in the same time
period during the evolution of NA. A total of 13
sisters were identified, each with one or multiple
sequences, as shown in Table 1.
Table 1: 13 Sisters and the corresponding
sequences that constitute them.
Sister       Sequence PDB_IDs in each sister
Sister 1     4CPL:A, 4CPO:A
Sister 2     4QN4:A
Sister 3     4K3Y:A
Sister 4     4GDI:A, 4GDJ:A
Sister 5     4MC7:A
Sister 6     4GZO:A, 4GZS:A
Sister 7     4DGR:A
Sister 8     4QN3:A
Sister 9     4H52:A, 4H53:A
Sister 10    4MWJ:A, 4MWL:A
Sister 11    4HZV:A, 4HZY:A
Sister 12    3SAL:A
Sister 13    3K36:A, 3K38:A
Figure 1: Ungapped block of 20 representative sequences from position 1 to 369 as
obtained from
3.6 Mutational Analysis
Each value in the transition matrix is
calculated on the basis of relative pair exchange
frequency. Every time such a transition is met, the
score value is incremented by 1. The matrix points
towards the possible amino acid bias followed
by the virus during variation, as shown in Figure 2.
Figure 2: 3-D graph representation of the 20X20
transition matrix representing the transition
frequency of one amino acid to another.
3.7 Input
The latest influenza outbreak was recorded
by WHO on 26th April, 2016, in which a human
tested positive for H7N9. A similar outbreak was
observed a few days earlier by WHO in China on
23rd March, 2016, in which human infection with
avian influenza H5N6 was observed (WHO, "Disease
Outbreak News (DONs)," 2016). This data has been
used to test the validity of the prediction algorithm.
Therefore N6 with PDB-ID 4QN4 was selected as the
input, and the predicted sequence is expected to be
similar to N9.
Input sequence is:
EFGTFLNLTKPLCEVSSWHILSKDNAIR
IGEDAHILVTREPYLSCDPQGCRMFALSQGTTL
RGRHANGTIHDRSPFRALISWEMGQAPSPYNV
RVECIGWSSTSCHDGISRMSICMSGPNNNASA
VVWYGGRPVTEIPSWAGNILRTQESECVCHKG
ICPVVMTDGPANNRAATKIIYFKEGKIQKIEEL
AGNAQHIEECSCYGAVGVIKCVCRDNWKGAN
RPVITIDPEMMTHTSKYLCSKILTDTSRPNDPT
NGNCDAPITGGSPDPGVKGFAFLDGENSWLGR
TISKDSRSGYEMLKVPNAETDTQSGPISHQVIV
NNQNWSGYSGAFIDYWANKECFNPCFYVELIR
GRPKESSVLWTSNSIVALCGSKERLGSWSWHD
GAEIIYFK
The predicted sequence has been observed as:
EFGTFLNLTKPLCEVSSWHILSKDNAV
RIGEDAHILVSREPSLSCDPQGCRMGALSTGTT
LRGRHANGTIHDRSPFRALISWEMGQAPSPYN
VRVECVGWSSTSCHDGISRMSICMSGPNNNAS
AVVWSGGRPVSEVPSWAGNVLRSTESECVCH
KGICPVVMSDGPANNRAASKIIYFKEGKVQKIE
ELAGNAQHIEECSCSGAVGVIKCVCRDNWKG
ANRPVITVDPEMMTHSSKSLCSKILSDSSRPND
PSNGNCDAPITGGSPDPGVKGFAFLDGENSWL
GRTISKDSRSGSEMLKVPNAETDTQSGPISHQV
IVNNQNWSGSSGAFIDSWANKECFNPCGYVEL
IRGRPKESSVLWTSNSVVALCGSKERLGSWSW
HDGAEIIYFK
3.8 Validation
In order to validate the results obtained from
the prediction methodology, the phylogenetic tree of
the input data set was examined. In the phylogenetic
tree, if I is an instance of an input sequence, then P
is the next observed sequence in the tree. Based on
these observations, the input sequence I was
processed using the tool to obtain a prediction
sequence P'. To determine the similarity between
P and P', a pairwise alignment of P and P' was
performed and the similarity percentage was noted.
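For reference, a percent identity of the kind reported in the validation can be computed from an existing pairwise alignment as sketched below. The sketch assumes identity is the number of identical aligned residue pairs divided by the full alignment length, gap columns included; this is an assumption about the convention used, and the function name and toy alignment are hypothetical. Similarity would additionally require a substitution matrix and is not covered here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a gapped pairwise alignment.

    Assumption: identical residue pairs are divided by the alignment
    length, counting columns that contain a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment of an expected sequence P and a predicted sequence P'.
identity = percent_identity("ACD-F", "ACEGF")
print(identity)
```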
Using the above-mentioned validation
method, when the protein sequence of N6, i.e. 4QN4,
was given as input to the prediction algorithm,
the predicted protein sequence showed 62.7%
identity and 76.0% similarity with N9 having PDB-ID
4MWJ. Similarly, an average was calculated over
10 random sequences from the input data set, as
shown in Table 2, giving an average of 62.01%
similarity and 44.36% identity.
Table 2: Validation result of 10 randomly
selected sequences and their similarity and
identity percentage with the existing next-in-line
subtype as per the chronological arrangement of
the sequences
S. No.  Input Sequence  Expected Next    Number of Positions  Identity    Similarity
        PDB_ID          Sequence PDB_ID  predicted            percentage  Percentage
1       4K3Y            4GDI             85                   37.9        54.7
2       4DGR            4QN3             161                  57.4        72.2
3       4QN3            4H52             138                  46.8        68.5
4       4NWJ            4HZV             154                  43.9        61.9
5       4HZY            3SAL             152                  42.8        60.1
6       4H52            4MWJ             145                  45.2        63
7       4GZS            4DGR             141                  41.8        60.5
8       4H53            4MWL             144                  44.9        63
9       4GDI            4MC7             79                   36.3        52.9
10      4QN3            4H53             138                  46.6        63.3
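The averages quoted in Section 3.8 can be re-derived from the ten rows of Table 2 with a few lines of Python. The values below are copied from the table as printed; the variable names are illustrative.

```python
# (input PDB_ID, expected next PDB_ID, positions predicted, identity %, similarity %)
table2 = [
    ("4K3Y", "4GDI", 85, 37.9, 54.7),
    ("4DGR", "4QN3", 161, 57.4, 72.2),
    ("4QN3", "4H52", 138, 46.8, 68.5),
    ("4NWJ", "4HZV", 154, 43.9, 61.9),
    ("4HZY", "3SAL", 152, 42.8, 60.1),
    ("4H52", "4MWJ", 145, 45.2, 63.0),
    ("4GZS", "4DGR", 141, 41.8, 60.5),
    ("4H53", "4MWL", 144, 44.9, 63.0),
    ("4GDI", "4MC7", 79, 36.3, 52.9),
    ("4QN3", "4H53", 138, 46.6, 63.3),
]

mean_identity = sum(row[3] for row in table2) / len(table2)
mean_similarity = sum(row[4] for row in table2) / len(table2)

print(f"mean identity:   {mean_identity:.2f}%")
print(f"mean similarity: {mean_similarity:.2f}%")
```

The result reproduces the 44.36% average identity and 62.01% average similarity stated in the text.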
IV. CONCLUSION
49 protein sequences of NA were extracted
from PDB and clustered into 20 unique and
non-redundant groups. MSA of the representative
sequences from each of the clusters yielded an
ungapped block of 369 positions, which acts as the
basis of the variation analysis. A threshold of 30%
was used to filter the positions that might have
evolutionary significance. Amino acids from all the
13 chronologically arranged sister groups at the
critical positions were extracted and used to derive
the transition matrix. The transition matrix thus
obtained directed the focus to the possible amino
acid bias. An average accuracy of more than 60%
(62.01% average similarity, with 44.36% average
identity, over the 10 validation sequences) was
achieved by the prediction algorithm based on the
transition matrix. Although the accuracy can still
be improved, this method is a step towards the
development of new treatment strategies and towards
preparedness for any disease in which the pathogen
mutates rapidly.
REFERENCES
[1] A. M. Waterhouse, J. B. Procter, D. M.
Martin, M. Clamp, and G. J. Barton,
(2009), Jalview Version 2: a multiple
sequence alignment editor and analysis
workbench, Bioinformatics, vol. 25, pp.
1189-1191.
[2] D. P. Martin, C. Williamson, and D.
Posada, (2005), RDP2: recombination
detection and analysis from sequence
alignments, Bioinformatics, vol. 21, pp.
260-262.
[3] G. K. Hirst, (1942), Adsorption of influenza
haemagglutinins and virus by red blood
cells, J Exp Med, vol. 76, pp. 195-209.
[4] H.M. Berman, J. Westbrook, Z. Feng, G.
Gilliland, T.N. Bhat, H. Weissig, I.N.
Shindyalov, P.E. Bourne (2000) The
Protein Data Bank Nucleic Acids Research,
28: 235-242.
[5] J. M. Colacino, K. A. Staschke, and W. G.
Laver, (1999), Approaches and strategies
for the treatment of influenza virus
infections, Antiviral Chemistry and
Chemotherapy, vol. 10, pp. 155-185.
[6] N. Saitou and M. Nei, (1987), The
neighbor-joining method: a new method for
reconstructing phylogenetic trees,
Molecular biology and evolution, vol. 4, pp.
406-425.
[7] P. Palese and R. W. Compans, (1976), Inhibition
of influenza virus replication in tissue
culture by 2-deoxy-2,3-dehydro-N-trifluoroacetylneuraminic
acid (FANA): mechanism of action, J Gen Virol,
vol. 33, pp. 159-163.
[8] Rupert J. Russell, Lesley F. Haire, David J.
Stevens, Patrick J. Collins, Yi Pu Lin, G.
Michael Blackburn, Alan J. Hay, Steven J.
Gamblin, and John J. Skehel, (2006), The
structure of H5N1 avian influenza
neuraminidase suggests new opportunities
for drug design, Nature, vol. 443, pp. 45-49.
[9] RWH Ruigrok, KG Nicholson, RG
Webster, AJ Hay, (1998), Structure of
influenza A, B and C viruses. Textbook of
Influenza, Blackwell Science, 29 – 42.
[10] S. Barik, (2012), New treatments for
influenza, BMC medicine, vol. 10, p. 104.
[11] T. D. Schneider, (2002), Consensus
sequence zen, Applied bioinformatics, vol.
1, p. 111.
[12] W. Li and A. Godzik, (2006), Cd-hit: a fast
program for clustering and comparing large
sets of protein or nucleotide sequences,
Bioinformatics, vol. 22, pp. 1658-1659.
[13] W. R. Pearson, (2013), An introduction to
sequence similarity ("homology")
searching, Current protocols in
bioinformatics, pp. 3.1.1-3.1.8.
[14] Y. Bao, P. Bolotov, D. Dernovoy, B.
Kiryutin, L. Zaslavsky, T. Tatusova, (2008),
The influenza virus resource at the National
Center for Biotechnology Information,
Journal of virology, vol. 82, pp. 596-601.