This document summarizes an experiment using deep neural networks (DNNs) to predict alternative splicing patterns in mouse tissues from RNA-seq data. The DNN model contains three hidden layers and jointly represents genomic sequence features and tissue types to predict splicing percentages and changes across tissues. Hyperparameters were optimized using 5-fold cross-validation on AUC. The trained DNN was able to accurately predict splicing patterns for 11,019 exons in 5 mouse tissues, outperforming previous models like Bayesian neural networks and multinomial logistic regression.
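The summary mentions selecting hyperparameters by 5-fold cross-validation on AUC. The following is a minimal sketch of that evaluation loop in plain Python, not the authors' code. The function names (`auc`, `k_fold_indices`, `cross_val_auc`) and the `fit_predict` callback are hypothetical, introduced only for illustration, and the labels are assumed to be binary.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k=5, seed=0):
    """Split range(n) into k disjoint folds; seed=None skips the shuffle."""
    idx = list(range(n))
    if seed is not None:
        random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_auc(fit_predict, X, y, k=5, seed=0):
    """Mean held-out AUC over k folds.

    fit_predict(train_X, train_y, test_X) is a hypothetical callback that
    trains a model and returns one score per test example.
    """
    results = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        scores = fit_predict([X[i] for i in train],
                             [y[i] for i in train],
                             [X[i] for i in held_out])
        results.append(auc([y[i] for i in held_out], scores))
    return sum(results) / k
```

In a hyperparameter search, `cross_val_auc` would be called once per candidate setting and the setting with the highest mean AUC kept. A fold that happens to contain only one class raises an error here; a stratified split would avoid that.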
Deep learning for extracting protein-protein interactions from biomedical lit... (Yifan Peng)
State-of-the-art methods for protein-protein interaction (PPI) extraction are primarily feature-based or kernel-based by leveraging lexical and syntactic information. But how to incorporate such knowledge in the recent deep learning methods remains an open question. In this paper, we propose a multichannel dependency-based convolutional neural network model (McDepCNN). It applies one channel to the embedding vector of each word in the sentence, and another channel to the embedding vector of the head of the corresponding word. Therefore, the model can use richer information obtained from different channels. Experiments on two public benchmarking datasets, AIMed and BioInfer, demonstrate that McDepCNN compares favorably to the state-of-the-art rich-feature and single-kernel based methods. In addition, McDepCNN achieves 24.4% relative improvement in F1-score over the state-of-the-art methods on cross-corpus evaluation and 12% improvement in F1-score over kernel-based methods on "difficult" instances. These results suggest that McDepCNN generalizes more easily over different corpora, and is capable of capturing long distance features in the sentences.
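The two-channel input described in this abstract (one channel for each word's embedding, one for the embedding of that word's syntactic head) can be sketched as follows. This is an illustrative sketch only, not the McDepCNN implementation: the function name `build_channels`, the toy embedding table, and the use of a zero vector for the root and for unknown words are all assumptions.

```python
def build_channels(tokens, heads, embed):
    """Return (word_channel, head_channel), each a list of vectors.

    tokens: words of the sentence.
    heads:  heads[i] is the index of token i's dependency head, -1 for the root.
    embed:  dict mapping a word to its embedding vector.
    Unknown words and the root's head fall back to a zero vector (assumption).
    """
    dim = len(next(iter(embed.values())))
    zero = [0.0] * dim
    word_channel = [embed.get(t, zero) for t in tokens]
    head_channel = [embed.get(tokens[h], zero) if h >= 0 else zero
                    for h in heads]
    return word_channel, head_channel

# Toy sentence: "PROT1 binds PROT2", with "binds" as the root.
toy_embed = {"PROT1": [1.0, 0.0], "binds": [0.0, 1.0], "PROT2": [1.0, 1.0]}
words, head_words = build_channels(["PROT1", "binds", "PROT2"],
                                   [1, -1, 1], toy_embed)
```

A convolutional layer would then read the two aligned matrices as separate input channels, much as an image model reads colour channels.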
Improving DNA Barcode-based Fish Identification System on Imbalanced Data usi... (TELKOMNIKA JOURNAL)
Imbalanced data is a common problem in classification and identification. It arises when the number of instances of one class far exceeds that of the others. In previous research, our DNA barcode-based identification system for tuna and mackerel was developed on an imbalanced dataset: the tuna and mackerel samples far outnumbered those of other fish. The accuracy of the classification model was therefore probably biased. This research employed the Synthetic Minority Oversampling Technique (SMOTE) to produce a balanced dataset. We used k-mer frequencies from DNA barcode sequences as features and a Support Vector Machine (SVM) as the classification method, with trinucleotides (3-mers) and tetranucleotides (4-mers). The training dataset was taken from the Barcode of Life Database (BOLD). To evaluate the model, we compared its accuracy with and without SMOTE when classifying DNA barcode sequences obtained from the Department of Aquatic Product Technology, Bogor Agricultural University. At the species level, the accuracy of the model using SMOTE was 7% and 13% higher than without SMOTE for trinucleotides (3-mers) and tetranucleotides (4-mers), respectively. The use of SMOTE as a data-balancing technique is expected to increase the accuracy of DNA barcode-based fish classification systems, particularly at the species level, where identification is difficult.
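The two ingredients named in this abstract, k-mer frequency features and SMOTE oversampling, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' pipeline: `kmer_frequencies` and `smote_sample` are hypothetical names, and the SMOTE step shows only the interpolation between a minority sample and one neighbour that is passed in, whereas full SMOTE first picks that neighbour among the k nearest minority-class samples.

```python
import random
from itertools import product

def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Relative frequency of every possible k-mer, in lexicographic order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

def smote_sample(x, neighbour, rng):
    """One synthetic point on the segment between x and a minority neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbour)]

features = kmer_frequencies("ACGTACGT", k=3)      # 4**3 = 64 features
synthetic = smote_sample([0.0, 0.0], [1.0, 2.0], random.Random(0))
```

With k=3 each sequence becomes a 64-dimensional vector and with k=4 a 256-dimensional one; those vectors would be what the SVM and the oversampling step operate on.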
Diagnosis Chest Diseases Using Neural Network and Genetic Hybrid Algorithm (IJERA Editor)
Backpropagation is the most popular training algorithm for multi-layer feed-forward neural networks. It measures the output error, calculates the gradient of that error, and adjusts the ANN weights along the descending gradient direction. Backpropagation is used to learn and store the mapping between inputs and outputs. A genetic algorithm is stochastic: it follows a random probability distribution or pattern that may be analysed statistically but not predicted precisely. It is an iterative procedure that generates a new population of individuals from the old one. This paper proposes to implement the backpropagation algorithm and a genetic algorithm and to compare their output accuracy for medical diagnosis of various chest diseases (asthma, tuberculosis, lung cancer, pneumonia).
Genome annotation, NGS sequence data, decoding sequence information. The genome contains all the biological information required to build and maintain any given living organism.
AUDIO CRYPTOGRAPHY VIA ENHANCED GENETIC ALGORITHM (ijma)
As communication technologies have surged recently, the secrecy of information shared between communicating parties has gained tremendous attention. Many cryptographic techniques have been proposed and implemented to secure multimedia data and to allay public fears during communication. This paper expands the scope of audio data security via an enhanced genetic algorithm. Here, each individual (audio sample) is genetically engineered to produce new individuals. The enciphering process of the proposed technique acquires, conditions, and transforms each audio sample into bit strings. Bit fission, switching, mutation, fusion, and deconditioning operations are then applied to yield cipher audio signals. The original audio sample is recovered at the receiver's end through a deciphering process without the loss of any inherent message. The novelty of the proposed technique resides in the integration of fission and fusion into the traditional genetic algorithm operators and the use of a single individual (rather than two) for reproduction. The effectiveness of the proposed cryptosystem is demonstrated through simulations and performance analyses.
A genetic algorithm approach for predicting ribonucleic acid sequencing data ... (TELKOMNIKA JOURNAL)
Malaria larvae accept explosive variable lifecycle as they spread across numerous mosquito vector stratosphere. Transcriptomes arise in thousands of diverse parasites. Ribonucleic acid sequencing (RNA-seq) is a prevalent gene expression technique that has led to enhanced understanding of genetic queries. RNA-seq measures the transcripts of gene expression and provides methodological enhancements to machine learning procedures. Researchers have proposed several methods for evaluating and learning from biological data. In this study, a genetic algorithm (GA) is used as a feature selection process to extract relevant information from the RNA-seq Anopheles gambiae mosquito malaria vector dataset, and the results are evaluated using k-nearest neighbor (KNN) and decision tree classification algorithms. The experiments obtained classification accuracies of 88.3% and 98.3%, respectively.
White Paper: In vivo Fiberoptic Fluorescence Microscopy in freely behaving mice (FUJIFILM VisualSonics Inc.)
Fiberoptic fluorescence microscopy (FFM) employs optical fibers as small as 300 micrometers in diameter and offers the ability to image cellular and subcellular processes in deep brain structures including the Ventral Tegmental Area (VTA) and the substantia nigra (Sn).
AN ANN BASED BRAIN ABNORMALITY DETECTION USING MR IMAGES (cscpconf)
The main purpose of this paper is to design, implement and evaluate a robust automatic diagnostic system that increases the accuracy of brain tumor diagnosis using MR images. The presented work automatically classifies brain tissues as normal or abnormal using computer vision. This saves radiologists considerable time otherwise spent on a monotonous, repetitive job. The acquired MR images are processed using image preprocessing techniques. The preprocessed images are then segmented, and various features are extracted. The extracted features are fed as input to an artificial neural network, which is trained using the error backpropagation algorithm for correct decision making.
Framework for Contextual Outlier Identification using Multivariate Analysis a... (IJECEIAES)
Most existing commercial video surveillance applications only capture event frames, and the accuracy of those captures is poor. We reviewed existing systems and found that no current research technique offers context-based scene identification of outliers. We therefore present a framework that uses an unsupervised learning approach to precisely identify outliers in given video frames with respect to the contextual information of the scene. The proposed system uses a matrix decomposition method with multivariate analysis to balance faster response time against higher accuracy when detecting abnormal events or objects as outliers. Using an analytical methodology, the proposed system performs a blocking operation followed by sparsity to carry out detection. The study outcome shows that the proposed system offers higher accuracy and faster response time than the existing system.
Training artificial neural network using particle swarm optimization algorithm (A. Roy)
Abstract: In this paper, the adaptation of network weights using Particle Swarm Optimization (PSO) is proposed as a mechanism to improve the performance of an Artificial Neural Network (ANN) in classifying the Iris dataset. Classification is a machine learning technique used to predict group membership for data instances, and neural networks are introduced to simplify the classification problem. This paper focuses on Iris plant classification using a neural network. The problem concerns the identification of Iris plant species on the basis of plant attribute measurements. Classifying the Iris dataset involves discovering patterns in the petal and sepal sizes of the Iris plant and using those patterns to predict the plant's class. With these patterns, unknown data can be predicted more precisely in the future. Artificial neural networks have been successfully applied to problems in pattern classification, function approximation, optimization, and associative memories. In this work, multilayer feed-forward networks are trained using the backpropagation learning algorithm.
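The idea of adapting weights with PSO can be illustrated with a generic particle swarm minimizer. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `pso_minimize`, the bounds, the swarm size, and the coefficients (commonly used constriction-style values) are illustrative choices. For network training, `objective` would be the classification loss evaluated on a flattened weight vector; a simple sphere function stands in for it here.

```python
import random

def pso_minimize(objective, dim, n_particles=20, iters=100,
                 bounds=(-1.0, 1.0), w=0.7298, c1=1.49618, c2=1.49618,
                 seed=0):
    """Global-best PSO: returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Stand-in objective: the sphere function, minimized at the origin.
best, best_val = pso_minimize(lambda x: sum(v * v for v in x),
                              dim=3, iters=200)
```

Unlike backpropagation, this search needs no gradients, only repeated evaluations of the objective, which is why it can be applied to network weights directly.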
End-to-end Fine-grained Neural Entity Recognition of Patients, Interventions,... (Anjani Dhrangadhariya)
Is multitask learning worthwhile in PICO recognition? We explored this question in our paper of the same name (read our paper here: https://arodes.hes-so.ch/record/8949?ln=FR). These slides correspond to the paper and were presented at CLEF 2021 in Bucharest, Romania.
Detecting malaria using a deep convolutional neural network (Yusuf Brima)
Experiment with a deep residual convolutional neural network to classify microscopic blood cell images (Uninfected, Parasitized).
Utilizes the ResNet architecture (Deep Residual Learning for Image Recognition, He et al., 2015).
Uses Keras with a TensorFlow backend.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
A clonal based algorithm for the reconstruction of genetic network using s sy... (eSAT Journals)
Motivation: A gene regulatory network is a network-based representation of the interactions between genes. DNA microarrays are the most widely used technology for extracting the relationships between thousands of genes simultaneously. A gene microarray experiment provides gene expression data for a particular condition over varying time periods. The expression of a particular gene depends on the biological conditions and on other genes. In this paper, we propose a new method for the analysis of microarray data. The proposed method makes use of the S-system, a well-accepted model for gene regulatory network reconstruction. Since the problem has multiple solutions, we have to identify an optimized one. Evolutionary algorithms have been used to solve such problems. Although various researchers have already made a number of attempts, the solutions are still not satisfactory with respect to the time taken and the degree of accuracy achieved, so considerable further work is needed to achieve solutions with improved performance. Results: In this work, we propose a clonal selection algorithm for identifying the optimal gene regulatory network. The approach is tested on real-life data: SOS E. coli DNA repair gene expression data. The proposed algorithm converges much faster and provides better results than the existing algorithms. Index Terms: Microarray analysis, Evolutionary Algorithm, Artificial Immune System, S-system, Gene Regulatory Network, SOS E. coli DNA repair, Clonal Selection Algorithm.
Deep Learning: Towards General Artificial Intelligence (Rukshan Batuwita)
For the past several years Deep Learning methods have revolutionized the areas in Pattern Recognition, namely, Computer Vision, Speech Recognition, Natural Language Processing etc. These techniques have been mainly developed by academics, closely working with tech giants such as Google, Microsoft and Facebook where the research outcomes have been successfully integrated into commercial products such as Google image and voice search, Google Translate, Microsoft Cortana, Facebook M and many more interesting applications that are yet to come. More recently, Google DeepMind Technologies has been working on Artificial General Intelligence using Deep Reinforcement Learning methods, where their AlphaGo system beat the world champion of the complex Chinese game 'Go' in March 2016. This talk will present a thorough introduction to major Deep Learning techniques, recent breakthroughs and some exciting applications.
Artificial Neural Networks (ANNS) For Prediction of California Bearing Ratio ... (IJMER)
The behaviour of soil at the project location, and the interactions of earth materials during and after construction, have a major influence on the success, economy and safety of the work. A further complexity with some geotechnical materials, such as sand and gravel, is that obtaining undisturbed samples is difficult, time-consuming, and requires a skilled technician. Knowledge of the California Bearing Ratio (C.B.R) is essential in determining road thickness. To cope with these difficulties, an attempt has been made to model C.B.R in terms of fine fraction, liquid limit, plasticity index, maximum dry density, and optimum moisture content. A multi-layer perceptron network with feed-forward backpropagation is used, varying the number of nodes in the hidden layer. For this purpose, test data for 50 soils were collected from laboratory test results. Of these, 30 soils are used for training and the remaining 20 for testing, a 60-40 split. The architectures developed are 5-4-1, 5-5-1, and 5-6-1. The model with the 5-6-1 architecture is found to be quite satisfactory in predicting the C.B.R of soils. When predicted values are plotted against observed values for the training and testing processes, all points lie close to the line of equality, indicating that predicted values are close to observed values.
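To make the 5-4-1, 5-5-1 and 5-6-1 notation concrete, the sketch below counts the trainable parameters of such a network and runs one forward pass. It is an illustration only: the abstract does not state the activation functions, so the sigmoid hidden layer and linear output are assumptions, and `count_parameters` and `forward` are hypothetical names.

```python
import math

def count_parameters(layers):
    """Weights plus biases for a fully connected net, e.g. layers=[5, 6, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layers, layers[1:]))

def forward(x, weights, biases):
    """Forward pass; weights[l][j][i] links input i to unit j of layer l.

    Hidden layers use a sigmoid and the output layer is linear (assumption).
    """
    a = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = [sum(w * v for w, v in zip(row, a)) + bj
             for row, bj in zip(W, b)]
        is_output = l == len(weights) - 1
        a = z if is_output else [1.0 / (1.0 + math.exp(-v)) for v in z]
    return a

# A 5-6-1 network: five soil properties in, six hidden units, one C.B.R out.
W1 = [[0.0] * 5 for _ in range(6)]   # zero weights, so each hidden unit = 0.5
b1 = [0.0] * 6
W2 = [[1.0] * 6]
b2 = [0.0]
prediction = forward([1.0, 2.0, 3.0, 4.0, 5.0], [W1, W2], [b1, b2])
```

The three candidate architectures have 29, 36 and 43 parameters respectively, which is worth keeping in mind given that only 30 soils are available for training.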
Why Neurons have thousands of synapses? A model of sequence memory in the brain (Numenta)
Presentation given by Yuwei Cui, Numenta Research Engineer at Beijing Normal University. December 2015.
Collaborators: Jeff Hawkins, Subutai Ahmad, Chetan Surpur
Understanding Protein Function on a Genome-scale through the Analysis of Molecular Networks
Cornell Medical School, Physiology, Biophysics and Systems Biology (PBSB) graduate program, 2009.01.26, 16:00-17:00; [I:CORNELL-PBSB] (Long networks talk, incl. the following topics: why networks w. amsci*, funnygene*, net. prediction intro, memint*, tse*, essen*, sandy*, metagenomics*, netpossel*, tyna*+ topnet*, & pubnet* . Fits easily into 60’ w. 10’ questions. PPT works on mac & PC and has many photos w. EXIF tag kwcornellpbsb .)
Date Given: 01/26/2009
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity
Automated Analysis of Microscopy Images using Deep Convolutional Neural Network (AdetayoOkunoye)
General cell quantification and identification have technical limitations in the fast and accurate detection of cells with complex morphology, especially with overlapping cells, irregular cell shapes, and bad focal planes, among other factors. We use a deep convolutional neural network (DCNN) to classify annotated images of five types of white blood cells. The accuracy and performance of the proposed framework are evaluated for blood cell classification. The results demonstrate that the DCNN model achieves an accuracy close to 80% and provides an accurate and fast method for hematological laboratories.
Employing Neocognitron Neural Network Base Ensemble Classifiers To Enhance Ef... (cscpconf)
This paper presents an ensemble of neocognitron neural network base classifiers to enhance the accuracy of the system, along with experimental results. The method requires less computational preprocessing than other ensemble techniques because it removes the need for a separate feature extraction step before the data is fed into the base classifiers. This follows from the basic nature of the neocognitron, which is a multilayer feed-forward neural network. An ensemble of such base classifiers gives class labels for each pattern, which are in turn combined to give the final class label for that pattern. The purpose of this paper is not only to exemplify the learning behaviour of the neocognitron as a base classifier, but also to propose a better way of combining neural network based ensemble classifiers.
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a... (Masahito Ohue)
Thirteenth International Conference on Intelligent Computing (ICIC2017)
R13: Protein and Gene Bioinformatics: Analysis, Algorithms and Applications, Aug 9, 2017.
Masahito Ohue, Takuro Yamazaki, Tomohiro Ban, Yutaka Akiyama.
In Proceedings of the Thirteenth International Conference On Intelligent Computing (ICIC2017) (Lecture Notes in Computer Science), 10362, 549-558, Liverpool,UK August 7-10, 2017
https://link.springer.com/chapter/10.1007/978-3-319-63312-1_48
Optimized Neural Network for Classification of Multispectral ImagesIDES Editor
The proposed work involves the multiobjective PSO
based optimization of artificial neural network structure for
the classification of multispectral satellite images. The neural
network is used to classify each image pixel in various land
cove types like vegetations, waterways, man-made structures
and road network. It is per pixel supervised classification using
spectral bands (original feature space). Use of neural network
for classification requires selection of most discriminative
spectral bands and determination of optimal number of nodes
in hidden layer. We propose new methodology based on
multiobjective particle swarm optimization (MOPSO) to
determine discriminative spectral bands and the number of
hidden layer node simultaneously. The result obtained using
such optimized neural network is compared with that of
traditional classifiers like MLC and Euclidean classifier. The
performance of all classifiers is evaluated quantitatively using
Xie-Beni and β indexes. The result shows the superiority of
the proposed method.
Classification Of Iris Plant Using Feedforward Neural Networkirjes
The classification and recognition of type on the basis of individual features and behaviors constitute
a preliminary measure and is an important target in the behavioral sciences. Current statistical methods do not
always yield satisfactory answers. A Feed Forward Artificial Neural Network is a computer model inspired by
the structure of the human brain. It can be viewed as a set of artificial nerve cells that are interconnected with
other neurons. The primary aim of this paper is to demonstrate the process of developing the Artificial Neural
network based classifier which classifies the Iris database. The problem concerns the identification of Iris plant
species on the basis of plant attribute measurements. This paper is related to the use of feed forward neural
networks towards the identification of iris plants on the basis of the following measurements: sepal length, sepal
width, petal length, and petal width. Using this data set a Neural Network (NN) is used for the classification of
iris data set. The EBPA is used for training of this ANN. The results of simulations illustrate the effectiveness of
the neural system in iris class identification.
2. What is ISMB?
• Quoted from the FAQ
– Intelligent Systems for Molecular Biology (ISMB) is the annual meeting of the International Society for Computational Biology (ISCB). Over the past eighteen years the ISMB conference has grown to become the largest bioinformatics conference in the world. The ISMB conferences provide a multidisciplinary forum for disseminating the latest developments in bioinformatics. ISMB brings together scientists from computer science, molecular biology, mathematics, and statistics. Its principal focus is on the development and application of advanced computational methods for biological problems.
3. ISMB 2014
• Venue: Boston, USA
• Dates: July 11-15
• Proceedings: special issue of Bioinformatics
• Acceptance rate: 37/191 ≈ 19.4%
– accepted at 1st round: 29 papers
– invited to 2nd round: 16 papers
– accepted at 2nd round: 9 papers
– withdrawn after acceptance: 1 paper
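The acceptance rate can be checked with a few lines of Python. This sketch assumes that the total of 37 is obtained as first-round accepts plus second-round accepts minus the withdrawn paper; the slide lists the numbers but does not spell out that derivation.

```python
# Figures taken from the slide; how they combine is an assumption.
accepted_round1 = 29
accepted_round2 = 9
withdrawn = 1
submissions = 191

accepted = accepted_round1 + accepted_round2 - withdrawn
rate = accepted / submissions
print(f"{accepted}/{submissions} = {rate:.1%}")
```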
5. ISMB2014 Reading Group @ AIST CBRC
Presenter: Kengo Sato (佐藤健吾), Faculty of Science and Technology, Keio University, satoken@bio.keio.ac.jp
Vol. 30 ISMB 2014, pages i121–i129 BIOINFORMATICS doi:10.1093/bioinformatics/btu277
Deep learning of the tissue-regulated splicing code
Michael K. K. Leung1,2, Hui Yuan Xiong1,2, Leo J. Lee1,2 and Brendan J. Frey1,2,3,*
1Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, 2Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada and 3Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada
ABSTRACT
Motivation: Alternative splicing (AS) is a regulated process that directs the generation of different transcripts from single genes. A computational model that can accurately predict splicing patterns based on genomic features and cellular context is highly desirable, both in understanding this widespread phenomenon, and in exploring the effects of genetic variations on AS.
Methods: Using a deep neural network, we developed a model inferred from mouse RNA-Seq data that can predict splicing patterns in individual tissues and differences in splicing patterns across tissues. Our architecture uses hidden variables that jointly represent features in genomic sequences and tissue types when making predictions. A graphics processing unit was used to greatly reduce the training time of our models with millions of parameters.
Excerpt from the introduction:
Previously, a ‘splicing code’ that uses a Bayesian neural network (BNN) was developed to infer a model that can predict the outcome of AS from sequence information in different cellular contexts (Xiong et al., 2011). One advantage of Bayesian methods is that they protect against overfitting by integrating over models. When the training data are sparse, as is the case for many datasets in the life sciences, the Bayesian approach can be beneficial. It was shown that the BNN outperforms several common machine learning algorithms, such as multinomial logistic regression (MLR) and support vector machines, for AS prediction in mouse trained using microarray data.
There are several practical considerations when using BNNs. They often rely on methods like Markov Chain Monte Carlo (MCMC) to sample models from a posterior distribution,
7. Deep Neural Networks (DNN)
• Expressive power from deep neural networks
• Training is extremely difficult
[Figure: text excerpt on deep neural networks, mostly hidden on the slide; the surviving fragments mention the nonlinearity for hidden layers, the output layer, random initialization, and backpropagation without pretraining versus shallow networks]
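8. Deep Neural Networks (DNN)
• Several breakthroughs
– Pre-training with autoencoders [Hinton et al., 2006]
– Stabilizing training with dropout [Srivastava et al., 2014]
• Overwhelming results in competitions across many fields
– Image recognition, speech recognition, compound activity prediction, …
• Applications in bioinformatics are still not very numerous
– Protein contact map prediction [Eickholt et al., 2012]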
10. Dropout
• Train while randomly removing hidden units
• Same effect as ensemble learning
(a) Standard Neural Net (b) After applying dropout.
Figure 1: Dropout Neural Net Model. Left: A standard neural net with 2 hidden layers. Right: An example of a thinned net produced by applying dropout to the network on the left. Crossed units have been dropped.
[Srivastava et al., 2014]
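The idea of randomly removing hidden units during training can be sketched in NumPy as follows. This is a minimal illustration, not the code used in the paper: it uses the "inverted dropout" variant (rescaling at training time), which is a common alternative to rescaling weights at test time, and the function name `dropout_forward` and the keep probability are hypothetical.

```python
import numpy as np

def dropout_forward(h, p_keep, rng, train=True):
    """Apply inverted dropout to hidden activations h (illustrative sketch)."""
    if not train:
        # At test time all units are kept and nothing is rescaled.
        return h
    # Each hidden unit is kept independently with probability p_keep.
    mask = rng.random(h.shape) < p_keep
    # Rescale the surviving units so the expected activation is unchanged.
    return h * mask / p_keep

rng = np.random.default_rng(0)
h = np.ones((4, 6))                      # toy hidden activations
h_train = dropout_forward(h, 0.5, rng)   # thinned network
h_test = dropout_forward(h, 0.5, rng, train=False)
```

Because a different mask is drawn for every minibatch, training effectively averages over many thinned networks, which is the ensemble-like effect mentioned on the slide.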
11. Deep Neural Networks (DNN)
• Expressive power from deep neural networks
Different Levels of Abstraction
• Hierarchical Learning
– Natural progression from low level to high level structure as seen in natural complexity
– Easier to monitor what is being learnt and to guide the machine to better subspaces
– A good lower level representation can be used for many distinct tasks
[Lee, 2010]
12. Problem setting
• Predict whether an exon undergoes splicing
[Figure from Barash et al., 2010; labels: RNA feature extraction (300 nt windows), Alternatively spliced exon, Feature set, Tissue type, Splicing code, Predicted change in exon inclusion, Code assembly]
13. Model
Fig. 1. Architecture of the DNN used to predict AS patterns. It contains three hidden layers, with hidden variables that jointly represent genomic features and cellular context (tissue types)
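A forward pass through a network of this shape might look like the sketch below. Several details are assumptions made for illustration only: the tissue pair is fed as two one-hot vectors concatenated with the first hidden layer, the hidden units use `tanh`, the layer sizes are toy values (Table S2 reports hundreds to thousands of units), and all names are hypothetical. The three softmax heads correspond to the two LMH outputs and the DNI output described in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_TISSUES = 1392, 5   # feature count and tissue count from the slides
H1, H2, H3 = 8, 16, 8             # toy hidden sizes, far smaller than in Table S2

def init(n_in, n_out, std=0.05):
    """Weights drawn from a normal distribution, zero biases."""
    return rng.normal(0.0, std, size=(n_in, n_out)), np.zeros(n_out)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def one_hot(idx, n):
    v = np.zeros(n)
    v[idx] = 1.0
    return v

params = {
    "h1": init(N_FEATURES, H1),
    "h2": init(H1 + 2 * N_TISSUES, H2),  # assumption: tissues enter at layer 2
    "h3": init(H2, H3),
    "lmh_i": init(H3, 3),                # low / medium / high for tissue i
    "lmh_j": init(H3, 3),                # low / medium / high for tissue j
    "dni": init(H3, 3),                  # decrease / no change / increase
}

def forward(features, tissue_i, tissue_j):
    W, b = params["h1"]
    a1 = np.tanh(features @ W + b)
    x2 = np.concatenate([a1, one_hot(tissue_i, N_TISSUES), one_hot(tissue_j, N_TISSUES)])
    W, b = params["h2"]
    a2 = np.tanh(x2 @ W + b)
    W, b = params["h3"]
    a3 = np.tanh(a2 @ W + b)
    heads = ("lmh_i", "lmh_j", "dni")
    return {name: softmax(a3 @ params[name][0] + params[name][1]) for name in heads}

out = forward(rng.normal(size=N_FEATURES), tissue_i=0, tissue_j=1)
```

Sharing the hidden layers across all three heads is what lets the hidden variables represent sequence features and tissue context jointly.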
14. Features
• 1392 features describing the flanking exons and introns [Barash et al., 2010]
– k-mer
– Translatability
– Length
– Conservation
– Motif sequences (transcription factor binding sites)
– …
[Figure from Barash et al., 2010; labels: RNA feature extraction (300 nt windows), Alternatively spliced exon, Feature set, Tissue type, Splicing code, Predicted change in exon inclusion, Code assembly, Final assembled code (bits)]
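As an illustration of one of the feature families listed above, the sketch below counts k-mers in a sequence window. It is not the feature extraction of Barash et al.; the function name `kmer_counts`, the DNA alphabet, and the example sequence are all hypothetical.

```python
from collections import Counter
from itertools import product

def kmer_counts(seq, k):
    """Return a fixed-length dict of k-mer counts over the alphabet ACGT."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate every possible k-mer so that all sequences share one layout.
    return {"".join(p): counts.get("".join(p), 0) for p in product("ACGT", repeat=k)}

feats = kmer_counts("GTAAGTGTAAGT", 2)
print(feats["GT"], feats["TG"])
```

Fixing the key order makes it straightforward to stack the counts from many windows into one feature matrix.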
15. Output
• Discretize PSI (Percent Spliced In) [Katz et al., 2010]
– LMH code
  • Low: 0-0.33, Medium: 0.33-0.66, High: 0.66-1
– DNI code
  • Decrease: tissue i > tissue j
  • No change: tissue i ≈ tissue j (absolute PSI difference < 0.15)
  • Increase: tissue i < tissue j
• Multiple outputs are learned simultaneously
– Training is stabilized
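The two discretizations can be written down directly. The following sketch uses the interval boundaries and the 0.15 threshold given on the slide; which side the boundary values fall on is an assumption, and the function names are hypothetical.

```python
def lmh_label(psi):
    """Map a PSI value in [0, 1] to the low/medium/high (LMH) code."""
    if psi < 0.33:
        return "Low"
    if psi < 0.66:
        return "Medium"
    return "High"

def dni_label(psi_i, psi_j, threshold=0.15):
    """Map the PSI values of tissues i and j to the DNI code."""
    diff = psi_i - psi_j
    if abs(diff) < threshold:
        return "No change"
    # Following the slide: Decrease when tissue i > tissue j.
    return "Decrease" if diff > 0 else "Increase"

print(lmh_label(0.8), dni_label(0.8, 0.3))
```

For example, an exon with PSI 0.8 in tissue i and 0.3 in tissue j is High in i, Low in j, and a Decrease from i to j.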
Fig. 1. Architecture of the DNN used to predict AS patterns. It contains three hidden layers, with hidden variables that jointly represent genomic features and cellular context (tissue types)
17. Hyperparameter optimization
• 5-fold cross validation: optimized based on AUC
– training: 3 folds (training the DNN)
– validation: 1 fold (hyperparameter optimization)
– test: 1 fold (evaluation)
• Applied spearmint [Snoek et al., 2012], a method based on Gaussian processes
Text excerpt (cropped on the slide):
perform well on both tasks. The optimal set of hyperparameters were then used model using both training and validation data. Five models were trained this different folds of data. Predictions made for the corresponding test data from all then evaluated and reported.
The hyperparameters that were optimized and their search ranges are: (1) the learning each of the two tasks (0.1 to 2.0), (2) the number of hidden units in each layer (30 the L1 penalty (0.0 to 0.25), (4) the standard deviation of the normal distribution initialize the weights (0.001 to 0.200), (5) the momentum schedule defined as epochs to linearly increase the momentum from 0.50 to 0.99 (50 to 1500), and (6) size (500 to 8500). The number of training epoch was fixed to 1500. In our experience, set of hyperparameters were generally found in approximately 2 days, where experiments ran on a single GPU (Nvidia GTX Titan). The selected set of hyperparameters Table S2. There is a large range of acceptable values for the number hidden units layer.
Table S2. The hyperparameters selected to train the deep neural network. Some ranges to reflect the variations from the different folds as well as hyperparameters performing runs within a given fold.
Hyperparameter              Range Selected
Hidden Units (layer 1)      450 - 650
Hidden Units (layer 2)      4500 - 6000
Hidden Units (layer 3)      400 - 600
L1 Regularization           0 - 0.05
Learning Rate (LMH code)    1.40 - 1.50
Learning Rate (DNI code)    1.80 - 2.00
Momentum Rate               1250
Minibatch Size              1500
Weight Initialization       0.05 - 0.09
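The 3/1/1 split of the five folds can be sketched as a simple rotation. The slide only states the proportions, so the choice of which fold acts as validation for a given test fold is an assumption, and `fold_roles` is a hypothetical helper name.

```python
def fold_roles(n_folds=5):
    """Yield (train_folds, validation_fold, test_fold), rotating the test fold."""
    for test in range(n_folds):
        # Assumption: the fold following the test fold is used for validation.
        val = (test + 1) % n_folds
        train = [f for f in range(n_folds) if f not in (test, val)]
        yield train, val, test

splits = list(fold_roles())
for train, val, test in splits:
    print(f"train={train} validation={val} test={test}")
```

Each of the five rotations trains on three folds, tunes hyperparameters on one, and reports on the remaining one, so every exon appears in a test fold exactly once.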
18. Experiments
• Environment
– Implemented in Python with Gnumpy [Tieleman, 2010]
– Run on an Nvidia GTX Titan
• Data
– Splicing patterns of 11,019 exons obtained from mouse RNA-seq data for 5 tissues [Brawand et al., 2011]
S1 Dataset Description
The dataset consists of 11,019 mouse alternative exons in five tissue types profiled from RNA-Seq
data prepared by (Brawand et al., 2011). As explained in the main text, a distribution of
percent-spliced-in (PSI) was estimated for each exon and tissue. From this distribution, three
real-values were calculated by summing the probability mass over equally split intervals of 0 to
0.33 (low), 0.33 to 0.66 (medium), and 0.66 to 1 (high). They represent the probability that the
given exon within a tissue type has PSI value ranging from these intervals, hence are soft
assignments into each category. The models were trained using these soft labels. Table S1
shows the distribution of exons in each category, counted by selecting the label with the largest
value.
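The soft labels described above can be sketched as follows, assuming the PSI distribution is available as probability mass on a grid of PSI values. That representation, the function name `soft_lmh_labels`, and the toy distribution are assumptions; the source only states that mass is summed over the three intervals.

```python
import numpy as np

def soft_lmh_labels(psi_grid, prob_mass):
    """Sum probability mass over [0, 0.33), [0.33, 0.66) and [0.66, 1]."""
    psi_grid = np.asarray(psi_grid, dtype=float)
    p = np.asarray(prob_mass, dtype=float)
    p = p / p.sum()  # normalize so the three soft labels sum to 1
    low = p[psi_grid < 0.33].sum()
    medium = p[(psi_grid >= 0.33) & (psi_grid < 0.66)].sum()
    high = p[psi_grid >= 0.66].sum()
    return np.array([low, medium, high])

grid = np.linspace(0.005, 0.995, 100)            # centres of 100 PSI bins
mass = np.exp(-0.5 * ((grid - 0.8) / 0.1) ** 2)  # toy distribution peaked at 0.8
soft = soft_lmh_labels(grid, mass)
hard = ["low", "medium", "high"][int(np.argmax(soft))]
```

Taking the category with the largest soft label, as in the last line, is how the counts in Table S1 are described as being obtained.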
Table S1. The number of exons classified as low, medium, and high for each mouse tissue.
Exons with large tissue variability (TV) are displayed in a separate column. The proportion of
medium category exons that have large tissue variability is higher than the other two categories.
          Brain         Heart         Kidney        Liver         Testis
          All    TV     All    TV     All    TV     All    TV     All    TV
Low       1782   579    1191   460    1287   528    1001   413    1216   452
Medium    669    456    384    330    345    294    254    220    346    270
High      5229   1068   4060   919    4357   941    3606   757    4161   887
Total     7680   2103   5635   1709   5989   1763   4861   1390   5723   1609
19. Results: comparison with previous work
• LMH code (all)
• LMH code (high tissue variability)
BNN: Bayesian NN [Xiong et al., 2011], MLR: Multinomial Logistic Regression
3 RESULTS
We present three sets of results that compare the test performance of the BNN, DNN and MLR for splicing pattern prediction. The first is the PSI prediction from the LMH code tested on all exons. The second is the PSI prediction evaluated only on targets where there are large variations across tissues for a given exon. These are events where ΔPSI ≥ ±0.15 for at least one pair of tissues, to evaluate the tissue specificity of the model. The third result shows how well the code can classify ΔPSI between the five tissue types. Hyperparameter tuning was used in all methods. The averaged predictions from all partitions and folds are used to evaluate the model’s performance on their corresponding test dataset. Similar to training, we tested on exons and tissues that have at least 10 junction reads.
For the LMH code, as the same prediction target can be generated by different input configurations, and there are two LMH outputs, we compute the predictions for all input combinations containing the particular tissue and average them into a single prediction for testing. To assess the stability of the LMH predictions, we calculated the percentage of instances in which there is a prediction from one tissue input configuration that does not agree with another tissue input configuration in terms of class membership, for all exons and tissues. Of all predictions, 91.0% agreed with each other, 4.2% have predictions that are in adjacent classes (i.e. low and medium, or medium and high), and 4.8% otherwise. Of those predictions that agreed with each other, 85.9% correspond to the correct class label on test data, 51.2% for the predictions with adjacent classes and 53.8% for the remaining predictions. This information can be used to assess the confidence of the predicted class labels. Note that predictions spanning adjacent classes may be indicative that the PSI value is somewhere between the two classes, and the above analysis using hard class labels can underestimate the confidence of the model.
Table 1. Comparison of the LMH code’s AUC performance on different methods
(a) AUC_LMH_All
Tissue   Method   Low        Medium     High
Brain    MLR      81.3±0.1   72.4±0.3   81.5±0.1
         BNN      89.2±0.4   75.2±0.3   88.0±0.4
         DNN      89.3±0.5   79.4±0.9   88.3±0.6
Heart    MLR      84.6±0.1   73.1±0.3   83.6±0.1
         BNN      91.1±0.3   74.7±0.3   89.5±0.2
         DNN      90.7±0.6   79.7±1.2   89.4±1.1
Kidney   MLR      86.7±0.1   75.6±0.2   86.3±0.1
         BNN      92.5±0.4   78.3±0.4   91.6±0.4
         DNN      91.9±0.6   82.6±1.1   91.2±0.9
Liver    MLR      86.5±0.2   75.6±0.2   86.5±0.1
         BNN      92.7±0.3   77.9±0.6   92.3±0.5
         DNN      92.2±0.5   80.5±1.0   91.1±0.8
Testis   MLR      85.6±0.1   72.3±0.4   85.2±0.1
         BNN      91.1±0.3   75.5±0.6   90.4±0.3
         DNN      90.7±0.6   76.6±0.7   89.7±0.7
(b) AUC_LMH_TV
Tissue   Method   Low        Medium     High
Brain    MLR      71.1±0.2   58.8±0.2   70.8±0.1
         BNN      77.9±0.5   61.1±0.5   76.5±0.7
         DNN      82.8±1.0   69.5±1.1   81.1±0.4
Heart    MLR      73.9±0.3   58.6±0.4   72.7±0.1
         BNN      78.1±0.3   58.9±0.3   75.7±0.3
         DNN      82.0±1.1   67.4±1.3   79.7±1.2
Kidney   MLR      79.7±0.3   64.3±0.2   79.4±0.2
         BNN      83.9±0.5   66.4±0.5   83.3±0.6
         DNN      86.2±0.6   73.2±1.3   85.3±1.2
Liver    MLR      80.1±0.5   63.7±0.3   79.4±0.3
         BNN      84.9±0.7   65.4±0.7   84.4±0.7
         DNN      87.7±0.6   69.4±1.2   84.8±0.8
Testis   MLR      77.3±0.2   60.8±0.3   77.0±0.1
         BNN      81.1±0.5   63.9±0.9   81.0±0.5
         DNN      84.6±1.1   67.8±0.9   83.5±0.9
Notes: ± indicates 1 standard deviation; top performances are shown in bold.
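For readers unfamiliar with the metric, a per-class AUC of the kind reported in Table 1 can be computed one class at a time, treating that class as positive and the other two as negative. The sketch below uses the rank interpretation of AUC; applying it to hard class labels is an assumption, since the excerpt does not say how the authors handled the soft targets, and the function names are hypothetical.

```python
import numpy as np

def auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 1/2)."""
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    pos, neg = scores[is_positive], scores[~is_positive]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def per_class_auc(prob, true_class):
    """One-vs-rest AUC for each column of an (n_examples, n_classes) array."""
    prob = np.asarray(prob, dtype=float)
    true_class = np.asarray(true_class)
    return [auc(prob[:, c], true_class == c) for c in range(prob.shape[1])]

# Toy predictions for the low / medium / high classes.
prob = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.6, 0.3],
                 [0.2, 0.2, 0.6],
                 [0.6, 0.3, 0.1]])
aucs = per_class_auc(prob, [0, 1, 2, 0])
```

The pairwise comparison is quadratic in the number of examples, which is fine for a sketch; a sort-based implementation would be preferable on the full dataset.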
subset of events that exhibit large tissue variability. Here, the
DNN significantly outperforms the BNN in all categories and
20. Model from previous work
S3 Model Architectures
[Figure: Genomic Features → … → L/M/H outputs for tissue 1, tissue 2, …, tissue 5 (Low-Medium-High Code)]
Fig. S3. Architecture of the Bayesian neural network (Xiong et al., 2011) used for comparison, where low-medium-high predictions are made separately for each tissue.
21. Results: comparison with previous work
• DNI code
– {B,D}NN-MLR:
  • Output the LMH code with a {B,D}NN
  • Predict the DNI code with an MLR that takes the LMH code as input
Table 2a shows the AUC_DvI for classifying decrease versus increase inclusion for all pairs of tissue. Both the DNN-MLR and DNN outperform the BNN-MLR by a good margin. Comparing the DNN with DNN-MLR, the DNN shows some gain in differentiating brain and heart AS patterns from other tissues. The performance of differentiating the remaining tissues (kidney, liver and testis) with each other is similar between the DNN and DNN-MLR. We note that the similarity between the DNN and DNN-MLR in terms of performance can be due to the use of soft labels for training. Using MLR directly on the
Table 2. Comparison of the DNI code’s performance in terms of the AUC for decrease versus increase (AUC_DvI) and change versus no change (AUC_Change)
(a) AUC_DvI
Comparison                MLR        BNN-MLR    DNN-MLR    DNN
Brain versus Heart        50.3±0.2   65.3±0.3   77.9±0.1   79.4±0.7
Brain versus Kidney       48.8±0.8   73.7±0.2   83.0±0.1   83.3±0.8
Brain versus Liver        48.3±1.1   69.1±0.4   81.6±0.1   82.5±0.6
Brain versus Testis       51.2±0.5   72.9±0.5   82.3±0.2   82.9±0.7
Heart versus Kidney       50.0±1.5   72.6±0.3   82.4±0.1   86.1±1.0
Heart versus Liver        47.8±1.7   66.7±0.4   81.3±0.1   85.1±1.1
Heart versus Testis       51.1±0.5   68.3±0.7   82.4±0.1   84.8±0.8
Kidney versus Liver       49.4±0.8   54.7±0.6   76.8±0.5   76.2±1.0
Kidney versus Testis      51.9±0.5   65.0±0.8   79.9±0.2   82.5±1.0
Liver versus Testis       51.3±0.6   65.0±0.9   79.1±0.1   81.8±1.3
(b) AUC_Change
Comparison                MLR        BNN-MLR    DNN-MLR    DNN
Change versus No change   74.7±0.1   76.6±0.8   79.9±0.8   86.5±1.0
Note: ± indicates 1 standard deviation; top performances are shown in bold.
Table 3. Performance of the DNN evaluated on a different RNA-Seq experiment
(a) AUC_LMH_All
Tissue   Low        Medium     High
Brain    88.1±0.5   76.1±1.0   87.0±0.6
Heart    90.7±0.5   78.4±1.3   89.0±1.0