MICROARRAY CLASSIFICATION USING COMPUTATIONAL INTELLIGENT TECHNIQUES By Smitarani Satpathy (Roll no: 312VMIT11025) Guided by Prof. SWATI VIPSITA
Introduction Bioinformatics can be defined as the application of computer technology to the management of biological information, encompassing the study of inherent genetic information, molecular structure, the resulting biochemical functions, and the exhibited phenotypic properties. Data mining techniques are used in bioinformatics to extract meaningful relationships from these data. Biological data mining is an emerging field of research and development that poses challenges and opens possibilities in this direction.
Continued The major pattern recognition tasks in data mining considered here are clustering, classification, feature selection, and rule generation. The broad areas of bioinformatics are genomic sequences, protein structure, gene expression microarrays, and gene regulatory networks.
Related Work Cancer genes can be identified from microarray data using supervised machine learning algorithms. Abnormal chromosome data can also be read by this process. Tumors can be diagnosed from microarray data using intelligent techniques.
WHY USE INTELLIGENT TECHNIQUES like neural networks (NN), genetic algorithms (GA), fuzzy logic (FL), and particle swarm optimization (PSO)? Handling huge amounts of biological data is a real challenge for intelligent techniques, as these data mostly contain noisy samples. Since the work entails processing huge amounts of incomplete or ambiguous biological data, concepts such as neural networks, fuzzy logic, and rough sets can be used to handle uncertainty and ambiguity.
MOTIVATION: WHY GO FOR MICROARRAY CLASSIFICATION? The most appealing feature is that information about the DNA sequence is not required to construct and use microarrays; matching and manipulating sequences is a cumbersome task when the input sequence is large.
OBJECTIVE The main objective of the work is to implement neural networks, genetic algorithms (GA), and hybrid neuro-genetic techniques to maximize the accuracy of the classifier. Correct classification is of great concern to biologists and researchers, as the correct drug needs to be identified for the treatment of patients.
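One possible reading of the neuro-genetic objective above is a genetic algorithm that evolves the weights of a very small neural network. The sketch below is an illustration only, not the method used in this work: the single sigmoid neuron, the function names (`predict`, `accuracy`, `evolve_weights`), the GA settings, and the toy expression values are all assumptions made for the example.

```python
import math
import random


def predict(weights, sample):
    """Single sigmoid neuron: weights[0] is the bias, the rest pair with genes."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], sample))
    z = max(-60.0, min(60.0, z))  # clamp so math.exp stays in range
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0


def accuracy(weights, samples, labels):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(weights, s) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def evolve_weights(samples, labels, pop_size=30, generations=40,
                   mutation_scale=0.3, seed=0):
    """Hypothetical GA: fitness is training accuracy of the neuron."""
    rng = random.Random(seed)
    n_weights = len(samples[0]) + 1
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda w: accuracy(w, samples, labels),
                        reverse=True)
        history.append(accuracy(population[0], samples, labels))
        parents = population[:pop_size // 2]  # truncation selection
        children = [population[0][:]]         # elitism: keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [w + rng.gauss(0, mutation_scale) for w in child]
            children.append(child)
        population = children
    best = max(population, key=lambda w: accuracy(w, samples, labels))
    history.append(accuracy(best, samples, labels))
    return best, history


# Synthetic toy data (not real microarray measurements):
# rows are samples, columns are expression levels of three genes.
samples = [[2.1, 0.3, 1.0], [1.9, 0.4, 0.2], [2.4, 0.1, 0.8],
           [0.2, 0.5, 0.9], [0.4, 0.2, 0.1], [0.1, 0.6, 0.7]]
labels = [1, 1, 1, 0, 0, 0]

best_weights, fitness_history = evolve_weights(samples, labels)
print("training accuracy of best individual:", fitness_history[-1])
```

Because the best individual is copied unchanged into the next generation, the best training accuracy cannot decrease from one generation to the next. A real study would measure accuracy on held-out samples rather than on the training set.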
MICROARRAY: Definition A microarray is a multiplex lab-on-a-chip. It is a 2-D array on a solid substrate (usually a glass slide or a silicon thin-film cell) that assays large amounts of biological material using high-throughput screening methods.
TYPES OF MICROARRAY: DNA microarray, protein microarray, peptide microarray, tissue microarray, cellular microarray, chemical compound microarray, antibody microarray, carbohydrate microarray.
MICROARRAY CLASSIFICATION Initially, interest focused on genes co-expressing across sets of experimental conditions, which essentially implied the use of CLUSTERING techniques. Recently, the focus has shifted to finding genes that are differentially expressed among distinct classes of experiments or correlated with diverse clinical outcomes, as well as to building predictive models.
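As a rough illustration of finding differentially expressed genes between two classes, the sketch below ranks genes by the absolute value of Welch's t-statistic. The choice of statistic, the function names (`welch_t`, `rank_genes`), the gene names, and the expression values are assumptions made for the example; the slides do not specify a particular test.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0.0:
        # Both groups are constant: no spread to scale the difference by.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def rank_genes(expression, labels):
    """Return (gene, t) pairs sorted by |t|, strongest difference first."""
    ranked = []
    for gene, values in expression.items():
        class_1 = [v for v, y in zip(values, labels) if y == 1]
        class_0 = [v for v, y in zip(values, labels) if y == 0]
        ranked.append((gene, welch_t(class_1, class_0)))
    return sorted(ranked, key=lambda item: abs(item[1]), reverse=True)


# Synthetic toy data: six samples, the first three in class 1.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_A": [8.1, 7.9, 8.4, 2.0, 2.3, 1.8],  # clearly higher in class 1
    "gene_B": [5.0, 5.2, 4.9, 5.1, 4.8, 5.0],  # about the same in both
    "gene_C": [3.0, 6.5, 1.2, 2.9, 6.1, 1.5],  # noisy, no class effect
}

for gene, t in rank_genes(expression, labels):
    print(f"{gene}: t = {t:.2f}")
```

With thousands of genes and few samples, as is typical for microarrays, such a ranking would normally be followed by a multiple-testing correction before any gene is reported as differentially expressed.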
Basic Concept of Gene Expression The cells in all tissues of an organism carry the same set of genes. Central dogma: Gene -> mRNA -> Protein. Why do tissues function differently? The activity of genes varies across tissues and in response to external stimuli and perturbation. Gene expression is the measure of gene activity: GENE EXPRESSION = #mRNA (the number of mRNA copies of the gene).
Application Areas: SNP detection (a single-nucleotide polymorphism, or SNP, pronounced "snip", is a DNA sequence variation occurring when a single nucleotide at a given position differs between individuals); evolution and ecological genomics; drug discovery and development; gene expression; tumor classification.
CONCLUSION Microarray technology is a strong candidate for use in medical disease diagnosis because of the following advantages: it detects smaller changes than routine karyotypes in cases of chromosomal abnormalities; it is fast (as many as 150 copies of an array of 12000 genes can be printed in a single day); it is relatively cheap to use; it is user friendly (neither radioactive nor toxic); it is adaptable and comprehensive; it allows many genes to be studied simultaneously; and it can measure the expression level of thousands of genes in a single experiment.