Cloud applications - Protein structure prediction and gene expression data analysis
Protein structure prediction
Gene expression data analysis
Protein structure prediction
• Proteins are chains of amino acids joined
together by peptide bonds.
• Many conformations of this chain are possible
due to the rotation of the chain about each
alpha-carbon (Cα) atom.
• It is these conformational changes that are
responsible for differences in the
three-dimensional structure of proteins.
Why we are using cloud computing
• These applications require high computing
capabilities and often operate on large datasets
that cause extensive I/O operations.
• Protein structure prediction is a computationally
intensive task that is fundamental to different
types of research in the life sciences.
Benefits of protein structure prediction
• Manual 3D structure determination is difficult, slow and
expensive.
• Structure helps in the design of new drugs for the
treatment of diseases.
• The geometric structure of a protein cannot be
directly inferred from the sequence of genes that
compose its structure, but it is the result of
complex computations aimed at identifying the
structure that minimizes the required energy.
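The idea of searching for the conformation with the lowest energy can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the energy function is hypothetical (not a real force field), the chain is reduced to a list of torsion angles, and the greedy random search stands in for the far more complex computations used in practice.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical energy: each torsion angle prefers -60 degrees,
    plus a weak coupling between neighbouring angles."""
    preferred = math.radians(-60.0)
    local = sum(1.0 - math.cos(a - preferred) for a in angles)
    coupling = sum(0.1 * (1.0 - math.cos(a - b))
                   for a, b in zip(angles, angles[1:]))
    return local + coupling


def minimize(n_angles, steps=5000, step_size=0.3, seed=0):
    """Greedy random search: perturb one angle at a time and keep
    the change only if it lowers the energy."""
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    start = best = toy_energy(angles)
    for _ in range(steps):
        i = rng.randrange(n_angles)
        old = angles[i]
        angles[i] = old + rng.uniform(-step_size, step_size)
        energy = toy_energy(angles)
        if energy < best:
            best = energy        # accept the lower-energy conformation
        else:
            angles[i] = old      # reject and restore the previous angle
    return start, best, angles


if __name__ == "__main__":
    start, best, _ = minimize(n_angles=8)
    print(f"energy before: {start:.3f}, after: {best:.3f}")
```

Even in this toy setting the number of energy evaluations grows with the chain length and the number of search steps, which hints at why real predictions need large amounts of computing power.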
• In the above figure, the web portal frees
scientists from worrying about the prediction task;
all the work is done by the cloud service.
A popular implementation based on support vector
machines divides the pattern recognition problem
into three phases:
• initialization,
• classification,
• and a final phase.
These phases execute in parallel to reduce the
computational time of the prediction.
The prediction algorithm is then translated into a
task graph that is submitted to Aneka. Once the
task is completed, the middleware makes the
results available for visualization through the
portal.
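A task graph of this kind can be sketched in a few lines. This is only an illustration using the Python standard library (graphlib needs Python 3.9 or later); it does not use or imitate the Aneka API. The task names, the split of classification into three independent tasks, and the run_task placeholder are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
TASK_GRAPH = {
    "initialization": set(),
    "classify_1": {"initialization"},
    "classify_2": {"initialization"},
    "classify_3": {"initialization"},
    "final": {"classify_1", "classify_2", "classify_3"},
}


def run_task(name):
    """Placeholder for the real work of a task."""
    return f"{name}: done"


def execute(graph, max_workers=3):
    """Run the graph wave by wave; tasks in the same wave have no
    unmet dependencies, so they are executed concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves, results = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            for name, output in zip(ready, pool.map(run_task, ready)):
                results[name] = output
            sorter.done(*ready)
            waves.append(ready)
    return waves, results


if __name__ == "__main__":
    waves, results = execute(TASK_GRAPH)
    for number, wave in enumerate(waves, start=1):
        print(f"wave {number}: {wave}")
```

In a real deployment the middleware, not a local thread pool, would schedule each ready task on the available cloud nodes.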
Gene expression data analysis
• Gene expression profiling is the measurement of
the expression levels of thousands of genes at
once. Consequently, it is widely used for cancer
diagnosis and treatment.
• It is also used in medical diagnosis and drug
design.
• Cancer is a disease characterized by uncontrolled
cell growth and proliferation. This behavior occurs
because genes regulating the cell growth mutate.
This means that all the cancerous cells contain
mutated genes.
• This uncontrolled growth produces different
types of tumors. In this context, gene expression
profiling is used to provide a more accurate
classification of tumors.
• The dimensionality of typical gene expression
datasets ranges from several thousand to
tens of thousands of genes.
• For datasets this large, classification is
handled by the eXtended Classifier System (XCS),
which has been successfully used for classifying
large datasets.
• Cloud-CoXCS is a machine learning
classification system for gene expression
datasets on the cloud infrastructure. It extends
the XCS model by introducing a coevolutionary
approach.
• CoXCS divides the entire search space into
subdomains and employs the standard XCS
algorithm in each of these subdomains.
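The divide-and-train idea can be sketched as follows. Several things here are assumptions: the search space is split by partitioning the feature (gene) indices, a simple nearest-centroid learner stands in for XCS (it is NOT an XCS implementation), the subdomain models are combined by majority vote, and the coevolutionary exchange between subdomains is not modelled. All function names and sample labels are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_features(n_features, n_subdomains):
    """Partition feature indices into contiguous subdomains."""
    size, extra = divmod(n_features, n_subdomains)
    parts, start = [], 0
    for k in range(n_subdomains):
        end = start + size + (1 if k < extra else 0)
        parts.append(list(range(start, end)))
        start = end
    return parts


def train_centroids(samples, labels, features):
    """Stand-in learner (not XCS): per-class mean over the given features."""
    sums, counts = {}, {}
    for row, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def train_all(samples, labels, n_subdomains):
    """Train one model per subdomain; the subdomains are independent,
    so they can be trained concurrently."""
    parts = split_features(len(samples[0]), n_subdomains)
    with ThreadPoolExecutor() as pool:
        models = list(pool.map(
            lambda p: train_centroids(samples, labels, p), parts))
    return models, parts


def predict(models, parts, row):
    """Each subdomain model votes for its nearest class centroid."""
    votes = {}
    for centroids, features in zip(models, parts):
        def dist(lab):
            return sum((row[f] - c) ** 2
                       for f, c in zip(features, centroids[lab]))
        winner = min(sorted(centroids), key=dist)
        votes[winner] = votes.get(winner, 0) + 1
    return max(sorted(votes), key=votes.get)


if __name__ == "__main__":
    samples = [[0.0, 0.0, 0.0, 0.0, 0.0], [0.1, 0.0, 0.2, 0.0, 0.1],
               [1.0, 1.0, 1.0, 1.0, 1.0], [0.9, 1.0, 0.8, 1.0, 1.1]]
    labels = ["normal", "normal", "tumor", "tumor"]
    models, parts = train_all(samples, labels, n_subdomains=2)
    print(predict(models, parts, [1.0, 1.0, 1.0, 1.0, 1.0]))
```

Because each subdomain is trained independently, the work maps naturally onto separate cloud nodes, one per subdomain.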