Introducing BioDec ...
● Bioinformatic turnkey
● Bioinformatics consulting
● Biosequence DB
● Web applications
● System integration
Via Calzavecchio 20/2
I-40033 Casalecchio di Reno (BO)
www.biodec.com
Close ties to UniBo Biocomputing Unit
Bioinformatic turnkey solutions
• Integrated solutions:
• Bioinformation / Lab Data Management System - CMS-
based lab data, diary and workflow management;
• Molecular Anthropology - Haplogroup tree browsers;
• Plone4bio - CMS-based biosequence management and
• Decoders (predictors).
• Development, engineering and integration of custom software;
• Annotated databases of biosequences (e.g. genomes).
• Bioinformation management;
• Machine-learning methods and annotation pipelines;
• Web applications.
Bioinformation Management System
A bundle including Plone4Bio, the annotated databases, and
BioDec's own lab data management software, to collect, manage and
analyse laboratory data from specific molecular biology techniques.
• Lab diary, recording aims and conditions of the experiment, including
structured pages for the recording of experimental data, specific
to each technique. Data may include digital images, and may be
shared within a working group or an organization according to
flexible, entirely user-defined security levels. Presently supports:
● Immunogenic assays (due 1st quarter 2010);
● Polymerase Chain Reaction techniques;
● Electrophoresis blot techniques.
• Customizable workflows are supported, and may differ for specific
techniques or Customer specifications, thus ensuring a full match
between the rules applied in the lab and the data lifecycle.
BioDec BMS screenshots
BioDec BMS - PCR test blot view
● An application for
● Allows to store,
and retrieve data
of a standard web
● Population subsets can be
easily selected (by location,
haplogroup, sex, MRCA) using
simple query forms, whose
reports also provide basic
statistics and charts on the
● By leveraging Plone access-control features, the
application can handle selective access to stored data,
allowing fine-grained control over what can be accessed
by an anonymous vs. an authenticated user, so it can be
used both for internal information sharing and data
● Data are stored in the
"Subject" data structure,
containing both personal and
● For each subject, the application calculates the
haplogroup on the fly, based on its tested UEPs, as well
as the most recent common ancestor for each sexual
lineage, based on the stored population.
● The system also checks data consistency and alerts
the user to potential errors, such as inconsistent or
conflicting UEPs or out-of-range STRs within a given
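The haplogroup assignment and consistency checks described above can be illustrated with a minimal sketch. It is not BioDec's implementation: the tree, the haplogroup names (`X1`, `X1a`, `X2`), the marker names (`uep1`...), the STR names and their allowed ranges are all hypothetical, and the rules (deepest derived marker wins; a derived marker below an ancestral one is flagged) are assumptions made for illustration.

```python
# Hypothetical toy tree: haplogroup -> (parent, defining UEP marker).
# Names and markers are illustrative only, not real phylogeny data.
TREE = {
    "root": (None, None),      # root has no defining marker
    "X1": ("root", "uep1"),
    "X1a": ("X1", "uep2"),
    "X2": ("root", "uep3"),
}

# Hypothetical allowed repeat-count ranges for two made-up STR loci.
STR_RANGES = {"STR_A": (10, 19), "STR_B": (8, 15)}


def path_to_root(hg):
    """Return the list of haplogroups from hg up to the root (inclusive)."""
    path = []
    while hg is not None:
        path.append(hg)
        hg = TREE[hg][0]
    return path


def assign_haplogroup(ueps):
    """ueps maps marker -> True (derived) or False (ancestral).

    Untested markers are simply absent. Returns (haplogroup, warnings).
    """
    warnings = []
    derived = [hg for hg, (_, marker) in TREE.items()
               if marker is not None and ueps.get(marker) is True]

    # Inconsistency: a derived marker whose ancestor marker tested ancestral.
    for hg in derived:
        for ancestor in path_to_root(hg)[1:]:
            marker = TREE[ancestor][1]
            if marker is not None and ueps.get(marker) is False:
                warnings.append(f"{TREE[hg][1]} derived but {marker} ancestral")

    # Pick the deepest derived haplogroup; flag derived markers on other branches.
    best = "root"
    for hg in derived:
        if best in path_to_root(hg):
            best = hg
        elif hg not in path_to_root(best):
            warnings.append(f"conflicting branches: {best} vs {hg}")
    return best, warnings


def check_strs(strs):
    """Return the names of STR loci whose repeat count is out of range."""
    return [name for name, count in strs.items()
            if name in STR_RANGES
            and not (STR_RANGES[name][0] <= count <= STR_RANGES[name][1])]
```

For example, `assign_haplogroup({"uep1": True, "uep2": True})` places the subject in the toy haplogroup `X1a` with no warnings, while a subject with `uep2` derived and `uep1` ancestral is still placed in `X1a` but flagged for review.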
● A Plone-based (http://plone.net), feature-rich graphic
BioSQL browser, to search and explore data and
metadata (annotations) from biosequence databases.
May integrate custom-made predictors (“Decoders”);
● We publicly released the base version, including an
example predictor and documentation, as open-source
software, available from http://plone4bio.org;
● Plone4Bio is reliable and modular, and is the basis of
BioDec commercial software bundles.
Plone4bio “Add new...” menu
Plone4bio LiveSearch result
Plone4Bio commercial release
User-managed pipeline for biosequence analysis and comparison,
● Full CMS features: several standard and user-defined content
types (including files, pages, RSS feeds...);
● Full set of Decoders: to annotate biosequences from any
● User-defined biosequences: customer's own biosequences
may be instantiated and populated;
● Annotated databases: our decoders have been used to
annotate and cross-link sections of public databases from UniProt,
NCBI and Ensembl, thus providing a reliable and meaningful
metadata set. Custom annotated databases are available on
Tools from machine learning
[Diagram labels: Known sequences (DB subsets); Known structures; Rules; New sequence]
• Artificial Neural Networks (ANNs)
• Hidden Markov Models (HMMs)
• Support Vector Machines (SVMs)
Protein sequence tools
● Transmembrane all-alpha sequence
● Transmembrane all-beta sequence
● Signal peptide
● GPI-anchoring prediction
● Coiled-coil segment prediction
● Disordered region prediction
● Subcellular localization prediction
● Sequence classification
Protein Structure tools
● Interaction patch prediction
● Fold recognition and modelling
RNA tools
● siRNA design
Our toolbox is built to be MODULAR, so we can assemble analysis systems
according to your specifications.
Papers about our decoders have been published in journals such as
Bioinformatics, Journal of Molecular Biology, Nucleic Acids Research.
A BioDec Decoder: ZenDock
● Interaction patch decoder
● Analyzes protein solvent-
exposed surface for putative
returning a “fuzzy”
● Interactors are correlated
and grouped into patches.
● Results are mapped on the
protein 3D structure and
made available through a
[Figure legend: Int / non-Int]
Case study - Searching for the “membrane
fraction” of a bacterial proteome
• Angler, a Pipeline for the annotation of proteome biosequences
such as prokaryotic proteomes, is used to select candidates for
• The annotation results (“Proteome Atlases”) are published in
Angler classifies gram-negative proteome sequences into nine
different classes, including the all-beta membrane proteins and the
soluble secreted proteins, most relevant to vaccine
[Pipeline diagram labels: Proteome; Generate profiles; Predictions; Classify: 9 classes]
Predictions:
✔ Signal peptides
✔ Alpha-helical TMP
✔ Fold recognition
✔ Coiled coils
✔ Disordered regions
EcoGene Knowledge Base (E. coli K12)
All-alpha TMP 92.3% 92.6%
All-beta TMP 86.7% 75.0%
Soluble 96.9% 95.9%
All-alpha TMPs represent less than 20% of a proteome.
All-beta TMPs represent less than 5% of a proteome.
Outer Membrane (Rhodobacter capsulatus)
Inner Membrane (Halobacterium salinarum)
Case Study - siRNA Design
• Sirena, a siRNA design engine, used two very
large, consistent and independent data sets from
the literature: one for fitting, the other for testing.
• Based on machine learning: a neural network.
• Prediction is performed on a 19-nt input window.
• Input is the sequence and the sequence composition for each
window in the mRNA sequence.
• Output is an estimated knock-down efficiency (Q).
• Around 70% of the reported candidates are expected to have
experimental knock-down efficiency greater than 80%.
• Successfully field-tested by TargetHerpes virologists.
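The windowing and input encoding described above might look like the following minimal sketch. The slide states only that the input is "the sequence and the sequence composition" of each 19-nt window; the one-hot encoding, the `ACGU` alphabet, and the function names are assumptions made for illustration, and the neural network that turns these features into the efficiency estimate Q is not reproduced here.

```python
WINDOW = 19          # window length stated on the slide
ALPHABET = "ACGU"    # assumed uppercase RNA alphabet


def windows(mrna, size=WINDOW):
    """Yield (start_index, window) for every full-length window in the mRNA."""
    for i in range(len(mrna) - size + 1):
        yield i, mrna[i:i + size]


def encode(window):
    """Encode one window as one-hot positions followed by base composition.

    The result has len(window) * 4 one-hot values plus 4 composition values.
    This encoding is an assumption, not Sirena's documented input format.
    """
    one_hot = [1.0 if base == letter else 0.0
               for base in window for letter in ALPHABET]
    composition = [window.count(letter) / len(window) for letter in ALPHABET]
    return one_hot + composition
```

Each encoded window would then be passed to a trained regressor; a 21-nt sequence, for instance, yields three candidate windows of 80 features each.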
Case study - Discovering new
receptors of pharmaceutical interest
I have a compound library active
on some receptor subclasses.
Q: Which sequences in the
human genome may be
targeted by my library?
● Find all known, classified
● Build an MSA using both sequence
and 7TM-topology information.
● From MSA, derive and partition a
● Train class-specific HMMs.
● Scan the human genome using our
“Septimus” tool to find putative
● Classify using the class HMMs.
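The final classification step can be sketched as follows: score a candidate sequence against each class-specific HMM and assign the class with the highest likelihood. This is a toy illustration, not the Septimus tool or BioDec's models; the two-state HMMs, the two-letter alphabet (`H` for hydrophobic, `P` for polar), the class names and all probabilities are hypothetical.

```python
import math


def log_likelihood(seq, hmm):
    """Forward algorithm with per-step scaling; returns log P(seq | hmm)."""
    start, trans, emit = hmm["start"], hmm["trans"], hmm["emit"]
    states = list(start)
    alpha = {s: start[s] * emit[s][seq[0]] for s in states}
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = {s: emit[s][seq[t]] * sum(alpha[r] * trans[r][s] for r in states)
                     for s in states}
        scale = sum(alpha.values())
        loglik += math.log(scale)
        alpha = {s: a / scale for s, a in alpha.items()}
    return loglik


def classify(seq, class_hmms):
    """Return the class whose HMM gives the sequence the highest likelihood."""
    return max(class_hmms, key=lambda name: log_likelihood(seq, class_hmms[name]))


# Hypothetical class models sharing the same topology, differing in emissions.
TRANS = {"s1": {"s1": 0.9, "s2": 0.1}, "s2": {"s1": 0.1, "s2": 0.9}}
START = {"s1": 0.5, "s2": 0.5}
CLASS_HMMS = {
    "classA": {"start": START, "trans": TRANS,
               "emit": {"s1": {"H": 0.8, "P": 0.2}, "s2": {"H": 0.6, "P": 0.4}}},
    "classB": {"start": START, "trans": TRANS,
               "emit": {"s1": {"H": 0.2, "P": 0.8}, "s2": {"H": 0.4, "P": 0.6}}},
}
```

In practice, profile HMMs trained on the partitioned alignment would replace these toy models, and a score threshold would typically be needed so that sequences matching no class are left unclassified.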
● Consistent 7TM prediction
for all the targets, useful for
● Some newly-classified
targets have been
– New targets
– New lead molecules
Case study - blocking protein-protein interaction
● ZenDock: multiple analyses on each
available structure, then consensus.
● Predicted a new potential interaction
● Very good agreement between
prediction and experiments.
● Peptide found is now a new-drug lead.
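The "multiple analyses, then consensus" step can be illustrated with a small sketch. The slides do not say how ZenDock builds its consensus; the majority vote over per-structure predictions below, the `min_fraction` parameter and the use of residue numbers as identifiers are all assumptions.

```python
from collections import Counter


def consensus(predictions, min_fraction=0.5):
    """Keep residues predicted as interacting in more than min_fraction of runs.

    predictions: list of sets of residue identifiers, one set per analysed
    structure. Majority voting is an assumed rule, not ZenDock's documented one.
    """
    votes = Counter(res for pred in predictions for res in pred)
    needed = min_fraction * len(predictions)
    return {res for res, n in votes.items() if n > needed}
```

For example, with three structures predicting `{1, 2, 3}`, `{2, 3}` and `{3, 4}`, only residues 2 and 3 are supported by a majority and survive into the consensus patch.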