4. Language Overview
• BEL statements capture knowledge
• BEL annotations provide information about
statements
– Citation, experimental context, etc.
• BEL terms are composed using BEL functions applied
to namespace values
• BEL relationships connect BEL terms
5. BEL Statements
• Basic statement types:
– Term Expression alone:
complex(p(HGNC:CCND1), p(HGNC:CDK4))
– Term Expression, Relationship, Term Expression:
p(HGNC:CCND1) directlyIncreases kin(p(HGNC:CDK4))
• Complex statement type:
– A causal statement can be used as the target of another causal statement
– Term Expression, Causal Relationship, Causal Statement:
p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))
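The three statement shapes above can be told apart by looking for a relationship outside any parentheses. The following is a minimal Python sketch, not part of the BEL Framework; the function names are hypothetical, and it assumes well-formed, single-line statements with single spaces around the relationship.

```python
def find_top_level_space(text):
    """Return the index of the first space outside parentheses and quotes, or -1."""
    depth = 0
    in_quotes = False
    for index, char in enumerate(text):
        if char == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue
        elif char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        elif char == " " and depth == 0:
            return index
    return -1


def parse_statement(text):
    """Split a statement into subject, relationship, and object.

    A statement that is a single term has no relationship or object.
    A parenthesized object is treated as a nested statement and parsed recursively.
    """
    text = text.strip()
    cut = find_top_level_space(text)
    if cut == -1:
        return {"subject": text, "relationship": None, "object": None}
    subject = text[:cut]
    relationship, _, target = text[cut:].strip().partition(" ")
    target = target.strip()
    if target.startswith("(") and target.endswith(")"):
        target = parse_statement(target[1:-1])
    return {"subject": subject, "relationship": relationship, "object": target}


nested = parse_statement(
    "p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))"
)
print(nested["relationship"], nested["object"]["relationship"])
```

Representing the nested statement as a nested dictionary keeps the outer subject separate from the inner causal statement it acts on.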
6. BEL Annotations
• Annotations provide information about one or more BEL
Statements
SET Citation = {"PubMed", "J Mol Med", "12682725", "2003-03-14","Limbourg FP|Liao JK",""}
SET Evidence = "high-dose steroid treatment decreases
vascular inflammation and ischemic
tissue damage after myocardial infarction and stroke
through direct vascular effects involving the
nontranscriptional activation of eNOS"
SET Species = "9606"
SET Tissue = "Vascular System"
SET Disease = "Stroke"
a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")
7. BEL Terms
• BEL terms have the following components:
– Function
• Required
• Can be nested to create complex terms
– Namespace Abbreviation
• Optional
– Value
• Required
• If a namespace is given, the value is found in that namespace
• BEL terms from different namespaces are unified during
compilation using information in the BEL Namespace
Equivalence documents
• General form of a term: f(ns:value)
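The function, optional namespace, and value of a non-nested term can be pulled apart with a regular expression. This is a minimal Python sketch under stated assumptions: the pattern and the name parse_simple_term are illustrative, not part of any BEL tool, and nested terms are deliberately rejected.

```python
import re

# function name, optional "namespace:" prefix, then a quoted or bare value
TERM_PATTERN = re.compile(
    r'^(?P<function>[A-Za-z]+)\(\s*'
    r'(?:(?P<namespace>[A-Za-z0-9]+):)?'
    r'(?P<value>"[^"]*"|[^\s"(),:]+)'
    r'\s*\)$'
)


def parse_simple_term(term):
    """Return (function, namespace, value) for a term of the form f(ns:value)."""
    match = TERM_PATTERN.match(term.strip())
    if match is None:
        raise ValueError(f"not a simple f(ns:value) term: {term!r}")
    value = match.group("value").strip('"')
    return match.group("function"), match.group("namespace"), value


print(parse_simple_term("p(HGNC:AKT1)"))
print(parse_simple_term('bp(GO:"cellular senescence")'))
```

The namespace comes back as None when it is omitted, which mirrors its optional status in the list above.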
8. BEL Functions
• Types of functions:
– Abundances
– Modifications of abundances
– Processes
– Activities
– Transformations
• Abundances and processes are applied directly to
namespace values
• All other functions take abundance terms as arguments or, in the case of
modifications, appear as arguments within abundance functions
9. BEL Functions
• BEL Functions enable
representation of
different aspects of a
value
– E.g., AKT1 (EGID:207)
can be represented as a
• Gene
• RNA
• Protein
• Modified Protein
• Activity
10. Abundance Functions
• Most abundance functions take namespace values
– complexAbundance() can take a namespace value OR a list of
abundance terms
– compositeAbundance() must take a list of abundance terms
Short Form | Long Form | Example | Example Description
a() | abundance() | a(CHEBI:water) | the abundance of water
p() | proteinAbundance() | p(HGNC:IL6) | the abundance of human IL6 protein
complex() | complexAbundance() | complex(NCH:"AP-1 Complex") | the abundance of the AP-1 complex
complex() | complexAbundance() | complex(p(MGI:Fos), p(MGI:Jun)) | the abundance of the complex comprised of mouse Fos and Jun proteins
composite() | compositeAbundance() | composite(p(HGNC:IL6), a(CHEBI:dexamethasone)) | the abundances of IL6 protein and dexamethasone, together
g() | geneAbundance() | g(HGNC:ERBB2) | the abundance of the ERBB2 gene (DNA)
m() | microRNAAbundance() | m(MGI:Mir21) | the abundance of mouse Mir21 microRNA
r() | rnaAbundance() | r(HGNC:IL6) | the abundance of human IL6 RNA
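Because every abundance function has a short and a long form, a term can be rewritten from one to the other mechanically. This is a minimal Python sketch; the helper name is hypothetical, and it assumes no quoted namespace value contains text that looks like a function call.

```python
import re

# Short form -> long form for the abundance functions
ABUNDANCE_LONG_FORMS = {
    "a": "abundance",
    "p": "proteinAbundance",
    "complex": "complexAbundance",
    "composite": "compositeAbundance",
    "g": "geneAbundance",
    "m": "microRNAAbundance",
    "r": "rnaAbundance",
}


def expand_abundance_functions(term):
    """Rewrite short-form abundance functions in a term to their long forms.

    Function names that are not abundance functions (kin, pmod, ...) are left as-is.
    """
    def replace(match):
        name = match.group(1)
        return ABUNDANCE_LONG_FORMS.get(name, name) + "("

    return re.sub(r"\b([A-Za-z]+)\(", replace, term)


print(expand_abundance_functions("complex(p(MGI:Fos), p(MGI:Jun))"))
```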
11. Modification Functions
• Modifications are functions used as arguments within
abundance functions
– Post-translational modifications
– Sequence variants (mutations, polymorphisms)
Short Form | Long Form | Example | Example Description
pmod() | proteinModification() | p(HGNC:AKT1, pmod(P)) | the abundance of human AKT1 protein modified by phosphorylation
pmod() | proteinModification() | p(MGI:Rela, pmod(A, K)) | the abundance of mouse Rela protein acetylated at an unspecified lysine
pmod() | proteinModification() | p(HGNC:HIF1A, pmod(H, N, 803)) | the abundance of human HIF1A protein hydroxylated at asparagine 803
sub() | substitution() | p(HGNC:PIK3CA, sub(E, 545, K)) | the abundance of the human PIK3CA protein in which glutamic acid 545 has been substituted with lysine
trunc() | truncation() | p(HGNC:ABCA1, trunc(1851)) | the abundance of human ABCA1 protein that has been truncated at amino acid residue 1851 via introduction of a stop codon
fus() | fusion() | p(HGNC:BCR, fus(HGNC:JAK2, 1875, 2626)) | the abundance of a fusion protein of the 5' partner BCR and 3' partner JAK2, with the breakpoint for BCR at 1875 and JAK2 at 2626
fus() | fusion() | p(HGNC:BCR, fus(HGNC:JAK2)) | the abundance of a fusion protein of the 5' partner BCR and 3' partner JAK2
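The pmod() examples above take one to three arguments: a modification type, an optional amino acid code, and an optional position. This is a minimal Python sketch of reading those arguments. The lookup tables cover only the codes that appear in this deck, and the names find_protein_modification and ProteinModification are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Only the codes used in this deck's examples; not a complete list
MODIFICATION_TYPES = {
    "P": "phosphorylation",
    "A": "acetylation",
    "H": "hydroxylation",
    "G": "glycosylation",
}
AMINO_ACIDS = {"K": "lysine", "N": "asparagine", "S": "serine"}

# pmod(type), pmod(type, residue), or pmod(type, residue, position)
PMOD_PATTERN = re.compile(
    r"pmod\(\s*([A-Z])\s*(?:,\s*([A-Z])\s*(?:,\s*(\d+)\s*)?)?\)"
)


class ProteinModification(NamedTuple):
    kind: str
    residue: Optional[str]
    position: Optional[int]


def find_protein_modification(term):
    """Return the first pmod() found in a term, or None if there is none."""
    match = PMOD_PATTERN.search(term)
    if match is None:
        return None
    kind, residue, position = match.groups()
    return ProteinModification(
        MODIFICATION_TYPES.get(kind, kind),
        AMINO_ACIDS.get(residue, residue),
        int(position) if position is not None else None,
    )


print(find_protein_modification("p(HGNC:HIF1A, pmod(H, N, 803))"))
```

Unknown codes are passed through unchanged rather than rejected, since the tables here are intentionally incomplete.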
12. Process Functions
• Processes include biological phenomena that occur at the
level of the cell or organism
Short Form | Long Form | Example | Example Description
bp() | biologicalProcess() | bp(GO:"cellular senescence") | the biological process cellular senescence
path() | pathology() | path(MESHD:"Pulmonary Disease, Chronic Obstructive") | the pathology COPD
13. Activity Functions
• Applied to protein and complex abundances to specify the frequency of
events resulting from the molecular activity of the abundance
– This distinction is useful for proteins whose activities are regulated by post-
translational modification
Short Form | Long Form | Example | Example Description
cat() | catalyticActivity() | cat(p(RGD:Sod1)) | the catalytic activity of rat Sod1 protein
chap() | chaperoneActivity() | chap(p(HGNC:CANX)) | the events in which the human CANX (Calnexin) protein functions as a chaperone to aid the folding of other proteins
gtp() | gtpBoundActivity() | gtp(p(PFH:"RAS Family")) | the GTP-bound activity of RAS Family protein
kin() | kinaseActivity() | kin(complex(NCH:"AMP-activated protein kinase complex")) | the kinase activity of the AMP-activated protein kinase complex
act() | molecularActivity() | act(p(HGNC:TLR4)) | the ligand-bound activity of the human non-catalytic receptor protein TLR4; a more specific activity function is not applicable to TLR4 protein
pep() | peptidaseActivity() | pep(p(RGD:Ace)) | the peptidase activity of the rat angiotensin converting enzyme (ACE)
phos() | phosphataseActivity() | phos(p(HGNC:DUSP1)) | the phosphatase activity of human DUSP1 protein
ribo() | ribosylationActivity() | ribo(p(HGNC:PARP1)) | the ribosylation activity of human PARP1 protein
tscript() | transcriptionalActivity() | tscript(p(MGI:Trp53)) | the transcriptional activity of mouse Trp53 (p53) protein
tport() | transportActivity() | tport(complex(NCH:"ENaC Complex")) | the frequency of ion transport events mediated by the epithelial sodium channel (ENaC) complex
14. Transformation Functions
• Transformations are events in which one class of abundance is
transformed or changed into a second class of abundance
Short Form | Long Form | Example | Example Description
deg() | degradation() | deg(r(HGNC:MYC)) | the degradation of human MYC RNA
sec() | cellSecretion() | sec(p(MGI:Il6)) | the secretion of mouse Il6 protein
surf() | cellSurfaceExpression() | surf(p(RGD:Fas)) | the translocation of rat Fas protein to the cell surface
tloc() | translocation() | tloc(p(HGNC:NFE2L2), MESHCL:Cytoplasm, MESHCL:"Cell Nucleus") | the event in which human NFE2L2 protein is translocated from the cytoplasm to the nucleus
rxn() | reaction() | rxn(reactants(a(CHEBI:phosphoenolpyruvate), a(CHEBI:ADP)), products(a(CHEBI:pyruvate), a(CHEBI:ATP))) | the event in which the reactants phosphoenolpyruvate and ADP are converted into the products pyruvate and ATP
15. BEL Relationships
• Causal relationships
– increases, directlyIncreases, decreases, directlyDecreases,
rateLimitingStepOf, causesNoChange
• Correlative relationships
– negativeCorrelation, positiveCorrelation, association
• Biomarker relationships
– biomarkerFor, prognosticBiomarkerFor
• Assignment to groups
– hasMember, hasComponent, hasMembers, hasComponents
• Other
– isA, subProcessOf
• Genomic relationships
– transcribedTo, translatedTo, orthologousTo
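The statements in this deck often write the four main causal relationships with symbols: -> for increases, => for directlyIncreases, -| for decreases, and =| for directlyDecreases. This is a minimal Python sketch that resolves a symbol to its name and looks up its class from the list above. The dictionary and function names are hypothetical, and the class labels are shortened from the bullet headings.

```python
# Symbol short forms for the four main causal relationships
RELATIONSHIP_SYMBOLS = {
    "->": "increases",
    "=>": "directlyIncreases",
    "-|": "decreases",
    "=|": "directlyDecreases",
}

# Relationship classes as grouped on this slide
RELATIONSHIP_CLASSES = {
    "causal": {
        "increases", "directlyIncreases", "decreases", "directlyDecreases",
        "rateLimitingStepOf", "causesNoChange",
    },
    "correlative": {"negativeCorrelation", "positiveCorrelation", "association"},
    "biomarker": {"biomarkerFor", "prognosticBiomarkerFor"},
    "group assignment": {"hasMember", "hasComponent", "hasMembers", "hasComponents"},
    "other": {"isA", "subProcessOf"},
    "genomic": {"transcribedTo", "translatedTo", "orthologousTo"},
}


def describe_relationship(token):
    """Return (relationship name, class) for a relationship name or symbol."""
    name = RELATIONSHIP_SYMBOLS.get(token, token)
    for relationship_class, names in RELATIONSHIP_CLASSES.items():
        if name in names:
            return name, relationship_class
    raise ValueError(f"unknown relationship: {token!r}")


print(describe_relationship("=|"))
```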
17. Knowledge Capture – Example 1
• From published paper describing effects of Tnf in rat
chondrocytes
18. Knowledge Capture – Example 1
SET Citation = {"PubMed","Arthritis Res Ther.","19144181"}
SET Species = "10116"
SET Cell = "Chondrocytes"
SET Evidence = "we identified the relative changes in
transcript levels of the extracellular matrix components
Agc1, Hapln1, and Col2a1, proteases Mmp-9 and Mmp-12, as
well as the inflammatory cytokine macrophage Csf-1 (Figure
3). TNFα decreased Agc1 and Hapln1 (Figure 3a, b) and
increased Mmp-9 and Mmp-12 (Figure 3e, f)"
p(RGD:Tnf) -> r(RGD:Mmp9)
p(RGD:Tnf) -> r(RGD:Mmp12)
p(RGD:Tnf) -| r(RGD:Acan) // Agc1 = Acan
p(RGD:Tnf) -| r(RGD:Hapln1)
• Reference: the SET Citation line
• Experimental context = rat chondrocytes (SET Species, SET Cell)
• Text from paper supporting statements: the SET Evidence line
• Perturbation (source term) = Tnf protein
• Measurements (target terms) = RNA abundance
• In-line comment: // Agc1 = Acan
19. Knowledge Capture – Example 2
SET Citation = {"PubMed", "Anticancer Agents Med Chem. 2010 Oct 1;10(8):617-24.","21182469"}
SET Evidence = "One non-synonymous SNP 538G>A
(Gly180Arg) has been found to greatly
affect the function and stability of de novo
synthesized ABCC11 (Arg180) variant
protein. The SNP variant lacking N-linked
glycosylation is recognized as a misfolded
protein in the endoplasmic reticulum (ER) and readily
undergoes proteasomal
degradation. "
p(HGNC:ABCC11, sub(G,180,R)) =| p(HGNC:ABCC11, pmod(G,N))
p(HGNC:ABCC11, pmod(G,N)) =| deg(p(HGNC:ABCC11))
• First statement: Gly180Arg variant ABCC11 protein lacks glycosylation
• Second statement: ABCC11 glycosylation blocks degradation
• Protein variants and post-translational modifications
20. Knowledge Capture – Example 3
• Microarray data – can use probe set ID as identifier
SET Citation = {"PubMed","J Exp Med. 2006 Nov 27;203(12):2763-77.","17116732"}
SET Evidence = "Table S1. Affymetrix U133 Plus 2.0
GeneChip array data showing transcripts in HDLECs
up- or down-regulated by a factor of at least
twofold (P < 0.1) after stimulation with TNF-α."
SET Tissue = "Endothelium, Lymphatic"
p(HGNC:TNF) -> r(HGU133P2:205476_at)
p(HGNC:TNF) -> r(HGU133P2:215101_s_at)
p(HGNC:TNF) -> r(HGU133P2:214974_x_at)
p(HGNC:TNF) -> r(HGU133P2:203868_s_at)
p(HGNC:TNF) -| r(HGU133P2:235683_at)
p(HGNC:TNF) -| r(HGU133P2:235150_at)
p(HGNC:TNF) -| r(HGU133P2:205258_at)
21. Knowledge Capture – Example 4
• Protein modifications and activities
SET Citation = {"PubMed","Proc Natl Acad Sci U S A 2000 Oct 24 97(22) 11960-5","11035810"}
SET Evidence = "GSK-3 activity is inhibited through
phosphorylation of serine 21 in GSK-3 alpha and serine 9 in
GSK-3 beta."
SET Species = "9606"
p(HGNC:GSK3A,pmod(P,S,21)) =| kin(p(HGNC:GSK3A))
p(HGNC:GSK3B,pmod(P,S,9)) =| kin(p(HGNC:GSK3B))
SET Evidence = "These serine residues of GSK-3 have been
previously identified as targets of protein kinase B
(PKB/Akt)"
kin(p(PFH:"AKT Family")) => p(HGNC:GSK3A,pmod(P,S,21))
kin(p(PFH:"AKT Family")) => p(HGNC:GSK3B,pmod(P,S,9))
• New Evidence line: Citation and Species still apply to the statements that follow
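The carry-over of annotations in Example 4 can be modeled as a dictionary that each SET line updates. This is a minimal Python sketch, not the BEL Framework's own parser; it assumes each SET line and each statement fits on a single line, and read_statements is a hypothetical name.

```python
def read_statements(lines):
    """Pair each BEL statement with the annotations in effect when it appears."""
    annotations = {}
    statements = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("SET "):
            # A later SET of the same annotation replaces only that annotation
            name, _, value = line[4:].partition("=")
            annotations[name.strip()] = value.strip().strip('"')
        else:
            # Copy the current annotations so later SET lines do not alter them
            statements.append((line, dict(annotations)))
    return statements


script = [
    'SET Citation = {"PubMed","Proc Natl Acad Sci U S A 2000 Oct 24 97(22) 11960-5","11035810"}',
    'SET Evidence = "GSK-3 activity is inhibited through phosphorylation of serine 21 in GSK-3 alpha and serine 9 in GSK-3 beta."',
    'SET Species = "9606"',
    'p(HGNC:GSK3A,pmod(P,S,21)) =| kin(p(HGNC:GSK3A))',
    'SET Evidence = "These serine residues of GSK-3 have been previously identified as targets of protein kinase B (PKB/Akt)"',
    'kin(p(PFH:"AKT Family")) => p(HGNC:GSK3A,pmod(P,S,21))',
]

for statement, context in read_statements(script):
    print(statement, "|", context["Species"], "|", context["Evidence"][:30])
```

Both statements share the same Citation and Species; only the Evidence differs between them.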
Editor's Notes
Statement Annotations
Each BEL Statement can optionally be annotated to express knowledge about the statement itself. In this example, two reserved annotation types, citation and evidence, are used to record (1) the knowledge source cited to support the relationship expressed by the statement and (2) the text within the cited knowledge source that supports the relationship. Two additional predefined annotation types, tissue and disease, are used to record information about the biological system in which the statement holds: that the reduction in tissue damage occurs in vascular tissue (whether it occurs elsewhere is not specified) and that this effect happens in the context of the disease “Stroke” (whether it would occur in other situations of vascular tissue damage is not specified).

More detail:
BEL allows annotations to be defined to meet the needs of the knowledge designer. The user can define and use their own annotations, or use one or more provided by the BEL Framework. Annotation Types are defined within a BEL Document and each Annotation Type has the following characteristics:
- A unique name within a BEL Document
- A pre-specified domain of allowable values
- (Optional) usage information and a description

Examples of Annotation Type names might be Species, ExperimentType, Dosage, and ExposureTime. Each Annotation Type must have a domain of allowable values associated with it. BEL supports three (3) ways in which domain values can be specified:
- An externally specified enumerated list, such as the set of NCBI Taxonomy IDs
- An internally specified enumerated list
- A regular expression

Internally specified lists can be defined within a BEL Document. These lists enumerate the set of allowable domain values for statements using the Annotation Type within the BEL Document. For example, an annotation type named ‘dosage’ might have the domain values {“LOW”, “MEDIUM”, “HIGH”} specified as a list.

Annotation Types defined using a regular expression domain allow the knowledge designer to specify which strings are allowed for statements using the Annotation Type within the BEL Document. For example, a regular expression such as [-+]?[0-9]*\.?[0-9]+ can be used to constrain the annotation type to only allow floating-point numbers.

Statement Annotations can be used to specify information about:
- the biological system in which the facts represented by the statement hold or were demonstrated,
- the experimental methods used to demonstrate the facts, and
- the knowledge source on which the statement is based, such as the citation and the specific text supporting the statement.

Examples of annotations that could be associated with a BEL Statement are the:
- PubMed id specifying the publication in which the findings were reported,
- species, tissue, and cellular location in which the observations were made, and
- dosage, exposure and recovery time for an experimental result.

Reserved Annotation Types
The following commonly used Annotation Types are reserved by BEL. Users may not define statement annotation types with these names. These annotation types have been selected to promote interoperability of knowledge by the use of a common contextual vocabulary.
- Citation: enables BEL Statements to be annotated with the knowledge source cited to support the relationship expressed by the statement.
- Evidence: enables BEL Statements to be annotated with the exact evidence line from a citation that supports the relationship expressed by the statement.

Other Available Annotation Types
The BEL Framework currently provides nineteen (19) additional, predefined annotation types that can be used. These annotations cover species, cell lines, tissues, diseases, cellular locations, and other commonly used standardized annotation types.
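The list and regular-expression domains described in these notes can be illustrated with a small validator. This is a minimal Python sketch, not how the BEL Framework itself checks annotations. The 'dosage' values and the regular expression come from the notes, while the annotation name 'exposureHours' and all function names are hypothetical.

```python
import re


def make_list_domain(values):
    """Build a check for an enumerated list of allowed values."""
    allowed = frozenset(values)
    return lambda value: value in allowed


def make_pattern_domain(pattern):
    """Build a check that requires the whole value to match a regular expression."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(value) is not None


# Hypothetical annotation types, each with a domain of allowable values
ANNOTATION_DOMAINS = {
    "dosage": make_list_domain(["LOW", "MEDIUM", "HIGH"]),
    "exposureHours": make_pattern_domain(r"[-+]?[0-9]*\.?[0-9]+"),
}


def is_valid_annotation(name, value):
    """Return True if value lies in the domain defined for the annotation type."""
    try:
        check = ANNOTATION_DOMAINS[name]
    except KeyError:
        raise KeyError(f"undefined annotation type: {name!r}") from None
    return check(value)


print(is_valid_annotation("dosage", "HIGH"))
print(is_valid_annotation("exposureHours", "2.5"))
```

The sketch uses fullmatch so that a value such as "2.5 hours" is rejected rather than accepted on a partial match.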