The Characteristics of DNA Splicing Languages via Yusof-Goode Approach

THE CHARACTERISTICS OF DNA SPLICING
LANGUAGES VIA YUSOF-GOODE APPROACH
MUHAMMAD AZRIN BIN AHMAD
FIRST ASSESSMENT
Doctor of Philosophy (Mathematics)- Fast Track
Supervisors
¹ASSOC PROF DR NOR HANIZA SARMIN (MAIN), ²DR FONG WAN HENG (CO)
¹Department of Mathematical Sciences, Faculty of Science,
²Ibnu Sina Institute for Fundamental Science Studies
Universiti Teknologi Malaysia, 81310 UTM Johor Bahru, Johor.
³DR YUHANI YUSOF (CO)
³Faculty of Industrial Science & Technology
Universiti Malaysia Pahang, 26300 UMP Gambang, Pahang.

PRESENTATION OUTLINE
INTRODUCTION
 Background of the Research
 Problem Statement
 Objectives of the Research
 Scope of the Research
 Significance of the Research
LITERATURE REVIEW
 DNA and Its Structure
 Restriction Enzyme
 Mathematical Model
 The Development of Splicing System and Languages
RESEARCH METHODOLOGY
 Research Design and Procedure
 Operational Framework
 Gantt Chart and Schedule
STATUS OF RESEARCH
 What Had Been Done?
 What Needs To Be Done?

Background of the Research
 Deoxyribonucleic acid (DNA) has two important functions: protein synthesis and self-replication.
 The splicing system introduced by Head [3] describes the recombinant behaviours of DNA within the framework of Formal Language Theory.
 The four nucleotides, Adenine (A), Guanine (G), Cytosine (C) and Thymine (T), can be paired as [AT], [GC], [CG], [TA], or simply represented as a, g, c and t [5].
 The splicing operation consists of cutting by a restriction enzyme and pasting in the presence of an appropriate ligase.
[3] Head, T. Formal Language Theory and DNA: An Analysis of the Generative Capacity of Specific Recombinant Behaviors. Bulletin of Mathematical Biology. 1987. 49: 737-759.
[5] Gheorghe, P., Rozenberg, G., Salomaa, A. DNA Computing New Computing Paradigms. New York, London: Springer. 1998.
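The shorthand a, g, c and t for the four base pairs can be illustrated with a small Python sketch. The function names and the "[A/T]" rendering (upper base over lower base) are assumptions made here for illustration and do not come from [3] or [5].

```python
# Each lowercase symbol abbreviates one base pair: upper base over its complement.
BASE_PAIR = {"a": "[A/T]", "g": "[G/C]", "c": "[C/G]", "t": "[T/A]"}
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def to_base_pairs(word: str) -> str:
    """Expand a word over {a, c, g, t} into its sequence of base pairs."""
    return "".join(BASE_PAIR[symbol] for symbol in word)


def lower_strand(word: str) -> str:
    """Return the lower strand paired with `word`, read left to right."""
    return "".join(COMPLEMENT[symbol] for symbol in word)


print(to_base_pairs("ccgc"))  # [C/G][C/G][G/C][C/G]
print(lower_strand("ccgc"))   # ggcg
```

Writing only the upper strand loses no information, because the lower strand is fully determined by the pairing.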

Background of the Research
 Various studies of splicing systems have led to the formulation of the Yusof-Goode (Y-G) splicing system.
 The language resulting from a splicing system (called a splicing language) can be categorized into two types [6]: adult/inert and limit language.
 The extension of the limit language, the n-th order limit language, has been defined in [6].
 This research is narrowed to the second order limit language, whose existence and characteristics will be studied further.
[6] Goode, E. and Pixton, D. Splicing to the Limit. Lecture Notes in Computer Science. 2004. 2950: 189-201.

Statement of Problem
1. How can the existence of second order limit language in a Y-G splicing system be determined, and what characteristics does it possess?
2. What are the sufficient conditions for the second order limit language to exist in a Y-G splicing system and other variants of splicing system?
3. What are the mechanisms that relate second order limit language with different classes and variants of splicing system?
4. How can a wet-lab experiment be conducted and a mathematical model developed to validate the existence of second order limit language? Which method can be used to compare those results from the mathematical and biological points of view?

Objectives of the Research
1. To determine the existence of second order limit language and study its characteristics.
2. To provide the sufficient conditions on the existence of second order limit language in splicing system.
3. To relate the existence of second order limit languages among variants of splicing system.
4. To develop and verify a mathematical model that can validate the existence of second order limit languages.

Scope of the Research
This research focuses only on the second order limit language with at most two initial strings and at most two rules. The splicing systems used include the Y-G splicing system, which is restricted to the Y-G rule, and also other classes of splicing systems.

Significance of the Research

DNA and Its Structure
[5] Gheorghe, P., Rozenberg, G., Salomaa, A. DNA Computing New Computing Paradigms. New York, London: Springer. 1998.

DNA and Its Structure (cont.)
[1] Tamarin, R. H. Principles of Genetics. 7th ed. USA: The McGraw-Hill Companies. 2001.

Restriction Enzyme
A restriction enzyme is found in bacteria. Its role is to cut DNA molecules at their crossing sites. The recognition process that determines the cutting site is carried out by the restriction endonuclease [5]. The restriction enzyme then clamps at the crossing site and the cutting process takes place.
[5] Gheorghe, P., Rozenberg, G., Salomaa, A. DNA Computing New Computing Paradigms. New York, London: Springer. 1998.

Mathematical Model
The four bases of DNA molecules, known as a, g, c and t, are represented by the initial set of alphabet. The initial molecules are represented by the initial strings, and the restriction enzymes are represented by the rules in splicing. Mathematically, it can be seen as follows:
S = (A, I, R), where A is an alphabet made up of the four bases a, c, g and t, I represents the initial strings of dsDNA, and R represents the rules, each of either left pattern (u; x, v : y; x, z), right pattern (u, x; v : y, x; z), or both patterns (u, x, v : y, x, z).
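The triple S = (A, I, R) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `Rule` and `SplicingSystem` are hypothetical, the left/right pattern distinction (the position of the semicolon) is not modelled, and the empty Python string stands for the identity element 1. The sample rule reuses the pattern (c; cg, c) for both halves, and the initial string is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A splicing rule (u, x, v : y, x, z); both sites share the crossing x."""
    u: str
    x: str
    v: str
    y: str
    z: str

    def sites(self) -> tuple:
        """Return the two recognition sites uxv and yxz."""
        return (self.u + self.x + self.v, self.y + self.x + self.z)


@dataclass(frozen=True)
class SplicingSystem:
    """The triple S = (A, I, R)."""
    alphabet: frozenset
    initial: frozenset
    rules: tuple

    def __post_init__(self):
        # Every initial string and every rule component must be a word over A.
        words = list(self.initial)
        for r in self.rules:
            words += [r.u, r.x, r.v, r.y, r.z]
        for w in words:
            if not set(w) <= self.alphabet:
                raise ValueError(f"{w!r} is not a word over the alphabet")


# Hypothetical instance: one rule, one made-up initial string.
acii = Rule(u="c", x="cg", v="c", y="c", z="c")
system = SplicingSystem(frozenset("acgt"), frozenset({"aaccgctt"}), (acii,))
print(acii.sites())  # ('ccgc', 'ccgc')
```

Freezing both classes keeps a system hashable and prevents a rule from being altered after validation.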

The Development of Splicing System and
Languages
 Head Splicing System (1987)
 Paun Splicing System (1996)
 Pixton Splicing System (1996)
 Goode-Pixton Splicing System (1999)
 Y-G Splicing System (2011)

The Development of Splicing System and Languages (cont.)
Variants of Splicing System
[3] Head, T. Formal Language Theory and DNA: An Analysis of the Generative Capacity of Specific Recombinant Behaviors. Bulletin of Mathematical Biology. 1987. 49: 737-759.
[9] Paun, Gh. On the Splicing Operation. Discrete Applied Mathematics. 1996. 70: 57-79.
[10] Pixton, D. Regularity of Splicing Languages. Discrete Applied Mathematics. 1996. 69: 101-124.
[12] Paun, G., Rozenberg, G., Salomaa, A. Computing by Splicing. Theoretical Computer Science. 1996. 168: 321-336.
[13] Bonizzoni, P., Ferretti, C., Mauri, G. and Zizza, R. Separating Some Splicing Models. Information Processing Letters. 2001. 79: 255-259.
[14] Yusof, Y., Sarmin, N. H., Goode, T. E., Mahmud, M. and Fong, W. H. An Extension of DNA Splicing System. Sixth International Conference on Bio-Inspired Computing: Theories and Application. September 27-29, 2011. Penang. 2011. 246-248.

The Development of Splicing System and Languages (cont.)
Classes of Splicing System
[8] Yusof, Y. Bio Molecular Inspiration in DNA Splicing System. Ph.D. Thesis. Universiti Teknologi Malaysia (UTM); 2011.
[11] Laun, T. E. G. Constants and Splicing System. Ph.D. Thesis. State University of New York at Binghamton; 1999.
[15] Mateescu, A., Paun, Gh., Rozenberg, G. and Salomaa, A. Simple Splicing System. Discrete Applied Mathematics. 1998. 84: 145-163.
[16] Goode, E. and Pixton, D. Semi-simple Splicing Systems. In: Martin-Vide, C. and Mitrana, V. eds. Where Mathematics, Computer Science, Linguistics and Biology Meet. Dordrecht: Kluwer Academic Publishers. 343-352; 2001.

The Development of Splicing System and Languages (cont.)
Biological Approach of Splicing System
[7] Sarmin, N. H. and Fong, W. H. Mathematical Modelling of Splicing System. First International Conference on Natural Resources Engineering and Technology. July 24-25, 2006. Putrajaya. 2006: 524-527.
[17] Laun, E. and Reddy, K. J. Wet Splicing Systems. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. 1999. 48: 73-83.
[18] Kari, L. DNA Computing: The Arrival of Biological Mathematics. The Mathematical Intelligencer. 1997. 19(2): 9-22.

The Development of Splicing System and Languages (cont.)
The Splicing Language
[6] Goode, E. and Pixton, D. Splicing to the Limit. Lecture Notes in Computer Science. 2004. 2950: 189-201.
[19] Lim, D. S. F. Splicing Systems and Languages. Master Dissertation. Universiti Teknologi Malaysia (UTM); 2006.
[20] Goode, E. and DeLorbe, W. DNA Splicing System: An Ordinary Differential Equations Model and Simulation. Lecture Notes in Computer Science. 2008. 4848: 236-245.

Basic Definitions
Definition 1 [3]: Head Splicing System
A splicing system S = (A, I, B, C) consists of a finite alphabet A, a finite set I of
initial strings in A*, and finite sets B and C of triples (c, x, d) with c, x and d in A*.
Each such triple in B or C is called a pattern. For each such triple the string cxd is
called a site and the string x is called a crossing. Patterns in B are called left
patterns and patterns in C are called right patterns. The language L = L(S)
generated by S consists of the strings in I and all strings that can be obtained by
adjoining the words ucxfq and pexdv to L whenever ucxdv and pexfq are in L and
(c, x, d) and (e, x, f) are patterns of the same hand. A language L is a splicing
language if there exists a splicing system S for which L = L(S).
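The adjoining step in Definition 1 can be sketched in Python as follows. This is a minimal illustration, not Head's own formulation in code: the function names are hypothetical, the words and patterns in the usage lines are made up, and the caller is assumed to pass two patterns of the same hand (both from B or both from C).

```python
def occurrences(word, site):
    """Yield every start index at which `site` occurs in `word`."""
    start = word.find(site)
    while start != -1:
        yield start
        start = word.find(site, start + 1)


def head_splice(w1, w2, p1, p2):
    """Splice w1 = u c x d v with w2 = p e x f q using patterns p1 = (c, x, d), p2 = (e, x, f).

    Returns the words u c x f q and p e x d v for every pair of site occurrences.
    """
    c, x, d = p1
    e, x2, f = p2
    if x != x2:
        return set()  # the two patterns must share the same crossing
    out = set()
    for i in occurrences(w1, c + x + d):
        for j in occurrences(w2, e + x + f):
            cut1 = i + len(c) + len(x)  # cut w1 right after its crossing
            cut2 = j + len(e) + len(x)  # cut w2 right after its crossing
            out.add(w1[:cut1] + w2[cut2:])  # u c x f q
            out.add(w2[:cut2] + w1[cut1:])  # p e x d v
    return out


# Made-up words and patterns sharing the crossing "at".
print(sorted(head_splice("aagatccc", "cctatagg", ("g", "at", "c"), ("t", "at", "a"))))
# ['aagatagg', 'cctatccc']
```

The language L(S) is then the closure of I under this step, which the sketch does not attempt to compute.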

Basic Definitions (cont.)
Definition 2 [8]: Y-G Splicing System
If r ∈ R, where r = (u, x, v : y, x, z), and s1 = αuxvβ and s2 = γyxzδ are elements of I, then splicing s1 and s2 using r produces the initial string I together with αuxzδ and γyxvβ, presented in either order, where α, β, γ, δ, u, x, v, y and z are elements of A*, the free monoid generated by A with the concatenation operation and 1 as the identity element.
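One stage of Y-G splicing over a whole set of initial strings can be sketched as below. The sketch assumes the rule form (u, x, v : y, x, z) given under Mathematical Model and assumes that splicing αuxvβ with γyxzδ yields αuxzδ and γyxvβ. Rules are passed as 5-tuples (u, x, v, y, z); the function names and the sample strings are hypothetical.

```python
from itertools import product


def splice_pair(s1, s2, rule):
    """Return alpha-u-x-z-delta and gamma-y-x-v-beta for every pair of site occurrences."""
    u, x, v, y, z = rule
    left, right = u + x + v, y + x + z
    out = set()
    i = s1.find(left)
    while i != -1:
        cut1 = i + len(u) + len(x)  # cut s1 right after the crossing
        j = s2.find(right)
        while j != -1:
            cut2 = j + len(y) + len(x)  # cut s2 right after the crossing
            out.add(s1[:cut1] + s2[cut2:])
            out.add(s2[:cut2] + s1[cut1:])
            j = s2.find(right, j + 1)
        i = s1.find(left, i + 1)
    return out


def yg_step(initial, rules):
    """Return the initial strings together with all strings spliced from them."""
    result = set(initial)
    for (s1, s2), rule in product(product(initial, repeat=2), rules):
        result |= splice_pair(s1, s2, rule)
    return result


# Made-up strings, each containing one site ccgc of the rule (c, cg, c : c, cg, c).
print(sorted(yg_step({"aaccgctt", "ggccgcaa"}, [("c", "cg", "c", "c", "c")])))
# ['aaccgcaa', 'aaccgctt', 'ggccgcaa', 'ggccgctt']
```

Keeping the initial strings in the result mirrors the wording "produces the initial string I together with" the spliced strings.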

Basic Definitions (cont.)
Definition 3 [6]: Transient Language
A splicing language is called a transient splicing language if a set of strings is eventually used up and disappears in a given system.
Definition 4 [6]: n-th Order Limit Language
Let Ln-1 be the set of (n-1)-th order limit words of L. The set Ln of n-th order limit words of L is defined to be the set of first order limits of Ln-1. We obtain Ln from Ln-1 by deleting the words that are transient in Ln-1.
Definition 5 [3]: Palindromic Rule
A string I of dsDNA is said to be palindromic if the sequence read from the left side of the upper single strand is equal to the sequence read from the right side of the lower single strand.
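The palindromic condition can be checked mechanically: take the lower strand by complementing each base, then read it from the right. A minimal Python sketch follows; the function name is hypothetical and the input is assumed to be the upper strand written over a, c, g and t.

```python
# Translation table sending each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("acgt", "tgca")


def is_palindromic(upper: str) -> bool:
    """True if the upper strand read left to right equals the lower strand read right to left."""
    lower = upper.translate(COMPLEMENT)
    return upper == lower[::-1]


print(is_palindromic("gaattc"))  # True
print(is_palindromic("cg"))      # True
print(is_palindromic("ccgc"))    # False
```

Under this check the crossing cg is palindromic, while the four-letter site ccgc is not.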

Research Design and Procedure
1. Literature review on Formal Language Theory, DNA structures and its related
information, splicing system and splicing languages.
 To examine the basic concepts of Formal Language Theory that will be used in
splicing system.
 To explore the structure of DNA and the processes that take place within it, which motivate the idea of the splicing system.
 To study the splicing system and its mechanism. In this research, the formation of splicing languages is studied carefully because most of the results come from it. In addition, the types of splicing language will be explored, as different splicing systems generate distinct types of splicing language.
2. Determine the existence of second order limit language and its characteristics
 To find the existence of second order limit language in the presence of at most two initial strings and two rules.
 To explore the characteristics of second order limit language and present those characteristics as theorems.

Research Design and Procedure (cont.)
3. Investigate sufficient conditions of second order limit language
 To define the methods of recognizing second order limit language.
 To search for conditions, based on the rules of the splicing system, under which the second order limit language exists.
 To present all those conditions as theorems and provide the proofs.
4. Find the relation of second order limit language among other types of splicing language
 To present the relation of second order limit language with other types of splicing system, such as self-closed splicing system, semi-null splicing system and others.

Research Design and Procedure (cont.)
5. Construct a mathematical model to validate the existence of second order limit language
 To develop a mathematical model of splicing system to validate the existence of second order limit language using restriction enzymes from the New England Biolabs catalogue.
 To provide a mathematical analysis of the limit graph so that the results can later be compared with the wet-lab experiment results.
6. Conduct a wet-lab experiment
 To study and construct the procedures for conducting a wet-lab experiment.
 To carry out the wet-lab experiment.

Gantt chart and Schedule (cont.)

STATUS OF RESEARCH
-What Had Been Done?

Second Order Limit Language
Let L1 be the set of first order limit words of L. The set L2 of second order limit words of L is defined to be the set of first order limits of L1. We obtain L2 from L1 by deleting the words that are transient in L1.
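In symbols, writing T(X) for the set of words that are transient in X (a notation assumed here for illustration, not taken from [6]), the construction of the second order limit language and its n-th order generalisation reads:

```latex
L_2 = L_1 \setminus T(L_1),
\qquad
L_n = L_{n-1} \setminus T(L_{n-1}) \quad (n \ge 2).
```

The second order limit language is therefore of interest only when T(L_1) is non-empty, that is, when some first order limit words are themselves used up.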

Mechanism of Recognizing the Second Order Limit Language
Conjecture 1
If the combination of two splicing languages of first stage splicing under the stated rule has a different length from those two splicing languages, then a second order limit language exists.
Conjecture 2
If the splicing language obtained by splicing the languages of first stage splicing is different from the languages of first stage splicing, then it is a second order limit language.

Biological Examples of Second Order and Non-Second
Order Limit Language
Example 1
Let S = (A, I, R) be a Y-G splicing system consisting of two restriction enzymes, namely FauI and AciI, [...] where for this case we choose r = (r1 : r2) with r1 = (cccgcttaa; cg, 1) and r2 = (c; cg, c) respectively, and initial strings I = {αcccgcttaacgβ} where α, β ∈ A*. When splicing occurs, the following splicing languages are generated:

Biological Examples of Second Order and Non-Second Order Limit Language (cont.)
Based on the rules stated above, when the resulting splicing languages are spliced, new splicing languages are obtained. They are listed below:

Biological Examples of Second Order and Non-Second Order Limit Language (cont.)
Example 2
Suppose S = (A, I, R) is a Y-G splicing system consisting of only one restriction enzyme, namely AciI, with [...] an element of R [...] and initial string [...]. The following is the resulting splicing language:
After that, splicing among the languages of the first stage splicing results in the same molecules as before, hence creates no new molecules at all. Thus, no second order limit language is produced.
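The check described in this example, splicing the first-stage results again and seeing whether anything new appears, can be sketched as a staged iteration. This is only a rough illustration under stated assumptions: it treats strings purely formally (no reaction dynamics, so it detects new words rather than transient ones), the function names are hypothetical, and the two input strings are made up because the initial string of Example 2 is not recoverable here.

```python
def cuts(word, site, offset):
    """Yield (prefix, suffix) pairs from cutting `word` `offset` symbols into each occurrence of `site`."""
    i = word.find(site)
    while i != -1:
        yield word[: i + offset], word[i + offset:]
        i = word.find(site, i + 1)


def stage(words, rule):
    """One splicing stage with rule (u, x, v, y, z): the input words plus all recombinations."""
    u, x, v, y, z = rule
    left = [c for w in words for c in cuts(w, u + x + v, len(u + x))]
    right = [c for w in words for c in cuts(w, y + x + z, len(y + x))]
    out = set(words)
    for p1, t1 in left:
        for p2, t2 in right:
            out.add(p1 + t2)
            out.add(p2 + t1)
    return out


def new_words_per_stage(initial, rule, max_stages=5):
    """List the words that first appear at each stage, stopping once a stage adds nothing."""
    seen, history = set(initial), []
    for _ in range(max_stages):
        nxt = stage(seen, rule)
        history.append(nxt - seen)
        if nxt == seen:
            break
        seen = nxt
    return history


history = new_words_per_stage({"aaccgctt", "ggccgcaa"}, ("c", "cg", "c", "c", "c"))
print([sorted(h) for h in history])  # [['aaccgcaa', 'ggccgctt'], []]
```

Here the second stage adds no new words, which corresponds to the situation the example describes. The bound `max_stages` is a safeguard, since a splicing language can be infinite.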

Relation of Second Order Limit Language Among the Variants of Splicing System
Conjecture 3
If the splicing system is a null-context splicing system with a rule and an initial string that contains the crossing site of the restriction enzyme twice, then the second order limit language exists.
Conjecture 4
If a splicing system is a self-closed splicing system, then the second order limit language does not exist.

Characterization of the Second Order Limit
Language
Theorem 1
If the rule of a splicing system is itself palindromic, then
there will be no second order limit language.

Characterization of the Second Order Limit Language (cont.)
Proof
Suppose S = (A, I, R) is a Y-G splicing system and the rule r of the selected restriction enzyme is palindromic. Let us consider a case where there is a given rule [...] for [...], where a and b are complementary to each other. So, the splicing occurs as below,
where [...]
The splicing process among the resulting splicing languages does not produce a distinct language, as splicing those languages again produces the same language as before.

Characterization of the Second Order Limit Language (cont.)
Now, let there be two rules. So, let [...] for [...], where a and b are complementary to each other. Therefore, the splicing occurs as below,
Again, splicing among the resulting languages does not produce a distinct splicing language, hence no second order limit language is detected. Assume that for k rules, no second order limit language exists. By the hypothesis, no second order limit language exists in the (k+1)-th iteration of splicing. □

Characterization of the Second Order Limit Language (cont.)
Theorem 2
An initial string that contains two recognition sites of two rules with identical crossing sites produces a second order limit language.

Characterization of the Second Order Limit Language (cont.)
Proof
We prove by contradiction. Suppose no second order limit language exists. Assume the Y-G splicing system S = (A, I, R) and the rule [...] have two crossing sites of two different rules in the form of [...], where a is complementary to b, and k is complementary to k' and vice versa. The splicing occurs and produces one of the following:

Characterization of the Second Order Limit Language (cont.)
By splicing those two resulting splicing languages using the rules presented above, a new splicing language is produced as given below:
The new splicing language [...] is a distinct splicing language (for the resulting splicing language from the first splicing, refer to Example 4.1). This contradicts the assumption above. Hence the original statement is true. □
From the above theorem, we have the following immediate result.

Characterization of the Second Order Limit Language (cont.)
Corollary 1
If only one initial string and one rule are involved, then the second order limit language does not exist.

STATUS OF RESEARCH
-What Needs To Be Done?

What Needs To Be Done?
 The characteristics of second order limit language will be further studied from the properties of the rule, such as the effect of the right and left contexts and others.
 The sufficient conditions for the existence of second order limit language will focus more on the choice of initial string from lambda phage and also on the properties of rules.

What Needs To Be Done? (cont.)
 More classes of splicing system will be considered to obtain more characterizations of second order limit language and its properties.
 The standard procedure for handling the wet-lab experiment will be revised based on the New England Biolabs manual and website, and also on the work of past researchers who carried out this laboratory experiment.
 The limit graph of second order limit language will be constructed in order to make comparisons and to analyse the results through the data obtained.

REFERENCES
1. Tamarin, R. H. Principles of Genetics. 7th ed. USA: The McGraw-Hill Companies. 2001.
2. Linz, P. An Introduction to Formal Languages and Automata. Fourth Edition. USA: Jones and Bartlett Publishers. 2006.
3. Head, T. Formal Language Theory and DNA: An Analysis of the Generative Capacity of Specific Recombinant Behaviors. Bulletin of Mathematical Biology. 1987. 49: 737-759.
4. Dwyer, C. and Lebeck, A. Introduction to DNA Self-Assembled Computer Design. Boston, London: Artech House, Inc. 2008.
5. Gheorghe, P., Rozenberg, G., Salomaa, A. DNA Computing New Computing Paradigms.
New York, London: Springer. 1998.
6. Goode, E. and Pixton, D. Splicing to the Limit. Lecture Notes in Computer Science.
2004. 2950: 189-201.

REFERENCES (cont.)
7. Sarmin, N. H. and Fong, W. H. Mathematical Modelling of Splicing System. First
International Conference on natural Resources Engineering and Technology. July 24-25,
2006. Putrajaya. 2006: 524-527.
8. Yusof, Y. Bio Molecular Inspiration in DNA Splicing System. Ph.D. Thesis. Universiti
Teknologi Malaysia (UTM); 2011.
9. Paun, Gh. On the Splicing Operation. Discrete Applied Mathematics. 1996. 70: 57-79.
10. Pixton, D. Regularity of Splicing Languages. Discrete Applied Mathematics. 1996. 69:
101-124.
11. Laun, T. E. G. Constants and Splicing System. PhD. Thesis. State University of New
York at Binghamton; 1999.
12. Paun, G., Rozenberg, G., Salomaa, A. Computing by Splicing. Theoretical Computer Science. 1996. 168: 321-336.

REFERENCES (cont.)
13. Bonizzoni, P., Ferretti, C., Mauri, G. and Zizza, R. Separating Some Splicing Models.
Information Processing Letters. 2001. 79: 255-259.
14. Yusof, Y., Sarmin, N. H., Goode, T. E., Mahmud, M. and Fong, W. H. An Extension of
DNA Splicing System. Sixth International Conference on Bio-Inspired Computing:
Theories and Application. September 27-29, 2011. Penang. 2011. 246-248.
15. Mateescu, A., Paun, Gh., Rozenberg, G. and Salomaa, A. Simple Splicing System.
Discrete Applied Mathematics. 1998. 84: 145-163.
16. Goode, E. and Pixton, D. Semi-simple Splicing Systems. In: Martin-Vide, C. and Mitrana, V. eds. Where Mathematics, Computer Science, Linguistics and Biology Meet. Dordrecht: Kluwer Academic Publishers. 343-352; 2001.
17. Laun, E. and Reddy, K. J. Wet Splicing Systems. DIMACS Series in Discrete
Mathematics and Theoretical Computer Science. 1999. 48: 73-83.

REFERENCES (cont.)
18. Kari, L. DNA Computing: The Arrival of Biological Mathematics. The Mathematical
Intelligencer. 1997. 19(2): 9-22.
19. Lim, D. S. F. Splicing Systems and Languages. Master. Dissertation. Universiti
Teknologi Malaysia (UTM); 2006.
20. Goode, E. and DeLorbe, W. DNA Splicing System: An Ordinary Differential Equations
Model and Simulation. Lecture Notes in Computer Science. 2008. 4848: 236-245.

Publications
1. Yuhani Yusof, Nor Haniza Sarmin, Fong Wan Heng, T. Elizabeth Goode and Muhammad Azrin Ahmad. “An Analysis of Four Variants of Splicing System”. Proceedings of the 20th National Symposium on Mathematical Sciences (SKSM20) AIP Conf. Proc. 1522, 888 – 895 (2013).
2. Muhammad Azrin Ahmad, Nor Haniza Sarmin, Fong Wan Heng, Yuhani Yusof. “On the Characteristics of Second Order Limit Language”. The Asia Mathematical Conference 2013 (AMC 2013). 1 – 4 July 2013. (Poster Session).
3. Muhammad Azrin Ahmad, Nor Haniza Sarmin, Yuhani Yusof, Fong Wan Heng. “Exploring the New Type of Splicing Language”. The 2013 International Conference on Mathematics and Its Application (ICMA 2013). 18 – 21 August 2013. (Submitted).

ACKNOWLEDGEMENT
 EXAMINERS
For their time and useful comments
 SUPERVISORS
Assoc Prof Dr Nor Haniza Sarmin,
Dr Fong Wan Heng,
Dr Yuhani Yusof
 SCHOLARSHIP
MyBrain15 MYPhD
