7. A Symptoms checker - Disease
B Metabolic engineering
C Cancer Neoantigens
Family name First name E-mail Project Study programme
Davey Lucas Lucas.Davey@UGent.be C CMBIOISB
David Sven Sven.David@UGent.be B IMCELB
Engelen Yanou Yanou.Engelen@UGent.be ? IMCELB
Ezquerro Marrodán Elsa Elsa.EzquerroMarrodan@UGent.be C IXGAEX
Georis Raphaël Raphael.Georis@UGent.be B IMCELB
Gilis Jeroen Jeroen.Gilis@UGent.be B CMBIOISB
Lashkari Samira Samira.Lashakri@UGent.be A CMBIOISB
Recer Karmen Karmen.Recer@UGent.be C IXGAEX
Schindfessel Cédric Cedric.Schindfessel@UGent.be B IMCELB
Silva Marta Marta.Silva@UGent.be B CMBIOISB
Silva Meneses Rodrigo Rodrigo.Meneses@UGent.be C CMBIOISB
Strybol Pieter-Paul PieterPaul.Strybol@UGent.be B/C CMBIOIBE
Taelman Steff Steff.Taelman@UGent.be A CMBIOIBE
Tóth Máté István MateIstvan.Toth@UGent.be C EXGAEX
Toulmé Coralyne Coralyne.Toulme@UGent.be C IXGAEX
Van hoyweghen Sergej Sergej.Vanhoyweghen@UGent.be ? IMCELB
Willems Thomas Thomas.Willems@UGent.be ? IMCELB
Wojciulewitsch Coralie Coralie.Wojciulewitsch@UGent.be A IMCELB
Yekimov Illya Illya.Yekimov@UGent.be A IMCELB
8. For the project I suggest taking the following
steps (individually or in a group; consider setting up a
collaborative tool such as Slack)
1. What is it about? What do you want to
achieve? For whom?
2. Identify information resources and think about
a basic data model
3. Draw (mock up) an interface. Don't be
constrained by technical considerations; think
outside the box :)
12. Install BioSQL locally
• Get the latest version of MySQL (MAMP,
MariaDB)
• Download biosqldb-mysql.sql
• Remove type=innodb
• Launch the database server
• Connect using Toad (port 8889)
• Run: create database biosql;
• Set it as the active database
• Use the worksheet to execute biosqldb-mysql.sql
13. # Connecting to a BioSQL database - http://biopython.org/wiki/BioSQL
import pprint
from Bio import Entrez
from Bio import SeqIO
from BioSQL import BioSeqDatabase
# Open the BioSQL server (port 8889, as in the installation steps)
server = BioSeqDatabase.open_database(driver="pymysql", host="localhost", port=8889, user="root", passwd="root", db="db")
# On the first run, create the sub-database instead: db = server.new_database("test")
db = server["test"]
Entrez.email = "A.N.Other@example.com"
handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text", id="6273291,6273290,6273289")
# Read the records into a list: the handle can be parsed only once,
# so parsing it again after loading would yield nothing
records = list(SeqIO.parse(handle, "genbank"))
handle.close()
for seq_record in records:
    print(seq_record.id, seq_record.description[:50] + "...")
    print("Sequence length %i," % len(seq_record))
    print("%i features," % len(seq_record.features))
    print("from: %s" % seq_record.annotations["source"])
    pprint.pprint(seq_record)
# Load all records into BioSQL and commit the transaction
print("Loading into BioSQL")
count = db.load(records)
print("Loaded %i records" % count)
server.adaptor.commit()
15. Example 3-tier model in biological database
http://www.bioinformatics.be
Example of different interface to the same back-end database (MySQL)
17. What IS Apache, Anyway? (slide prepared 3/21/2018)
Open-Source Web server originally based on
NCSA server
Available on over 160 varieties of Unix -- and
Windows NT
Over 56% of Internet Web servers run Apache or
an Apache derivative
Graph copyright Netcraft (<http://www.netcraft.com/survey/>)
18. Configuring Apache
Choosing functionality
– Apache functionality is available
through modules which are either built
into or loaded into the server
Server instructions
– Apache reads its run-time configuration
instructions from text files
– No GUI available
– 182 configuration directives in base
package
19. Configuring Apache (continued)
When started with the -d option, the server reads
httpd.conf from the given server root; -f allows
use of a different file name
After httpd.conf, the server reads
srm.conf, then access.conf (unless
the latter two have been renamed by the
ResourceConfig and AccessConfig
directives, respectively)
Consider combining these into a single
file for simplicity
20. Logfiles
Two basic logfiles
– Access log -- who’s been visiting your
server and what they wanted
– Error log -- problems the server has
encountered and things it has noticed
Can be configured for each virtual host,
or for entire server
Access log format can be customised
21. Simple Web form
<html>
<head><title>simple form</title></head>
<body>
<form name="simpleForm" method="post"
action="simpleHandler.cgi">
Your email address:
<input type="text" name="email">
<input type="submit" value="Submit">
</form>
</body>
</html>
22. Interacting with Web Forms
You typically need to generate the form (which
may be a normal static Web page), then dynamically
• validate user input
• process user input
• generate a response
These three steps may be done within the
Web browser (client-side), within the
Web server (server-side), or in some
combination of both.
23. CGI
Common Gateway Interface
mechanism for a Web browser to send data
to a Web server
allow browser to submit data to a program
running on the server
• program is often called a ‘CGI script’
• typically written in Perl, PHP or ASP
• can also be a ‘real’ program (e.g.
written in C)
24. CGI (2)
used primarily for form submission
can also be used to upload local files
‘CGI’ URLs often contain ‘?’ and ‘&’
characters - but don’t have to!
output from CGI usually dynamic and
therefore not cached
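To make the '?' and '&' convention concrete, here is a minimal sketch in Python (not one of the languages the slide lists) of what a CGI program does with a GET request. The web server passes the part of the URL after '?' in the QUERY_STRING environment variable, and the program writes a header block, a blank line, and then the body to standard output. The function name handle_request and the field name email are assumptions chosen for illustration.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_request(query_string):
    """Build a CGI response (headers, blank line, body) for a query string."""
    # parse_qs maps each field name to a list of decoded values.
    fields = parse_qs(query_string)
    email = fields.get("email", [""])[0]
    # Escape user input before echoing it back into HTML.
    body = "<html><body><p>You entered: %s</p></body></html>" % escape(email)
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # Under a real web server, QUERY_STRING is set by the server per request.
    print(handle_request(os.environ.get("QUERY_STRING", "")))
```

A request such as simpleHandler.cgi?email=a%40b.org&x=1 would reach the program with QUERY_STRING set to everything after the '?'.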
26. Brief History of PHP
PHP (PHP: Hypertext Preprocessor) was created by Rasmus Lerdorf in
1994. It was initially developed for HTTP usage logging and server-side
form generation in Unix.
PHP 2 (1995) transformed the language into a Server-side embedded
scripting language. Added database support, file uploads, variables,
arrays, recursive functions, conditionals, iteration, regular expressions,
etc.
PHP 3 (1998) added support for ODBC data sources, multiple platform
support, network and email protocols (SNMP, IMAP), and a new parser written by Zeev
Suraski and Andi Gutmans.
PHP 4 (2000) became an independent component of the web server for
added efficiency. The parser was renamed the Zend Engine. Many
security features were added.
PHP 5 (2004) added Zend Engine II with object-oriented programming,
robust XML support using the libxml2 library, a SOAP extension for
interoperability with Web Services, and a bundled SQLite.
27. Why is PHP used?
1. Easy to Use
Code is embedded into HTML. The PHP code is enclosed in special start and
end tags that allow you to jump into and out of "PHP mode".
<html>
<head>
<title>Example</title>
</head>
<body>
<?php
echo "Hi, I'm a PHP script!";
?>
</body>
</html>
28. Getting Started
1. How to escape from HTML and enter PHP mode
• PHP parses a file by looking for one of the special tags that
tells it to start interpreting the text as PHP code. The parser then
executes all of the code it finds until it runs into a PHP closing tag.
Starting tag              Ending tag   Notes
<?php                     ?>           Preferred method, as it allows the use of PHP with XHTML
<?                        ?>           Not recommended. Easier to type, but has to be enabled and may conflict with XML
<script language="php">   </script>    Always available; best used when FrontPage is the HTML editor
<%                        %>           Not recommended. ASP tags support was added in 3.0.4
Example: HTML <?php echo "Hello World"; ?> HTML (PHP code embedded between two stretches of HTML)
29. Getting Started
2. Simple HTML Page with PHP
• The following is a basic example to output text using
PHP.
<html><head>
<title>My First PHP Page</title>
</head>
<body>
<?php
echo "Hello World!";
?>
</body></html>
Copy the code onto your web server and save it as “test.php”.
When you open the page in a browser, you should see “Hello World!” displayed.
Notice that a semicolon ends each PHP statement. It marks the end of
the statement, not a line break in the output. Like HTML, PHP ignores whitespace
between lines of code. (To break a line in the displayed page, output the HTML tag <BR>.)
30. Getting Started
3. Using conditional statements
• Conditional statements are very useful for displaying specific
content to the user. The following example shows how to display
content according to the day of the week.
<?php
$today_dayofweek = date("w");
if ($today_dayofweek == 4) {
    echo "Today is Thursday!";
} else {
    echo "Today is not Thursday.";
}
?>
31. Getting Started
3. Using conditional statements
The if statement checks the value of $today_dayofweek
(which is the numerical day of the week, 0=Sunday… 6=Saturday)
• If it is equal to 4 (the numeric representation of Thurs.) it will display
everything within the first { } bracket after the “if()”.
• If it is not equal to 4, it will display everything in the second { } bracket
after the “else”.
32. Getting Started
3. Using conditional statements
If we run the script on a Thursday, we should see:
“Today is Thursday!”
On days other than Thursday, we will see:
“Today is not Thursday.”
33. example
• HTML file greet.html has
<form action="greet.php" method="get"><p>
your last name: <input type="text"
name="lastname"/></p></form>
• PHP file greet.php has
<?php
print "Hello ";
print $_GET['lastname'];
?>
in addition to the usual HTML stuff.
34. WHAT is PEAR
The PHP Extension and Application Repository, or PEAR, is a framework and
distribution system for PHP code components. Stig S. Bakken founded the PEAR
project in 1999 to promote the re-use of code that performs common functions.
The project has the goals of:
• providing a structured library of code
• maintaining a system for distributing code and for managing code
packages
• promoting a standard coding-style
35. PEAR DB
DB is a database abstraction layer providing:
* an OO-style query API
* portability features that make programs written for one DBMS work with other
DBMS's
* a DSN (data source name) format for specifying database servers
* prepare/execute (bind) emulation for databases that don't support it natively
* a result object for each query response
* portable error codes
* sequence emulation
* sequential and non-sequential row fetching as well as bulk fetching
* formats fetched rows as associative arrays, ordered arrays or objects
* row limit support
* transactions support
* table information interface
* DocBook and phpDocumentor API documentation
36. Connecting to Databases through PEAR DB
To access a database through PEAR DB, you have to create a data source
name (DSN) that specifies the appropriate PEAR DB backend for your
database and the parameters necessary to connect to the database.
DSN syntax:
phptype(dbsyntax)://username:password@protocol+hostspec/database?option=value
for example: (mysql)
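As a hypothetical illustration of the DSN syntax, the Python sketch below splits a made-up MySQL DSN into its parts with the standard URL parser. The user name, password, port, and the option=value pair are invented for the example, and the helper parse_dsn is not part of PEAR DB. It only handles the common form without the optional (dbsyntax) and protocol+ parts.

```python
from urllib.parse import urlsplit, parse_qs


def parse_dsn(dsn):
    """Split a simple DSN of the form phptype://user:pass@host:port/db?opt=val."""
    parts = urlsplit(dsn)
    return {
        "phptype": parts.scheme,        # which backend to use, e.g. mysql
        "username": parts.username,
        "password": parts.password,
        "hostspec": parts.hostname,
        "port": parts.port,             # None when the DSN gives no port
        "database": parts.path.lstrip("/"),
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


# Made-up credentials, for illustration only.
example = parse_dsn("mysql://someuser:secret@localhost:8889/biosql?option=value")
```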
37. Connecting to Databases through PEAR DB: how to connect and disconnect
38. Connecting to a database, creating a
table, and inserting a record.
39. Some Database Functions
Query function
$d->query takes an SQL command as its string
argument
Sends the query to the database server for execution
$d->setErrorHandling(PEAR_ERROR_DIE)
Terminates the program and prints the default error
messages if any subsequent errors occur
40. Retrieval Queries from Database Tables
$q
Variable that holds the query result
$q->fetchRow() retrieves the next record in the query
result and can control a loop
$allresult = $d->getAll(query)
Holds all the records of a query result in a single
variable called $allresult
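The two retrieval styles, row by row and all at once, exist in most database APIs. The sketch below shows the same pattern with Python's built-in sqlite3 module standing in for PEAR DB: fetchone() plays the role of fetchRow() and fetchall() the role of getAll(). The table gene and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the connection the slides call $d
conn.execute("CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT)")
conn.executemany("INSERT INTO gene (symbol) VALUES (?)", [("TP53",), ("BRCA1",)])
conn.commit()

# Row-by-row retrieval: fetchone() returns None once the result is exhausted,
# so its return value can control the loop.
cursor = conn.execute("SELECT id, symbol FROM gene ORDER BY id")
symbols = []
row = cursor.fetchone()
while row is not None:
    symbols.append(row[1])
    row = cursor.fetchone()

# Bulk retrieval: every record of the result in a single list.
all_rows = conn.execute("SELECT id, symbol FROM gene ORDER BY id").fetchall()
```

Row-by-row fetching keeps memory use flat for large results, while bulk fetching is convenient when the result is known to be small.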
41. Summary
PHP scripting language
Very popular for Web database programming
PHP basics for Web programming
Data types
Database commands include:
Creating tables, inserting new records, and
retrieving database records
Looping over a query result
42. Structured Query Language
The four basic data-manipulation statements of SQL are
sometimes referred to as CRUD:
– Create - INSERT - to store new data
– Read - SELECT - to retrieve data
– Update - UPDATE - to change or modify data
– Delete - DELETE - to delete or remove data
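The four CRUD statements can be tried out in a few lines. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the table student and its rows are invented for the example. Note that CREATE TABLE is a schema definition statement and is not the "Create" of CRUD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema definition (not part of CRUD): set up a table to work on.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, project TEXT)")

# Create: INSERT stores new data.
conn.execute("INSERT INTO student (name, project) VALUES (?, ?)", ("Ada", "A"))
conn.execute("INSERT INTO student (name, project) VALUES (?, ?)", ("Linus", "B"))

# Read: SELECT retrieves data.
after_insert = conn.execute("SELECT name, project FROM student ORDER BY id").fetchall()

# Update: UPDATE changes existing data.
conn.execute("UPDATE student SET project = ? WHERE name = ?", ("C", "Ada"))

# Delete: DELETE removes data.
conn.execute("DELETE FROM student WHERE name = ?", ("Linus",))

final_rows = conn.execute("SELECT name, project FROM student ORDER BY id").fetchall()
```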
43. Relationships within Relational Database
• Relationship classifications
– 1:1
– 1:M
– M:N
• E-R Model
– ERD Maps E-R model
– Entities - or tables. They are drawn as
rectangles containing the name of the
entity type and sometimes a list of its
attributes
– Relationships - these are the links between the
entity types. They are drawn as lines
between the connected entity types
44. ERD Symbols
• Rectangles represent entities
• Diamonds represent the relationship(s) between
the entities
• “1” side of relationship
– Number 1 in Chen Model
– Bar crossing line in Crow’s Feet Model
• “Many” relationships
– Letter “M” and “N” in Chen Model
– Three pronged “Crow’s foot” in Crow’s Feet Model
54. Applications of B+-Trees
(1) Search key of B+-Tree is primary key for data file, index
is dense (data file may or may not be sorted by pk):
– There is one key-pointer pair in a leaf for every
record of the data file
(2) Data file is sorted by its primary key:
– B+-Tree is a sparse index with one key-pointer
pair at a leaf for each block of the file
(3) Data file is sorted by an attribute that is not a key
(search key for B+-Tree):
– For each value K that appears in the data file,
there is one key-pointer pair at the leaf.
Pointer goes to the first of the records that
have K as their sort-key value
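The difference between the dense index of case (1) and the sparse index of case (2) can be sketched without building a full tree. In the Python sketch below, a sorted list and a dictionary stand in for the leaf level of a B+-Tree; the block size of three records and the keys are invented, and no interior tree nodes are modelled.

```python
from bisect import bisect_right

# Data file sorted by primary key, stored in blocks of up to three records.
blocks = [
    [(3, "c"), (7, "g"), (9, "i")],
    [(12, "l"), (15, "o"), (18, "r")],
    [(21, "u"), (25, "y")],
]

# Sparse index: one key-pointer pair per block (the first key of each block).
sparse_keys = [block[0][0] for block in blocks]

# Dense index: one key-pointer pair per record (block number, slot in block).
dense_index = {
    key: (b, i)
    for b, block in enumerate(blocks)
    for i, (key, _) in enumerate(block)
}


def sparse_lookup(key):
    """Find the block that could hold key, then scan inside that block."""
    b = bisect_right(sparse_keys, key) - 1  # last block whose first key <= key
    if b < 0:
        return None
    for k, value in blocks[b]:
        if k == key:
            return value
    return None


def dense_lookup(key):
    """Follow the record pointer directly; no scan inside the block."""
    pos = dense_index.get(key)
    if pos is None:
        return None
    b, i = pos
    return blocks[b][i][1]
```

The sparse index is smaller (three entries here against eight) but only works because the file is sorted by the search key.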
55. • Database design - Data modelling
• What information will the database have to
be able to deliver?
• What data must the database contain
to meet the identified information
need?
• What data are available?
• How are the required data related
to each other?
56. • Database design - Data modelling
• Top-down (start from the broad outline)
• Bottom-up (start from the details)
• Changing or adding a field is much harder
than inserting a record
57. Normalization
Normalization: arriving at a data model in which no fact is
redundant, that is, occurs more than once
Zeroth normal form: the starting point
Data model from the initial information analysis, typically with repeated
attributes or repeated groups of attributes
First normal form: remove repeating groups
Second normal form: examine composite keys
Third normal form: examine transitive dependencies
58. Normalization
• Normalization is used to design a set of relation
schemas that is optimal from the point of view of
database updating
• The normalization starts from a universal relation
schema
• There are six normal forms, of which only three are
based on functional dependencies
• Normal forms define to which extent we should
normalize
• The Synthesis algorithm and the Decomposition
algorithm represent the formal normalization
methods
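As a small illustration of one normalization step, the Python sketch below removes a transitive dependency (student -> programme code -> programme name) by decomposing one relation into two, and then checks that joining them gives back the original rows. The relation and its values are invented, and this is a hand-made decomposition, not an implementation of the Synthesis or Decomposition algorithm.

```python
# One relation in which the programme name is repeated for every student:
# (student_id, programme_code, programme_name)
enrolment = [
    ("s1", "P1", "Bioinformatics"),
    ("s2", "P1", "Bioinformatics"),
    ("s3", "P2", "Biochemistry"),
]

# Decompose: each fact is now stored exactly once.
student = sorted({(sid, code) for sid, code, _ in enrolment})
programme = sorted({(code, name) for _, code, name in enrolment})

# Natural join on programme_code reconstructs the original relation,
# which shows that the decomposition loses no information.
name_of = dict(programme)
rejoined = sorted((sid, code, name_of[code]) for sid, code in student)
```

After the split, renaming a programme touches one row in programme instead of one row per enrolled student.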
59. Finishing Database Design
• To complete a database schema design
after normalization is done, one also has to
define interrelation constraints (referential
integrity constraints)
• Normalization results in a set of relation
schemas
• That design is suitable for efficient database
updates
• But it can slow down the execution of queries
• Sometimes it is advisable to undertake
controlled denormalization
60. Privileges in SQL
• select: allows read access to a relation, or the
ability to query using a view
– Example: grant users U1, U2, and U3 select
authorization on the branch relation:
grant select on branch to U1, U2, U3
• insert: the ability to insert tuples
• update: the ability to update using the SQL
update statement
• delete: the ability to delete tuples.
• all privileges: used as a short form for all the
allowable privileges
61. Authorization
Forms of authorization on parts of the database:
• Read - allows reading, but not modification of data.
• Insert - allows insertion of new data, but not modification of existing
data.
• Update - allows modification, but not deletion of data.
• Delete - allows deletion of data.
Forms of authorization to modify the database schema:
• Index - allows creation and deletion of indices.
• Resources - allows creation of new relations.
• Alteration - allows addition or deletion of attributes in a relation.
• Drop - allows deletion of relations.
62. Authorization Specification in SQL
• The grant statement is used to confer authorization
grant <privilege list>
on <relation name or view name> to <user list>
• <user list> is:
– a user-id
– public, which allows all valid users the privilege
granted
– A role
• Granting a privilege on a view does not imply granting
any privileges on the underlying relations.
• The grantor of the privilege must already hold the
privilege on the specified item (or be the database
administrator).
63. Revoking Authorization in SQL
• The revoke statement is used to revoke authorization.
revoke <privilege list>
on <relation name or view name> from <user list>
• Example:
revoke select on branch from U1, U2, U3
• <privilege-list> may be all to revoke all privileges the revokee
may hold.
• If <user list> includes public, all users lose the privilege
except those granted it explicitly.
• If the same privilege was granted twice to the same user by
different grantors, the user may retain the privilege after the
revocation.
• All privileges that depend on the privilege being revoked are
also revoked.
64. Data Dictionary and System Catalog
• Metadata, data dictionary: a kind of
automated reference work with an overview of
all users, data and storage
• Data dictionary
– Provides detailed account of all tables found within
database
– Metadata
– Attribute names and characteristics
• System catalog
– Detailed data dictionary
– System-created database
– Stores database characteristics and contents
– Tables can be queried just like any other tables
– Automatically produces database documentation
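That catalog tables "can be queried just like any other tables" is easy to see in SQLite, whose catalog is the table sqlite_master. The sketch below uses Python's built-in sqlite3 module; the two example tables are invented, and other systems expose their catalog under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY, city TEXT)")
conn.execute("CREATE TABLE account (number INTEGER PRIMARY KEY, branch_name TEXT)")

# The catalog is an ordinary table: list the user tables with a SELECT.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Attribute names and declared types of one table (SQLite-specific PRAGMA;
# each result row is: cid, name, type, notnull, default value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(branch)")]
```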
65. Functions and Procedures
• SQL:1999 supports functions and procedures
– Functions/procedures can be written in SQL itself, or in an
external programming language
– Functions are particularly useful with specialized data types
such as images and geometric objects
• Example: functions to check if polygons overlap, or to compare
images for similarity
– Some database systems support table-valued functions,
which can return a relation as a result
• SQL:1999 also supports a rich set of imperative
constructs, including
– Loops, if-then-else, assignment
• Many databases have proprietary procedural
extensions to SQL that differ from SQL:1999
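A function written in an external programming language can be sketched with SQLite, which lets the host program register functions that SQL statements may then call. The example below uses Python's sqlite3 create_function; the function gc_content and the table seq are invented for illustration, and this is SQLite's own mechanism rather than the SQL:1999 syntax.

```python
import sqlite3


def gc_content(sequence):
    """Fraction of G and C characters in a sequence string."""
    if not sequence:
        return 0.0
    s = sequence.upper()
    return (s.count("G") + s.count("C")) / len(s)


conn = sqlite3.connect(":memory:")
# Register the Python function under an SQL name, taking one argument.
conn.create_function("gc_content", 1, gc_content)

conn.execute("CREATE TABLE seq (name TEXT, residues TEXT)")
conn.executemany(
    "INSERT INTO seq VALUES (?, ?)",
    [("a", "GGCC"), ("b", "ATGC"), ("c", "ATAT")],
)

# The external function is now usable inside a query like a built-in one.
gc_rich = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM seq WHERE gc_content(residues) >= 0.5 ORDER BY name"
    )
]
```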
66. Procedural Extensions and Stored Procedures
• SQL provides a module language
– Permits definition of procedures in SQL, with
if-then-else statements, for and while loops,
etc.
• Stored Procedures
– Can store procedures in the database
– then execute them using the call statement
– permit external applications to operate on the
database without knowing about internal
details
68. Transaction Processing - Basics
• A transaction is a logical unit of
database processing
• Transaction processing systems
include large databases and hundreds
of concurrent users
• Examples of these systems are:
– airline reservations,
– banking,
– credit card processing,
– supermarket checkout, and
– similar systems
69. Multi - User Database Systems
• One way to classify DBMSs is according to the
number of concurrent users:
– single user
– multi-user
• Majority of database systems are of a multi - user
type
• Concurrent (or simultaneous from the user point of
view) database usage is possible thanks to
computer multiprogramming
• Multiprogramming operating systems execute some
commands of one process, then suspend this
process and execute some commands of another
process
• After a while, the execution of the first process is
resumed at the point where it was interrupted
• This type of process execution is called
interleaving
70. The Notion of a Transaction
• A transaction is a logical unit of database
processing that includes one or more
database access operations (read and write)
• Each execution of a transaction program is a
transaction
• If a transaction finishes successfully, all
data it has changed are visible to other
transactions
• If a transaction fails for any reason, DBMS
has to undo all the changes that the
transaction made against the database
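The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. The transfer below consists of two updates; when the second one violates a constraint, the rollback also undoes the first. The table account, the balances, and the function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target as one transaction; True on commit."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        # Fails with IntegrityError if the source balance would go negative.
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that was already applied
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "A", "B", 30)       # succeeds and is committed
failed = transfer(conn, "A", "B", 500)  # second update fails, first is undone
final = balances(conn)
```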
71. Transactions (continued)
• In multi-user transaction processing
systems, users execute database transactions
concurrently
• Most often, concurrent means interleaved
• The users can attempt to modify the same
database items at the same time, and that is
a potential source of database inconsistency
• Checking database integrity constraints is
not enough to protect a database from
threats induced by its concurrent usage
72. Commit
• A transaction reaches its commit point when all of its
operations that access the database have been
executed successfully and the effect of all transaction
operations on the database has been recorded in the
log
• Beyond the commit point, the effect of a transaction
is assumed to be permanently recorded in the
database
• If a transaction does not reach its commit point and
there is no [commit, T] record in the log file,
this transaction has to be rolled back
• Read committed protocol:
– If a transaction T updates a database item A, other transactions can read A only after T
has committed
73. Sources of Database Inconsistency
• Uncontrolled execution of database
transactions in a multi-user
environment can lead to database
inconsistency
• There are a number of possible sources
of database inconsistency
• The typical ones are:
– lost update problem,
– dirty read problem, and
– unrepeatable read problem
74. Lost Update Problem
Time   T1               T2
1      read_item(X)
2      X = X - N
3                       read_item(X)
4      write_item(X)
5                       X = X + M
6                       write_item(X)
After T2 terminates, X holds its original value plus M.
T1's update to X is lost because
T2 wrote over X
Generally, the lost update
problem is characterized by:
• T2 reads X,
• T1 writes X, and
• T2 writes X
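The lost update can be replayed step by step. The Python sketch below simulates one interleaving with the read, write, and write pattern described above, using a dictionary as the database and local variables as each transaction's private copy of X. The starting values in the usage note are invented.

```python
def lost_update_schedule(x, n, m):
    """Run the interleaved schedule; return the final value of X."""
    db = {"X": x}
    t1_local = db["X"]        # T1: read_item(X)
    t1_local = t1_local - n   # T1: X = X - N
    t2_local = db["X"]        # T2: read_item(X), still sees the old value
    db["X"] = t1_local        # T1: write_item(X)
    t2_local = t2_local + m   # T2: X = X + M
    db["X"] = t2_local        # T2: write_item(X), overwrites T1's update
    return db["X"]


def serial_schedule(x, n, m):
    """Run T1 completely before T2; return the final value of X."""
    db = {"X": x}
    db["X"] = db["X"] - n     # T1
    db["X"] = db["X"] + m     # T2
    return db["X"]
```

With X = 100, N = 10 and M = 5, the interleaved schedule ends with X = 105, while the serial schedule gives the correct 95.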
75. Dirty Read Problem
Time   T1               T2
1      read_item(X)
2      X = X - N
3      write_item(X)
4                       read_item(X)
5                       X = X + M
6                       write_item(X)
7      read_item(Y)
8      T1 fails
Generally, the dirty read
problem is characterized
by:
• T1 writes X,
• T2 reads X, and
• T1 fails
Since T1 failed, the DBMS is
going to undo the changes
it made to the
database
T2 has already read the value
X - N, and the DBMS is
going to restore that item
to its old value X
76. Unrepeatable Read Problem
Time   T1               T2
1      read_item(X)
2                       read_item(X)
3                       X = X + M
4                       write_item(X)
5      read_item(X)
Transaction T1 has got two different values of X in two subsequent
reads, because T2 has changed it in the meantime
Even if T1 didn't execute the second read command, it would use a
stale X value, and that's another form of the unrepeatable read problem
Generally, the unrepeatable read
problem is characterized by:
• T1 reads X,
• T2 writes X, and
• T1 reads X
77. Prevention of Concurrency Anomalies
• Lost update, dirty read and unrepeatable
read are called concurrency anomalies
• The concurrency control part of a DBMS
has the task to prevent these problems
• The DBMS is responsible for ensuring that either
all operations of a transaction are
successfully executed and their effect is
permanently stored in the database, or the
outcome is as if the transaction had never
started
• The effect of a partially executed
transaction has to be undone
78. Types of Failures
• A transaction can be partially
executed due to:
– A computer failure (hardware, software,
network, …)
– A transaction error (overflow, division by
zero, …)
– An exception condition (lack of data)
– Concurrency control enforcement (deadlock,
timeout, …)
– An abort command in the transaction program
79. Transaction State Transition Diagram
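Transaction states: Active, Partially committed, Committed, Failed, Terminated
Transitions (program command: from state -> to state):
• begin transaction: -> Active
• read, write: Active -> Active
• end transaction: Active -> Partially committed
• commit: Partially committed -> Committed
• abort: Active -> Failed, or Partially committed -> Failed
• Committed -> Terminated and Failed -> Terminated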
80. Log File
• To be able to recover from failures
DBMS maintains a log file
• Typically, a log file contains records
with following contents:
[start_transaction, T ] (*T is the transaction id*)
[write_item, T, X, old_value, new_value]
[read_item,T, X ] (*optional*)
[commit, T ]
[abort, T ]
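How these log records support recovery can be sketched in a few lines of Python. The function below undoes the writes of every transaction that has no commit record, walking the log backwards and restoring each old_value. The tuple layout mirrors the record formats above; the log contents and the function name are invented, and redo of committed work is left out.

```python
def undo_uncommitted(database, log):
    """Restore old values for writes of transactions without a commit record."""
    committed = {record[1] for record in log if record[0] == "commit"}
    # Walk backwards so that the oldest old_value is the one restored last.
    for record in reversed(log):
        if record[0] == "write_item" and record[1] not in committed:
            _, _, item, old_value, _ = record
            database[item] = old_value
    return database


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 20, 25),
    ("commit", "T2"),
    ("write_item", "T1", "X", 90, 80),
    # T1 fails here: there is no ("commit", "T1") record.
]

# State of the database at the moment of the failure.
database = {"X": 80, "Y": 25}
recovered = undo_uncommitted(dict(database), log)
```

T2's committed change to Y survives, while both of T1's writes to X are rolled back.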
81. Summary
• Executing transactions in an interleaved way
may leave a database in an inconsistent state
• Transaction anomalies are:
– Lost update,
– Dirty read, and
– Unrepeatable read
• A DBMS is responsible to ensure that either
all operations of a transaction are successfully
executed, or it is rolled back
• Log file records all important events (start,
read, write, commit)
• When a transaction reaches its commit point,
everything is safely stored in a database (or a
log file)
83. Views and Decision Support
• OLAP queries are typically aggregate queries.
– Precomputation is essential for interactive response
times.
– The CUBE is in fact a collection of aggregate
queries, and precomputation is especially important:
lots of work on what is best to precompute given a
limited amount of space to store precomputed
results.
• Warehouses can be thought of as a collection
of asynchronously replicated tables and
periodically maintained views.
– Has renewed interest in view maintenance!
84. View Modification (Evaluate On Demand)
View:
CREATE VIEW RegionalSales(category,sales,state)
AS SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid
Query:
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R GROUP BY R.category, R.state
Modified query:
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid) AS R
GROUP BY R.category, R.state
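That the query on the view and the modified query are equivalent can be checked on a toy database. The sketch below uses Python's built-in sqlite3 module and the table and view definitions from this slide; the rows are invented, and an ORDER BY is added to both queries only to make the results comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (pid INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE Sales (pid INTEGER, locid INTEGER, sales INTEGER);
INSERT INTO Products VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO Locations VALUES (10, 'Wisconsin'), (20, 'Ohio');
INSERT INTO Sales VALUES (1, 10, 5), (1, 10, 7), (1, 20, 3), (2, 10, 4);
CREATE VIEW RegionalSales (category, sales, state) AS
  SELECT P.category, S.sales, L.state
  FROM Products P, Sales S, Locations L
  WHERE P.pid = S.pid AND S.locid = L.locid;
""")

# Query written against the view.
query_on_view = """
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R
GROUP BY R.category, R.state
ORDER BY R.category, R.state
"""

# Same query with the view name replaced by the view definition.
modified_query = """
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
      FROM Products P, Sales S, Locations L
      WHERE P.pid = S.pid AND S.locid = L.locid) AS R
GROUP BY R.category, R.state
ORDER BY R.category, R.state
"""

view_result = conn.execute(query_on_view).fetchall()
modified_result = conn.execute(modified_query).fetchall()
```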
85. View Materialization (Precomputation)
• Suppose we precompute RegionalSales and store
it with a clustered B+ tree index on
[category,state,sales].
– Then, the previous query can be answered by an
index-only scan.
Index on precomputed view is great:
SELECT R.state, SUM(R.sales)
FROM RegionalSales R
WHERE R.category='Laptop'
GROUP BY R.state
Index is less useful (must scan entire leaf level):
SELECT R.state, SUM(R.sales)
FROM RegionalSales R
WHERE R.state='Wisconsin'
GROUP BY R.category
86. Materialized Views
• A view whose tuples are stored in the database
is said to be materialized.
– Provides fast access, like a (very high-level) cache.
– Need to maintain the view as the underlying tables
change.
– Ideally, we want incremental view maintenance
algorithms.
• Close relationship to data warehousing, OLAP,
(asynchronously) maintaining distributed
databases, checking integrity constraints, and
evaluating rules and triggers.
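Incremental view maintenance can be sketched for a simple aggregate view. The Python class below keeps SUM(sales) per (category, state) up to date by applying each insert or delete on the base data directly to the stored totals, instead of recomputing the view. The class name, the row layout, and the data are invented, and real systems handle joins and other aggregates with more elaborate algorithms.

```python
from collections import defaultdict


class MaterializedTotals:
    """Stored SUM(sales) per (category, state), maintained incrementally."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)  # rows per group, to detect empty groups

    def on_insert(self, category, state, sales):
        key = (category, state)
        self.totals[key] += sales
        self.counts[key] += 1

    def on_delete(self, category, state, sales):
        key = (category, state)
        self.totals[key] -= sales
        self.counts[key] -= 1
        if self.counts[key] == 0:  # the group has no rows left
            del self.totals[key]
            del self.counts[key]

    def as_dict(self):
        return dict(self.totals)


def recompute(rows):
    """Full recomputation of the view, used here only as a cross-check."""
    totals = defaultdict(int)
    for category, state, sales in rows:
        totals[(category, state)] += sales
    return dict(totals)


base = [("Laptop", "Wisconsin", 5), ("Laptop", "Wisconsin", 7), ("Phone", "Ohio", 4)]
view = MaterializedTotals()
for row in base:
    view.on_insert(*row)

# Changes to the base data are propagated one row at a time.
base.remove(("Phone", "Ohio", 4))
view.on_delete("Phone", "Ohio", 4)
base.append(("Laptop", "Ohio", 3))
view.on_insert("Laptop", "Ohio", 3)
```

Keeping a row count next to each total is what allows a group to disappear from the view when its last row is deleted.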
87. Issues in View Materialization
• What views should we materialize, and
what indexes should we build on the
precomputed results?
• Given a query and a set of materialized
views, can we use the materialized
views to answer the query?
• How frequently should we refresh
materialized views to make them
consistent with the underlying tables?
(And how can we do this
incrementally?)