2. Outline
• Today:
– introduction to databases
– introduction to working with MSAccess
• Coming days: practical exercises with MSAccess
3. Data mining
• Exploration of data
• Prerequisite: data should be available in a minable format: a database
• Database = electronic document storing data
– Non-relational: 1 bulk system with unrelated items (e.g. MSExcel files, text documents, unrelated tables)
– Relational: all items (tables) are linked to each other (see further)
4. Why use a database
• Relational database:
– All your data is stored in 1 file
• Easy to retrieve data
• Easy to backup
– Data and metadata stored together
• Data ...
• Metadata: data about the data (documentation)
– Many data files contain undocumented values:
– E.g. species A has an abundance of 17 (what does the value 17 mean?)
5. Why use a database
• In a well-designed relational database, all data is stored only once:
– Example: species list typing errors
• Nudora thorakista
• Nudora thorrakista
• Nudora thorakhista
• Nudora thorakisa
– 1 species, but a species richness calculation gives 4
– Solution: 1 table with 1 record per species, used as a reference
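The effect of the typing errors on a richness calculation can be sketched in a few lines of Python. This is only an illustration: the id value 1 and the variable names are assumptions, not part of the course database.

```python
# The four spellings from the list above, as they might be typed into a data file.
observations = [
    "Nudora thorakista",
    "Nudora thorrakista",
    "Nudora thorakhista",
    "Nudora thorakisa",
]

# Naive species richness: count the distinct spellings.
naive_richness = len(set(observations))

# Reference table: 1 record per species, identified by an id (the id 1 is made up).
species = {1: "Nudora thorakista"}

# The observations store the id of the species instead of a typed name.
observations_by_id = [1, 1, 1, 1]
richness = len(set(observations_by_id))

print(naive_richness, richness)  # 4 1
```

Because the name exists in only one place, a typing error can be made (and corrected) only once.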
6. Why use a database
• Data is much more rigid ...
– More difficult to make errors
– E.g. sorting in Excel can separate values that belong to the same row
11. Table designs ...
• A table consists of a series of columns ...
• Each record:
– Consists of different fields
– The design of the table must be done before data is entered
– Each field has a name and a data type
– Each field can also be given a format (layout)
[Figure labels: Record; Column / Field]
12. Table designs ...
• Field types:
– Numeric – integer/double
– Text
– Date/Time
– Memo
– Autonumber ID
– Yes/No
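As a rough illustration of designing a table before entering data, the sketch below creates a table with one field per type. It uses SQLite through Python's built-in sqlite3 module as a stand-in for MSAccess, so the type names differ from those in the list; the table name, field names and values are all hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # temporary database in memory

# The design comes first: every field gets a name and a data type.
con.execute("""
    CREATE TABLE sample (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- Autonumber ID
        replicate INTEGER,                     -- Numeric, integer
        depth_m REAL,                          -- Numeric, double
        station_name TEXT,                     -- Text
        sampled_on TEXT,                       -- Date/Time (SQLite stores it as text)
        remarks TEXT,                          -- Memo (long free text)
        sieved INTEGER                         -- Yes/No (stored as 1 or 0)
    )
""")

# Only then is data entered; the id is filled in automatically.
con.execute(
    "INSERT INTO sample (replicate, depth_m, station_name, sampled_on, remarks, sieved) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (3, 7.5, "Station 2", "2001-05-14", "free text of any length", 1),
)

row = con.execute("SELECT id, depth_m, replicate FROM sample").fetchone()
print(row)  # (1, 7.5, 3)
```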
13. Task on field types:
• 12
• 15 jan 1988
• hallo
• 12,456
• 12:56
• Azdazdazd azdda zda azdd dad zd dadazdzd
azdazddazdd azdazd azdazd dzdzdzzd ada zzd
azdaz dda azd da az d z azdzadazd a zd a azd
azd z dd da a z a z zd d ddaa zd
• 09:89
14. Special field in a table: key
• A key = a unique identifier for a record
– Example: passport number:
• Number in a database which is unique and relates to all data
about you
– Each record in a table also gets a key
– This key is used to link tables to each other
– Example:
• Nudora sp1 – id: 123776
• Nudora sp2 – id: 34688
– Advantage: species name changes: linked taxa remain
linked
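The advantage of linking through a key can be sketched as follows, again with SQLite via Python's sqlite3 module as a stand-in for MSAccess. The id 123776 comes from the example above; the table layout and the new name 'Nudora renamed' are assumptions made for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE density (species_id INTEGER, value INTEGER)")
con.execute("INSERT INTO species VALUES (123776, 'Nudora sp1')")
con.execute("INSERT INTO density VALUES (123776, 32)")  # linked by id, not by name

# The species name changes: only the single record in table species is edited.
con.execute("UPDATE species SET name = 'Nudora renamed' WHERE id = 123776")

# The density record was never touched and is still linked to the species.
row = con.execute(
    "SELECT species.name, density.value "
    "FROM density JOIN species ON species.id = density.species_id"
).fetchone()
print(row)  # ('Nudora renamed', 32)
```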
15. Linking tables through IDs
• Storing numbers is the most efficient way to store data:
• Nudora sp1 is found in the North Sea with a density of 32
• Species 123776 is found in station 2 (North Sea) with a density of 32
• Record in table density becomes:
123776 | 2 | 32
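A minimal sketch of how such a record of ids is turned back into readable information, using SQLite via Python's sqlite3 module as a stand-in for MSAccess. The ids and values come from the example above; the table and field names are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE density (species_id INTEGER, station_id INTEGER, value INTEGER);
    INSERT INTO species VALUES (123776, 'Nudora sp1');
    INSERT INTO station VALUES (2, 'North Sea');
    INSERT INTO density VALUES (123776, 2, 32);
""")

# What is actually stored: three numbers.
stored = con.execute("SELECT * FROM density").fetchone()

# What a reader wants to see: the ids replaced by the names they refer to.
readable = con.execute("""
    SELECT species.name, station.name, density.value
    FROM density
    JOIN species ON species.id = density.species_id
    JOIN station ON station.id = density.station_id
""").fetchone()

print(stored)    # (123776, 2, 32)
print(readable)  # ('Nudora sp1', 'North Sea', 32)
```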
16. Setting up relations between tables
• Relations: links between tables
• Tables are connected to each other in a rigid way through certain fields
• Advantage: the database becomes one coherent whole
• Types of relations:
– 1 to many
– Many to many ( = 2 times 1 to many)
17. Examples of relations
• Table places: field country (numeric)
• Table country: list of countries,
each country has a unique id
• Relation is made between:
– Field country in places
– Field id in country
• One to many relation: 1 record in table
country linked to multiple records in places
• A country that is still linked to places cannot be deleted
[Figure: tables Places and Country]
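The one to many relation and the protection against deleting can be sketched with SQLite via Python's sqlite3 module, as a stand-in for the relationship settings in MSAccess. The table and field names follow the slide; the place and country names are made-up sample data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces relations when asked to
con.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""
    CREATE TABLE places (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country INTEGER REFERENCES country(id)  -- the relation: places.country -> country.id
    )
""")
con.execute("INSERT INTO country VALUES (1, 'Belgium')")
con.execute("INSERT INTO places VALUES (1, 'Gent', 1)")      # many places ...
con.execute("INSERT INTO places VALUES (2, 'Oostende', 1)")  # ... linked to 1 country

# Deleting a country that is still used by places is refused.
try:
    con.execute("DELETE FROM country WHERE id = 1")
    deleted = True
except sqlite3.IntegrityError:
    deleted = False

print(deleted)  # False
```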
18. Examples of relations
• Many to many
• Id of sample
• Id of species
• Table density: unique combination of sample,
species ...
[Figure: tables Species, Sample and Density]
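A many to many relation built from two one to many relations could look like the sketch below (SQLite via Python's sqlite3 module, as a stand-in for MSAccess). The field names and the sample data are assumptions; the key point is that table density accepts each combination of sample and species only once.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sample (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE density (
        sample_id INTEGER REFERENCES sample(id),    -- 1 sample, many density records
        species_id INTEGER REFERENCES species(id),  -- 1 species, many density records
        value INTEGER,
        PRIMARY KEY (sample_id, species_id)         -- each combination is unique
    );
    INSERT INTO species VALUES (1, 'Nudora sp1'), (2, 'Nudora sp2');
    INSERT INTO sample VALUES (10, 'sample A'), (11, 'sample B');
    INSERT INTO density VALUES (10, 1, 32), (10, 2, 5), (11, 1, 7);
""")

# A second density for the same sample and species is refused.
try:
    con.execute("INSERT INTO density VALUES (10, 1, 99)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

print(duplicate_accepted)  # False
```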
19. Queries
• All data is in the database:
– Next step: get it out again
– Selections on 1 table: by using filters
– Selections on multiple tables: using queries
– Queries can be saved and reused
– Queries can be the basis for new queries
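As an illustration of saved queries and of queries built on other queries, the sketch below uses SQLite views through Python's sqlite3 module. A view is used here as the closest equivalent of a saved MSAccess query, which is an assumption of this sketch; the table names, query names and data are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE density (species_id INTEGER, station_id INTEGER, value INTEGER);
    INSERT INTO species VALUES (1, 'Nudora sp1'), (2, 'Nudora sp2');
    INSERT INTO density VALUES (1, 2, 32), (2, 2, 5), (1, 3, 7);

    -- Saved query on multiple tables: densities with the species name filled in.
    CREATE VIEW density_named AS
        SELECT species.name AS species, density.station_id AS station, density.value AS value
        FROM density JOIN species ON species.id = density.species_id;

    -- New query that uses the saved query as its basis.
    CREATE VIEW density_high AS
        SELECT species, station, value FROM density_named WHERE value > 6;
""")

rows = con.execute(
    "SELECT species, station, value FROM density_high ORDER BY value"
).fetchall()
print(rows)  # [('Nudora sp1', 3, 7), ('Nudora sp1', 2, 32)]
```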
32. Exporting data
• From MSAccess it is possible to export to
different formats
• Tables, queries, ...
• Exports can be used for further data mining:
– Making graphs in MSExcel
– Statistical analysis
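What an export amounts to can be sketched outside MSAccess as well: the result of a table or query is written to a plain format such as CSV, which a spreadsheet or statistics package can read. The example uses Python's sqlite3 and csv modules; the table, its contents and the use of an in-memory buffer instead of a real file are assumptions of the sketch.

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE density (species_id INTEGER, station_id INTEGER, value INTEGER);
    INSERT INTO density VALUES (123776, 2, 32);
""")

cursor = con.execute("SELECT species_id, station_id, value FROM density")

buffer = io.StringIO()  # stands in for a file opened with open("density.csv", "w")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow([column[0] for column in cursor.description])  # header with field names
writer.writerows(cursor)                                       # one line per record

csv_text = buffer.getvalue()
print(csv_text)
```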
35. Step by step demonstration
• Open a database
• Different items in database
• Open tables, sorting, filtering
• Table design
• Relationships
• Queries
36. Query operators
= equals
> larger than
< smaller than
>= larger than or equals
Between ... And ...
Is null
Like ...
Not like ...
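The operators above can be tried out in a small sketch with SQLite via Python's sqlite3 module. Two things are assumptions here: the table and its data are hypothetical, and the wildcard is written as % because that is what SQLite uses, where MSAccess criteria use *.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE density (species TEXT, value INTEGER);
    INSERT INTO density VALUES
        ('Nudora sp1', 32), ('Nudora sp2', 5), ('Other sp1', 17), ('Other sp2', NULL);
""")

def names(criterion):
    """Return the species names of all records that match the criterion."""
    sql = "SELECT species FROM density WHERE " + criterion + " ORDER BY species"
    return [record[0] for record in con.execute(sql)]

results = {
    "equals": names("value = 17"),
    "larger": names("value > 17"),
    "larger_or_equal": names("value >= 17"),
    "between": names("value BETWEEN 5 AND 17"),
    "is_null": names("value IS NULL"),
    "like": names("species LIKE 'Nudora%'"),
    "not_like": names("species NOT LIKE 'Nudora%'"),
}

for operator, found in results.items():
    print(operator, found)
```

Note that the record without a value (NULL) is only found by Is null; comparisons such as > or Between never match it.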
38. Query operators
and both true
or at least 1 true
< smaller than
>= larger than or equals
Between ... And ...
Is null
Like ...
Not like ...
• Examples (criteria on field VOORNAAM, the first name):
– >"q*" and <"u*" gives René, Robbie, Stefan, Stijn, Tim, Tristam
– ="r*" or "s*" gives Robbie, Stefan, Stijn
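Why the criterion >"q*" and <"u*" returns exactly those six names can be checked with a plain alphabetical comparison in Python. The sketch assumes that text is compared without regard to upper or lower case, and that "*" is treated as an ordinary character in a > or < comparison; the names Anna and Wim are added as hypothetical records that should not be selected.

```python
# First names from the example, plus two hypothetical ones (Anna, Wim).
voornaam = ["Anna", "René", "Robbie", "Stefan", "Stijn", "Tim", "Tristam", "Wim"]

def between_q_and_u(name):
    """Both conditions must be true: alphabetically after "q*" and before "u*"."""
    key = name.lower()  # assumption: the comparison ignores upper/lower case
    return key > "q*" and key < "u*"

selected = [name for name in voornaam if between_q_and_u(name)]
print(selected)  # ['René', 'Robbie', 'Stefan', 'Stijn', 'Tim', 'Tristam']
```

Every name starting with r, s or t sorts after "q*" and before "u*", so combining the two conditions with and selects that alphabetical range.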