BSYS 2060 Lecture 2 – Data Literacy & Data Modeling
Agenda
•   Announcements
•   Digital Literacy
•   Data Modeling
•   Relationship Types
•   Example
Bring headphones to all labs from here on in!




    Source: CC by 2.0, http://www.flickr.com/photos/bobjudge/3569973941/
https://zenportfolios.ca/bcit-bsys-2060-2012/week-0/
http://www.lynda.com/
Announcements
• If you didn’t already do the “Week 0” exercise, you
  should do it ASAP!
• Sign up for Lynda.com ASAP
   – $21.67 US for duration of semester
   – Videos to watch will be assigned shortly after today’s
     lecture
   – Bring your headphones to all future labs!
• Join the BSYS 2060 group on zenportfolios.ca if you
  didn’t already
   – Upload an avatar if you didn’t already
   – We will start using this online group more in the weeks to
     come, especially for the project
If you didn’t do this already…
Go to zenportfolios.ca, log in, click on Groups, and search for 2060.
Then join the BSYS-2060-2012 group.
Digital Literacy
http://www.nytimes.com/2012/03/28/technology/for-an-edge-on-the-internet-computer-code-gains-a-following.html?pagewanted=all
“Inasmuch as you need to know how to read
English, you need to have some
understanding of the code that builds the
Web,” said Sarah Henry, 39, an investment
manager who lives in Wayne, Pa. “It is
fundamental to the way the world is
organized and the way people think about
things these days.”
We live in an information age where data is king.
Data is often the most valuable asset of a company
e.g. Aeroplan worth more than Air Canada
We live in a data mash-up world.
WordPress that uses open source MySQL database back-end
Twitter API / Integration
Take control of your data. Best thing is to buy your own domain name
and build your own site and web presence there.




       http://jacobjpope.com
Where does data belong?
Diaspora
The average middle-aged person has their data spread out over
1,000 different database locations.

Self-hosted website containing all your data: you control it completely. Not so easy...
Many “fly by night” Web 2.0 apps: you don’t control the data at all. Convenient.

Trend is towards you controlling your data
Who owns data?
Building a database for an organization?
What is the organization’s mission?
What is the project’s mission?
What data is required to make good
decisions or for other reporting needs?
What data is currently being collected?
How is the data collected being used?
What data should be collected?
What are the main data entities?
These will become your tables
What are the required fields and field data types?
What are the relationships between the tables?
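A minimal sketch of how these questions turn into a schema, assuming two hypothetical entities (faculty members and the courses they teach). The table names, field names, and sample values are illustrative only, and SQLite is used simply because it ships with Python.

```python
import sqlite3

# Hypothetical entities for illustration: faculty and course.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE faculty (                -- entity -> table
    faculty_id INTEGER PRIMARY KEY,   -- field + data type
    name       TEXT NOT NULL
);
CREATE TABLE course (
    course_code TEXT PRIMARY KEY,
    title       TEXT NOT NULL,
    -- relationship: one faculty member teaches many courses
    faculty_id  INTEGER NOT NULL REFERENCES faculty(faculty_id)
);
""")

conn.execute("INSERT INTO faculty VALUES (1, 'Pat')")
conn.execute("INSERT INTO course VALUES ('DEMO 100', 'Sample Course', 1)")

# The relationship lets us join the two tables back together.
rows = conn.execute("""
    SELECT c.course_code, f.name
    FROM course c JOIN faculty f ON f.faculty_id = c.faculty_id
""").fetchall()

# A course that points at a non-existent faculty member is rejected.
try:
    conn.execute("INSERT INTO course VALUES ('DEMO 999', 'Orphan', 42)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same design could be built in MS Access, MS SQL, MySQL, or Oracle; only the data type names would change.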
Select your tools?
e.g. MS Access, MS SQL, MySQL, Oracle
What is being used now if anything?
Be careful of the hammer!
To a hammer, everything looks like a nail. If you only know MS Access,
you may always see MS Access as the solution even when it’s not. If you
only know MS Excel, it seems like the perfect choice!


Source:
http://www.flickr.com/photos/fixersphotos/3199566032/
Normalization

A brief introduction to the first three forms...
“Normalization”
• In the field of relational database design,
  normalization is a systematic way of ensuring
  that a database structure is suitable for
  general-purpose querying and free of certain
  undesirable characteristics—insertion, update,
  and deletion anomalies—that could lead to a
  loss of data integrity.

    Codd, E.F. The Relational Model for Database Management: Version 2. Addison-Wesley (1990), p. 271
“...insertion, update, and deletion
            anomalies...”




      Insertion anomaly: until a Course Code is assigned to this record, it cannot
      be inserted in the table.
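The insertion anomaly can be reproduced with a small sketch. It assumes a hypothetical single table keyed on the course code, with faculty details stored in the same row; the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical un-normalized table: the course code is the key,
# and faculty details live in the same row.
conn.execute("""
CREATE TABLE teaching (
    course_code    TEXT NOT NULL PRIMARY KEY,
    faculty_name   TEXT NOT NULL,
    faculty_office TEXT
)""")

# A new faculty member with no course yet has no key value to offer.
try:
    conn.execute("INSERT INTO teaching VALUES (NULL, 'New Hire', 'Room 1')")
    inserted = True
except sqlite3.IntegrityError:
    inserted = False  # the record cannot be stored until a course code exists
```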
“...insertion, update, and deletion
            anomalies...”




      Update anomaly: an edit made to one record may not be made to ALL
      records for the same employee.
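A short sketch of the update anomaly, assuming a hypothetical table in which an employee's office is repeated on every course row. Names and values are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teaching (course_code TEXT PRIMARY KEY, employee TEXT, office TEXT)"
)
conn.executemany(
    "INSERT INTO teaching VALUES (?, ?, ?)",
    [
        ("DEMO 100", "Pat", "Room 1"),
        ("DEMO 200", "Pat", "Room 1"),  # same employee, office stored twice
    ],
)

# The edit is applied to one record only.
conn.execute("UPDATE teaching SET office = 'Room 2' WHERE course_code = 'DEMO 100'")

# The table now holds two different offices for the same employee.
offices = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT office FROM teaching WHERE employee = 'Pat' ORDER BY office"
    )
]
```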
“...insertion, update, and deletion
            anomalies...”




      Deletion anomaly: if the Course Code is deleted, the information for the
      Faculty Member will be lost.
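A short sketch of the deletion anomaly, again assuming a hypothetical single table that stores faculty details only on course rows. Names and values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teaching (course_code TEXT PRIMARY KEY, faculty_name TEXT, office TEXT)"
)
conn.executemany(
    "INSERT INTO teaching VALUES (?, ?, ?)",
    [
        ("DEMO 100", "Pat", "Room 1"),
        ("DEMO 200", "Sam", "Room 3"),  # Sam's only course
    ],
)

# Deleting the course also deletes the only row that mentions Sam.
conn.execute("DELETE FROM teaching WHERE course_code = 'DEMO 200'")

remaining_faculty = [
    row[0] for row in conn.execute("SELECT DISTINCT faculty_name FROM teaching")
]
```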
Three Normal Forms
• 1NF
  – Eliminate repeating groups
  – Each field holds a single value
• 2NF
  – Eliminate data that depends on only part of the Primary Key
  – All non-key fields depend on the whole Primary Key
• 3NF
  – Eliminate dependency on non-key fields
  – Non-key fields do not depend on each other
Un-Normalized Table (e.g. Excel)
First Normal Form (1NF)
[Figure: a 0NF (“un-normalized”) table and its 1NF version]
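The step from an un-normalized table to 1NF can be sketched in a few lines. The rows below are hypothetical stand-ins for a spreadsheet in which one cell holds a list of courses; they are not the data from the slide.

```python
# Hypothetical 0NF rows: the "courses" cell holds a repeating group.
unnormalized = [
    {"employee_id": 1, "name": "Pat", "courses": "DEMO 100, DEMO 200"},
    {"employee_id": 2, "name": "Sam", "courses": "DEMO 300"},
]

# 1NF: one row per employee/course pair, each field holding a single value.
first_nf = [
    {"employee_id": row["employee_id"], "name": row["name"], "course": course.strip()}
    for row in unnormalized
    for course in row["courses"].split(",")
]
```

Note that the employee name is now repeated on several rows; 1NF removes the repeating group but not the redundancy.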
Second Normal Form (2NF)
[Figure: the 1NF table and its 2NF version]
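A sketch of the 1NF to 2NF step, assuming a hypothetical table whose key is the pair (employee_id, course). The name depends on employee_id alone, so it moves to its own table. The data is illustrative, not taken from the slide.

```python
# Hypothetical 1NF rows keyed on (employee_id, course).
first_nf = [
    {"employee_id": 1, "name": "Pat", "course": "DEMO 100"},
    {"employee_id": 1, "name": "Pat", "course": "DEMO 200"},
    {"employee_id": 2, "name": "Sam", "course": "DEMO 300"},
]

# "name" depends on only part of the key, so it gets its own table.
employees = {row["employee_id"]: row["name"] for row in first_nf}

# What remains depends on the whole key: who is assigned to which course.
assignments = sorted({(row["employee_id"], row["course"]) for row in first_nf})
```

Each name is now stored once, so an edit to it cannot miss a row.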
Third Normal Form (3NF)
[Figure: the 2NF table and its 3NF version]
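A sketch of the 2NF to 3NF step, assuming a hypothetical employee table in which the department name depends on the department code rather than on the key. The field names and values are made up for illustration.

```python
# Hypothetical 2NF rows keyed on employee_id.
second_nf = [
    {"employee_id": 1, "name": "Pat", "dept_code": "D1", "dept_name": "Business"},
    {"employee_id": 2, "name": "Sam", "dept_code": "D1", "dept_name": "Business"},
    {"employee_id": 3, "name": "Lee", "dept_code": "D2", "dept_name": "Computing"},
]

# dept_name depends on dept_code (a non-key field), so it moves to its own table.
departments = {row["dept_code"]: row["dept_name"] for row in second_nf}

# The employee table keeps only the dept_code as a link to that table.
employees = [
    {"employee_id": row["employee_id"], "name": row["name"], "dept_code": row["dept_code"]}
    for row in second_nf
]
```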

Lecture2 slides-march-29


Editor's Notes

  • #2 http://techcrunch.com/2010/03/16/big-data-freedom/ (image: http://tctechcrunch.files.wordpress.com/2010/03/binary_data.jpg)
  • #13 http://www.flickr.com/photos/alismith44/269843032/