01 intro
Transcript

  • 1. CPS 216: Advanced Database Systems, Shivnath Babu, Fall 2006
  • 2. Outline for Today
    • What this class is about: Data management
    • What we will cover in this class
    • Logistics
  • 3. Data Management
    • Diagram: an Application sends Queries to a DataBase Management System (DBMS), which manages the Data
  • 4. Example: At a Company
    • Employee (ID, Name, DeptID, …, Salary)
      • 10, Nemo, 12, …, 120K
      • 20, Dory, 156, …, 79K
      • 40, Gill, 89, …, 76K
      • 52, Ray, 34, …, 85K
      • …
    • Department (ID, Name, …)
      • 12, IT, …
      • 34, Accounts, …
      • 89, HR, …
      • 156, Marketing, …
      • …
    • Query 1: Is there an employee named “Nemo”?
    • Query 2: What is “Nemo’s” salary?
    • Query 3: How many departments are there in the company?
    • Query 4: How many employees have Salary >= 80K?
    • Query 5: What is the name of “Nemo’s” department?
    • Query 6: How many employees are there in the “Accounts” department?
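The six queries on the company slide map naturally onto SQL. The following is a minimal sketch using Python's built-in `sqlite3` module, not something taken from the slides. The table and column names follow the slide; storing salaries as integers in thousands, the pairing of each salary with its employee row (read off the flattened slide), and the helper `scalar` are assumptions made for illustration. Only the rows visible on the slide are loaded, so the counts reflect those rows alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Employee (
        ID INTEGER PRIMARY KEY,
        Name TEXT,
        DeptID INTEGER REFERENCES Department(ID),
        Salary INTEGER  -- in thousands (assumed unit)
    );
    INSERT INTO Department VALUES
        (12, 'IT'), (34, 'Accounts'), (89, 'HR'), (156, 'Marketing');
    INSERT INTO Employee VALUES
        (10, 'Nemo', 12, 120), (20, 'Dory', 156, 79),
        (40, 'Gill', 89, 76), (52, 'Ray', 34, 85);
""")

def scalar(sql, params=()):
    """Hypothetical helper: run a query and return its single value."""
    row = conn.execute(sql, params).fetchone()
    return row[0] if row else None

# Query 1: is there an employee named "Nemo"?
has_nemo = scalar(
    "SELECT EXISTS(SELECT 1 FROM Employee WHERE Name = ?)", ("Nemo",)) == 1
# Query 2: what is Nemo's salary?
nemo_salary = scalar("SELECT Salary FROM Employee WHERE Name = ?", ("Nemo",))
# Query 3: how many departments are there?
dept_count = scalar("SELECT COUNT(*) FROM Department")
# Query 4: how many employees have Salary >= 80K?
high_paid = scalar("SELECT COUNT(*) FROM Employee WHERE Salary >= 80")
# Query 5: what is the name of Nemo's department?
nemo_dept = scalar(
    "SELECT d.Name FROM Employee e JOIN Department d ON e.DeptID = d.ID "
    "WHERE e.Name = ?", ("Nemo",))
# Query 6: how many employees are in the "Accounts" department?
accounts_count = scalar(
    "SELECT COUNT(*) FROM Employee e JOIN Department d ON e.DeptID = d.ID "
    "WHERE d.Name = ?", ("Accounts",))
```

Each query states what is wanted, not how to find it; choosing between a scan, an index lookup, or a particular join order is left to the DBMS.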
  • 5. DataBase Management System (DBMS)
    • Diagram: High-level Query Q → DBMS (over Data) → Answer
    • Translates Q into best execution plan for current conditions, runs plan
  • 6. Example: Store that Sells Cars
    • Query: Owners of Honda Accords who are <= 23 years old
    • Cars (Make, Model, OwnerID)
      • Honda, Accord, 12
      • Toyota, Camry, 34
      • Mini, Cooper, 89
      • …
      • Honda, Accord, 156
    • Owners (ID, Name, Age)
      • 12, Nemo, 22
      • 34, Ray, 42
      • 89, Gill, 36
      • …
      • 156, Dory, 21
    • Plan step 1: Filter (Make = Honda and Model = Accord) on Cars, leaving OwnerID 12 and 156
    • Plan step 2: Join (Cars.OwnerID = Owners.ID), giving (Make, Model, ID, Name, Age)
      • Honda, Accord, 12, Nemo, 22
      • Honda, Accord, 156, Dory, 21
    • Plan step 3: Filter (Age <= 23)
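The filter, join, filter plan on the car-store slide can be traced in a few lines of Python. This is only an illustrative sketch: the slide says "Join" without naming an algorithm, so the hash join below is an assumed choice, and the helper names `select` and `hash_join` are hypothetical. Rows elided with "…" on the slide are left out.

```python
cars = [
    {"Make": "Honda", "Model": "Accord", "OwnerID": 12},
    {"Make": "Toyota", "Model": "Camry", "OwnerID": 34},
    {"Make": "Mini", "Model": "Cooper", "OwnerID": 89},
    {"Make": "Honda", "Model": "Accord", "OwnerID": 156},
]
owners = [
    {"ID": 12, "Name": "Nemo", "Age": 22},
    {"ID": 34, "Name": "Ray", "Age": 42},
    {"ID": 89, "Name": "Gill", "Age": 36},
    {"ID": 156, "Name": "Dory", "Age": 21},
]

def select(rows, predicate):
    """Filter operator: lazily pass through rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))

def hash_join(left, right, left_key, right_key):
    """Join operator (assumed algorithm): index the right input, probe with the left."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    for l in left:
        for r in index.get(l[left_key], []):
            yield {**l, **r}

# Step 1: Filter (Make = Honda and Model = Accord)
accords = select(cars, lambda c: c["Make"] == "Honda" and c["Model"] == "Accord")
# Step 2: Join (Cars.OwnerID = Owners.ID)
joined = hash_join(accords, owners, "OwnerID", "ID")
# Step 3: Filter (Age <= 23)
young = select(joined, lambda row: row["Age"] <= 23)

result = [(row["Name"], row["Age"]) for row in young]
```

Filtering Cars before the join keeps the join input small; an optimizer could equally push the age filter below the join, which is the kind of decision "best execution plan for current conditions" refers to.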
  • 7. DataBase Management System (DBMS)
    • Diagram: High-level Query Q → DBMS (over Data) → Answer
    • Translates Q into best execution plan for current conditions, runs plan
    • Keeps data safe and correct despite failures, concurrent updates, online processing, etc.
  • 8. DBMS is multi-user
    • Example (withdrawal procedure):
      • Get account balance from database;
      • If balance > amount of withdrawal then balance = balance - amount of withdrawal; dispense cash; store new balance into database;
    • Homer at ATM1 withdraws $100
    • Marge at ATM2 withdraws $50
    • Initial balance = $400, final balance = ?
      • Should be $250 no matter who goes first
  • 9. Final balance = $300
    • Homer withdraws $100:
      • read balance; $400
      • if balance > amount then balance = balance - amount; $300
      • write balance; $300
    • Marge withdraws $50:
      • read balance; $400
      • if balance > amount then balance = balance - amount; $350
      • write balance; $350
    • Both read $400 before either writes; Homer's write lands last and overwrites Marge's
  • 10. Final balance = $350
    • Homer withdraws $100:
      • read balance; $400
      • if balance > amount then balance = balance - amount; $300
      • write balance; $300
    • Marge withdraws $50:
      • read balance; $400
      • if balance > amount then balance = balance - amount; $350
      • write balance; $350
    • Both read $400 before either writes; Marge's write lands last and overwrites Homer's
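The two wrong outcomes above can be reproduced deterministically by replaying the read, compute, and write steps in a chosen order. The sketch below is an illustration, not part of the original slides; the function names `run` and `steps` and the schedule encoding are hypothetical.

```python
AMOUNTS = {"Homer": 100, "Marge": 50}

def run(schedule, initial=400):
    """Replay (customer, step) pairs against a shared balance and return the final balance."""
    db = {"balance": initial}
    local = {}  # each customer's private copy of the balance
    for who, step in schedule:
        if step == "read":
            local[who] = db["balance"]
        elif step == "compute":
            if local[who] > AMOUNTS[who]:
                local[who] -= AMOUNTS[who]
        elif step == "write":
            db["balance"] = local[who]
    return db["balance"]

def steps(who):
    """All three steps of one withdrawal, back to back."""
    return [(who, s) for s in ("read", "compute", "write")]

# No interleaving: one withdrawal finishes before the other starts.
serial = run(steps("Homer") + steps("Marge"))

# Both read $400; Marge writes first, Homer's write overwrites it.
homer_last = run([
    ("Homer", "read"), ("Marge", "read"),
    ("Marge", "compute"), ("Marge", "write"),
    ("Homer", "compute"), ("Homer", "write"),
])

# Both read $400; Homer writes first, Marge's write overwrites it.
marge_last = run([
    ("Marge", "read"), ("Homer", "read"),
    ("Homer", "compute"), ("Homer", "write"),
    ("Marge", "compute"), ("Marge", "write"),
])
```

Here `serial` is 250, `homer_last` is 300, and `marge_last` is 350: in both interleaved schedules one withdrawal is silently lost because each customer computed from a stale read.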
  • 11. Concurrency control in DBMS
    • Similar to concurrent programming problems
      • But data is not all in main-memory
    • Appears similar to file system concurrent access?
      • Approach taken by MySQL initially; now MySQL offers better alternatives
    • But want to control at much finer granularity
        • Or else one withdrawal would lock up all accounts!
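The point about granularity can be illustrated with one lock per account instead of one lock for the whole database. This is a minimal in-memory sketch using Python threads; a real DBMS relies on a lock manager and data that is not all in main memory, so treat this only as an analogy. The `Bank` class and the account names `simpsons` and `flanders` are hypothetical.

```python
import threading

class Bank:
    def __init__(self, balances):
        self.balances = dict(balances)
        # One lock per account: a withdrawal blocks only other
        # operations on the same account, not the whole bank.
        self.locks = {acct: threading.Lock() for acct in balances}

    def withdraw(self, acct, amount):
        with self.locks[acct]:
            # Read, check, and write happen atomically for this account.
            balance = self.balances[acct]
            if balance > amount:
                self.balances[acct] = balance - amount
                return True
            return False

bank = Bank({"simpsons": 400, "flanders": 900})
threads = [
    threading.Thread(target=bank.withdraw, args=("simpsons", 100)),  # Homer at ATM1
    threading.Thread(target=bank.withdraw, args=("simpsons", 50)),   # Marge at ATM2
    threading.Thread(target=bank.withdraw, args=("flanders", 200)),  # unrelated account
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever of Homer or Marge acquires the lock first, the shared account ends at $250, and the withdrawal on the other account never waits for them.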
  • 12. Recovery in DBMS
    • Example: balance transfer
      • decrement the balance of account X by $100;
      • increment the balance of account Y by $100;
    • Scenario 1: Power goes out after the first instruction
    • Scenario 2: DBMS buffers and updates data in memory (for efficiency); before they are written back to disk, power goes out
    • Log updates; undo / redo during recovery
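A rough sketch of "log updates; undo / redo during recovery" for the two scenarios above. It assumes each log record stores both the old and the new value, and that log records reach disk before the data they describe (write-ahead logging); the slide does not spell out either detail. The record layout, the `recover` function, and the starting balances of $500 and $200 are hypothetical, and the sketch assumes concurrent transactions never touch the same item.

```python
def recover(disk, log):
    """Bring the on-disk state back to a consistent point using the log."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Undo pass (newest first): roll back updates of transactions
    # that never committed.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, key, old, _new = rec
            disk[key] = old

    # Redo pass (oldest first): reapply updates of committed
    # transactions, in case they never reached disk.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, key, _old, new = rec
            disk[key] = new
    return disk

# Scenario 1: power fails after the decrement of X reached disk,
# before the increment of Y and before commit.
log1 = [("update", "T1", "X", 500, 400)]
scenario1 = recover({"X": 400, "Y": 200}, log1)

# Scenario 2: the transfer committed, but the updated values were
# still buffered in memory and never written to disk.
log2 = [
    ("update", "T1", "X", 500, 400),
    ("update", "T1", "Y", 200, 300),
    ("commit", "T1"),
]
scenario2 = recover({"X": 500, "Y": 200}, log2)
```

In scenario 1 the half-finished transfer is undone, leaving X at 500 and Y at 200; in scenario 2 the committed transfer is redone, leaving X at 400 and Y at 300. Either way the $100 is neither lost nor duplicated.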
  • 13. DataBase Management System (DBMS)
    • Diagram: High-level Query Q → DBMS (over Data) → Answer
    • Translates Q into best execution plan for current conditions, runs plan
    • Keeps data safe and correct despite failures, concurrent updates, online processing, etc.
  • 14. Summary of modern DBMS features
    • Persistent storage of data
    • Logical data model; declarative queries and updates → physical data independence
    • Multi-user concurrent access
    • Safety from system failures
    • Performance, performance, performance
      • Massive amounts of data (terabytes to petabytes)
      • High throughput (thousands to millions of transactions per minute)
      • High availability (≥ 99.999% uptime)
  • 15. Modern DBMS Architecture
    • Layers: Applications, DBMS, OS, Disk(s)
    • Applications talk to the DBMS in SQL
    • Inside the DBMS: Parser → logical query plan → Query Optimizer → physical query plan → Query Executor → access method API calls → Storage Manager → storage system API calls
    • OS: file system API calls
  • 16. Course Outline
    • 50% of the class is about core DBMS concepts
      • Query execution, query optimization, transactions, recovery, etc.
      • Textbook material
    • 50% of the class is on “what is happening today in data management”
      • Data streams
      • Web search – Google, Yahoo!
      • XML and data integration
      • Data mining
      • Sensor data management
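  • 17. Using a Traditional DBMS
    • Diagram: a Loader fills Table R and Table S; the User/Application issues a Query and gets a Result, then another Query and Result, and so on
  • 18. New Approach for Data Streams
    • Diagram: the User/Application registers a Continuous Query (Standing Query) with the Stream Query Processor; Input streams flow in and the Result flows back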
  • 19. Example Continuous (Standing) Queries
    • Web
      • Amazon’s best sellers over last hour
    • Network Intrusion Detection
      • Track HTTP packets with destination address matching a prefix in given table and content matching “*.ida”
    • Finance
      • Monitor NASDAQ stocks between $20 and $200 that have moved down more than 2% in the last 20 minutes
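The finance example can be sketched as a standing query that is registered once and then evaluated on every arriving tick, keeping only a sliding window of recent prices. This is an illustration, not a real stream engine. "Moved down more than 2% in the last 20 minutes" is interpreted here as a drop relative to the oldest price still inside the window, which is an assumption; the `DropMonitor` class, the symbols `ACME` and `TINY`, and the tick values are hypothetical.

```python
from collections import defaultdict, deque

class DropMonitor:
    """Standing query: price in [low, high] and down more than `drop` within the window."""

    def __init__(self, window_seconds=20 * 60, low=20.0, high=200.0, drop=0.02):
        self.window_seconds = window_seconds
        self.low = low
        self.high = high
        self.drop = drop
        self.history = defaultdict(deque)  # symbol -> deque of (timestamp, price)

    def on_tick(self, symbol, timestamp, price):
        """Process one stream element; return True if the symbol matches the query now."""
        window = self.history[symbol]
        window.append((timestamp, price))
        # Evict ticks that have fallen out of the 20-minute window.
        while window[0][0] < timestamp - self.window_seconds:
            window.popleft()
        reference = window[0][1]  # oldest price still in the window
        in_range = self.low <= price <= self.high
        dropped = price < reference * (1 - self.drop)
        return in_range and dropped

monitor = DropMonitor()
ticks = [
    ("ACME", 0, 100.0),
    ("TINY", 0, 10.0),
    ("ACME", 600, 99.0),     # down 1%: no match
    ("TINY", 60, 9.0),       # down 10% but below $20: no match
    ("ACME", 1100, 97.5),    # down 2.5% within 20 minutes: match
    ("ACME", 2400, 97.0),    # earlier ticks have expired: no match
]
alerts = [(sym, ts) for sym, ts, price in ticks if monitor.on_tick(sym, ts, price)]
```

Unlike the traditional load-then-query model, the data is never stored in full: the query stays resident and each tick is examined once as it streams past.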
  • 20. Course Outline
    • 50% of the class is about core DBMS concepts
      • Query execution, query optimization, transactions, recovery, etc.
      • Textbook material
    • 50% of the class is on “what is happening today in data management”
      • Data streams
      • Web search – Google, Yahoo!
      • XML and data integration
      • Data mining
      • Sensor data management
  • 21. New Challenges in DBMSs
    • Diagram: High-level Query Q → DBMS (over Data) → Answer
    • TeraBytes → PetaBytes
    • <CD> <TITLE>Empire B.</TITLE> <ARTIST>Bob Dylan</ARTIST> <COUNTRY>USA</COUNTRY> <COMPANY>Columbia</COMPANY> <PRICE>10.90</PRICE> </CD>
  • 22. Course Logistics
    • Recommended reference: Database Systems: The Complete Book, by H. Garcia-Molina, J. D. Ullman, and J. Widom
    • Web site: http://www.cs.duke.edu/education/courses/fall06/cps216
    • Grading:
      • Homework Assignments 15%
      • Project 25%
      • Midterm 25%
      • Final 35%
  • 23. Summary: Data Management is Important
    • Core aspect of most sciences and engineering today
    • Core need in industry
    • Cool mix of theory and systems
    • Chances are you will find something interesting even if your primary interest is elsewhere
