Database Management Systems
    Prof. Weining Zhang
    Dept. of Computer Science
    University of Texas at San Antonio

Overview
    We study the internals of DBMSs
        Principles of relational DBMSs
            • Emphasis on query & transaction processing techniques
        Advanced database systems & applications
            • OODBMS, XML databases, data warehousing, OLAP, data mining
    Course work includes
        Homework, 2 midterm exams, no final exam
        Programming assignments in Java


                                                                       W. Zhang                   Introduction                       2




Teaching Staff
    Instructor: Prof. Weining Zhang
        Office: SB 4.01.19
        Phone: 458-5557
        Email: wzhang@cs.utsa.edu
        Office hours: MW 5:00 – 6:00 pm, T 4:00 – 5:00 pm, and by appointment

Communication
    Web page: http://www.cs.utsa.edu/~wzhang/cs5443/home
        Contains everything about the course: syllabus, announcements, assignments, project, lecture notes, etc.
        You should check the course web pages regularly.
    Mailing list: 5443@cs.utsa.edu
        Include your CS email address; you may need to forward emails to your regular email address








Textbooks
    Required textbook:
        Database Management Systems, 3rd ed., by Ramakrishnan & Gehrke
    Recommended textbooks:
        Principles of Distributed Database Systems, by M. Ozsu & P. Valduriez
        Database Systems: The Complete Book, by Garcia-Molina, Ullman & Widom
        Database System Concepts, 5th ed., by Silberschatz, Korth & Sudarshan

Other Textbooks
    Fundamentals of Database Systems, 5th ed., by Elmasri & Navathe
    Other database books in the Main Library


Prerequisites
    CS3743 or equivalent, or extensive experience with databases & DB applications
    Strong Java programming skills
    Data structures, algorithms, OO programming, etc.
    Mathematics including logic, sets, algebra, …

Grading
    Programming assignments 20%
    Homework 20%
    Midterm I 25%
    Midterm II 25%
    Intangibles 10%








Programming Assignments
    Implement several components of a simple DBMS called Minibase (Java version), such as
        Buffer Manager
        Heap File
        Hash-based Index
        Relational operators
        Query processing
    Work in groups of 2
    Programming in Java, on Linux or Windows; the Eclipse IDE is recommended

Introduction to Database Systems
    A database system consists of
        Database management system: the software
        Databases: the data
    A DBMS needs to provide
        persistent data storage
        declarative query language for efficient data retrieval
        shared access to data by different applications
        data security
        data integrity …





An RDBMS Architecture
    [Figure: layered RDBMS architecture. Web forms, application front ends, and the SQL interface send SQL commands to the query evaluation engine (Parser, Optimizer, Plan Executor, Operator Evaluator). Below it sit File & Access Methods, then the Buffer Manager & Disk Manager. Concurrency control (Transaction Manager, Lock Manager) and the Recovery Manager work alongside these layers inside the DBMS. On disk: index files, data files, and the system catalog.]

Storage Management
    Data is stored on disks and processed in main memory
    Since disk I/Os are costly, search structures, such as indexes, must be used to achieve efficient data access
    DBMS components that manage different types of storage include
        Disk Manager: manages pages on the disk drive
        Buffer Manager: manages pages in the main-memory buffer
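To make the buffer manager's role concrete, here is a minimal sketch of a buffer pool that keeps a fixed number of pages in memory and goes to "disk" only on a miss. Everything in it is an assumption for illustration: the class name BufferPoolSketch, the String page contents, the stubbed disk read, and the LRU replacement policy. It is not Minibase's interface, and a real buffer manager would also track pin counts and dirty pages.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a tiny buffer pool with LRU replacement.
// A "page" is just a String; the "disk" is a stub that counts reads.
class BufferPoolSketch {
    private final int capacity;   // number of frames available in main memory
    private int diskReads = 0;    // simulated disk I/O counter
    private final LinkedHashMap<Integer, String> frames;

    BufferPoolSketch(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps entries ordered from least to most recently used
        this.frames = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                // Evict the least recently used page once the pool is over capacity
                return size() > BufferPoolSketch.this.capacity;
            }
        };
    }

    // Stand-in for the disk manager: every call counts as one disk I/O.
    private String readFromDisk(int pageId) {
        diskReads++;
        return "contents of page " + pageId;
    }

    // Return the requested page, reading from "disk" only on a buffer miss.
    String getPage(int pageId) {
        String page = frames.get(pageId);   // a hit also refreshes the page's recency
        if (page == null) {
            page = readFromDisk(pageId);
            frames.put(pageId, page);       // may trigger eviction of the LRU page
        }
        return page;
    }

    int getDiskReads() {
        return diskReads;
    }

    public static void main(String[] args) {
        BufferPoolSketch pool = new BufferPoolSketch(2);
        int[] requests = {1, 2, 1, 3, 2};
        for (int p : requests) {
            pool.getPage(p);
        }
        System.out.println("disk reads: " + pool.getDiskReads());
    }
}
```

With two frames and the request sequence 1, 2, 1, 3, 2, the second request for page 1 is served from memory, so five requests cost four simulated disk reads.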
File Organization
    Data records are logically organized in files and physically stored on disk pages
    File organization must consider the format and size of data records
    In addition to simple files of raw data, a DBMS also maintains search structures, such as
        Ordering
        Hashing
        Indexing
    to reduce access costs

Query Processing
    A DBMS evaluates declarative queries by executing an optimal query plan that is expressed using relational algebraic operations.
    A DBMS must evaluate algebraic operations efficiently.
    The algorithms and the costs of relational algebraic operations, such as selection and join, depend critically on
        types of query conditions
        specifics of file organizations
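As a rough illustration of how a search structure reduces access costs, the sketch below stores integer keys on fixed-size pages and counts page accesses for a lookup by scanning versus a lookup through a hash index. The class FileOrgSketch, the page capacity of 4, and the in-memory HashMap standing in for the index are all assumptions; the index lookup itself is treated as free, which a real system would not get.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an unordered file of pages plus a hash index on the key.
// Keys are assumed to be distinct.
class FileOrgSketch {
    static final int RECORDS_PER_PAGE = 4;                      // assumed page capacity
    final List<int[]> pages = new ArrayList<>();                // the unordered (heap) file
    final Map<Integer, Integer> hashIndex = new HashMap<>();    // key -> page number
    int pageAccesses = 0;                                       // simulated page I/O counter

    FileOrgSketch(int[] keys) {
        for (int start = 0; start < keys.length; start += RECORDS_PER_PAGE) {
            int end = Math.min(start + RECORDS_PER_PAGE, keys.length);
            int[] page = Arrays.copyOfRange(keys, start, end);
            for (int k : page) {
                hashIndex.put(k, pages.size());                 // remember which page holds k
            }
            pages.add(page);
        }
    }

    // Reading a page costs one access, whether or not the key is on it.
    private boolean pageContains(int pageNo, int key) {
        pageAccesses++;
        for (int k : pages.get(pageNo)) {
            if (k == key) return true;
        }
        return false;
    }

    // No search structure: read pages in order until the key is found.
    boolean findByScan(int key) {
        for (int p = 0; p < pages.size(); p++) {
            if (pageContains(p, key)) return true;
        }
        return false;
    }

    // Hash index: go directly to the one page that can hold the key.
    boolean findByIndex(int key) {
        Integer pageNo = hashIndex.get(key);
        return pageNo != null && pageContains(pageNo, key);
    }

    public static void main(String[] args) {
        int[] keys = new int[20];
        for (int i = 0; i < keys.length; i++) keys[i] = i;
        FileOrgSketch file = new FileOrgSketch(keys);

        file.findByScan(17);
        System.out.println("scan page accesses:  " + file.pageAccesses);
        file.pageAccesses = 0;
        file.findByIndex(17);
        System.out.println("index page accesses: " + file.pageAccesses);
    }
}
```

For 20 keys on 5 pages, finding key 17 by scanning touches all 5 pages, while the indexed lookup touches 1.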

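The point that a join's algorithm depends on the type of query condition can be sketched with two implementations of the same equi-join. Nested loops works for any join condition but compares every pair of tuples; a hash join needs an equality condition but probes a hash table instead. The class JoinSketch and the int[]{key, payload} tuple layout are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: two ways to evaluate R JOIN S ON R.key = S.key.
// Each tuple is int[]{key, payload}; each result is int[]{key, rPayload, sPayload}.
class JoinSketch {

    // Nested loops: compares every R tuple with every S tuple (|R| * |S| comparisons).
    static List<int[]> nestedLoopsJoin(int[][] r, int[][] s) {
        List<int[]> out = new ArrayList<>();
        for (int[] tr : r) {
            for (int[] ts : s) {
                if (tr[0] == ts[0]) {
                    out.add(new int[]{tr[0], tr[1], ts[1]});
                }
            }
        }
        return out;
    }

    // Hash join: build a hash table on R, then probe it once per S tuple.
    // Only applicable because the join condition is an equality.
    static List<int[]> hashJoin(int[][] r, int[][] s) {
        Map<Integer, List<int[]>> table = new HashMap<>();
        for (int[] tr : r) {
            table.computeIfAbsent(tr[0], k -> new ArrayList<>()).add(tr);
        }
        List<int[]> out = new ArrayList<>();
        for (int[] ts : s) {
            List<int[]> matches = table.get(ts[0]);
            if (matches == null) continue;      // no R tuple with this key
            for (int[] tr : matches) {
                out.add(new int[]{tr[0], tr[1], ts[1]});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] r = {{1, 10}, {2, 20}, {3, 30}};
        int[][] s = {{2, 200}, {3, 300}, {3, 301}, {4, 400}};
        System.out.println("nested loops results: " + nestedLoopsJoin(r, s).size());
        System.out.println("hash join results:    " + hashJoin(r, s).size());
    }
}
```

Both methods produce the same set of result tuples, though possibly in a different order: nested loops follows the order of R, the hash join follows the order of S.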




Query Optimization
    For ease of use, query languages are declarative.
    The system must figure out an efficient evaluation plan
    The goal is to answer a query with as few disk I/Os as possible
    The system uses statistics of the data & heuristics to decide how to process the query

Transaction Processing
    A transaction models the execution of a database application, which typically updates the data in databases.
    Transaction management must deal with concurrent transactions and possible system failures.
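A small sketch of how statistics can drive the choice of an evaluation plan for a selection query: estimate the disk I/Os of a full file scan and of an index lookup, then pick the cheaper one. The cost formulas are deliberately simplified assumptions (one I/O per page for a scan; one I/O per matching record for an unclustered index, ignoring the cost of traversing the index), and the names PlanChoiceSketch, scanCost, indexCost, and choosePlan are hypothetical.

```java
// Hypothetical sketch: pick the cheaper of two plans for a selection query
// using simple statistics (file size in pages, record count, selectivity).
class PlanChoiceSketch {

    // Assumed cost of a full file scan: one I/O per page.
    static long scanCost(long numPages) {
        return numPages;
    }

    // Assumed cost via an unclustered index: one page fetch per matching record.
    // The cost of searching the index itself is ignored in this sketch.
    static long indexCost(long numRecords, double selectivity) {
        return Math.round(numRecords * selectivity);
    }

    // Returns "index" when the index plan is estimated to be strictly cheaper.
    static String choosePlan(long numPages, long numRecords, double selectivity) {
        return indexCost(numRecords, selectivity) < scanCost(numPages) ? "index" : "scan";
    }

    public static void main(String[] args) {
        long pages = 1_000;
        long records = 100_000;
        // A very selective condition favors the index; a broad one favors the scan.
        System.out.println("selectivity 0.001 -> " + choosePlan(pages, records, 0.001));
        System.out.println("selectivity 0.1   -> " + choosePlan(pages, records, 0.1));
    }
}
```

Under these assumptions, a condition matching about 100 of 100,000 records favors the index, while one matching about 10,000 records costs more through the index than scanning all 1,000 pages.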








Recovery
    The recovery manager protects data integrity in case of a system crash.
    The system guarantees that either all operations of a transaction or none of them are performed, and that updates made by completed transactions are persistent.

Concurrency Control
    Concurrent execution of application programs is essential for good DBMS performance.
        Need to keep the CPU busy while performing I/O operations (frequent & relatively slow).
    Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed.
    The concurrency control subsystem ensures such problems don't arise: users can pretend they are using a single-user system.
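The all-or-none guarantee described on the Recovery slide can be sketched with an undo log: before each update, record the old value, and on abort replay the log backwards. This is only an in-memory illustration under stated assumptions. The class UndoLogSketch and its methods are hypothetical, it handles one transaction at a time, and it does not address persistence of committed updates, which requires writing the log to stable storage.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: atomicity for a single in-memory transaction via an undo log.
class UndoLogSketch {
    final Map<String, Integer> data = new HashMap<>();

    // Undo records for the active transaction, newest first.
    // Each record is {key, oldValue}; oldValue is null if the key did not exist.
    private final Deque<Object[]> undoLog = new ArrayDeque<>();

    // Record the old value before overwriting it.
    void write(String key, int newValue) {
        undoLog.push(new Object[]{key, data.get(key)});
        data.put(key, newValue);
    }

    // Commit: the updates stay, so the undo records are no longer needed.
    void commit() {
        undoLog.clear();
    }

    // Abort: restore old values in reverse order of the updates.
    void abort() {
        while (!undoLog.isEmpty()) {
            Object[] record = undoLog.pop();
            String key = (String) record[0];
            if (record[1] == null) {
                data.remove(key);                   // key was created by this transaction
            } else {
                data.put(key, (Integer) record[1]); // put back the previous value
            }
        }
    }

    public static void main(String[] args) {
        UndoLogSketch db = new UndoLogSketch();
        db.write("A", 100);
        db.write("B", 50);
        db.commit();

        // A transfer that fails halfway: neither update should survive.
        db.write("A", 70);
        db.write("B", 80);
        db.abort();
        System.out.println("A=" + db.data.get("A") + " B=" + db.data.get("B"));
    }
}
```

After the aborted transfer, both accounts hold their committed values again (A=100, B=50), which is the "none of them are performed" half of the guarantee.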

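The interleaving problem from the Concurrency Control slide can be illustrated with a shared account balance updated by several threads. In the sketch below, Java's synchronized keyword stands in for a lock held during each read-modify-write, so concurrent deposits behave as if they ran one after another. This is an analogy only: the class AccountSketch is hypothetical, and a DBMS lock manager works at the level of transactions and database objects, not Java monitors.

```java
// Hypothetical sketch: serializing concurrent updates to one balance with a lock.
class AccountSketch {
    private long balance = 0;

    // synchronized makes the read-modify-write below atomic with respect to
    // other synchronized methods on the same account.
    synchronized void deposit(long amount) {
        balance = balance + amount;
    }

    synchronized long getBalance() {
        return balance;
    }

    // Run several threads that each make many deposits of 1, then report the balance.
    static long runConcurrentDeposits(int threads, int depositsPerThread) {
        AccountSketch account = new AccountSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < depositsPerThread; j++) {
                    account.deposit(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();   // wait for every worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println("final balance: " + runConcurrentDeposits(4, 10_000));
    }
}
```

Because every deposit holds the lock, no update is lost and the final balance equals the total number of deposits. Removing synchronized from deposit would allow interleavings in which two threads read the same old balance and one update overwrites the other.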
Advanced Hashing & Indexing
    Relational DBMSs support hashing & B+ tree indexing
    New DBMSs & DB applications need more sophisticated search structures
        Hashing with variable-size hash tables or multiple keys
        Indexes for spatial, multidimensional data (common in multimedia DBSs, data warehousing, OLAP, …)

Distributed DBMS
    Modern corporations have data, control, & applications distributed globally
    Multiple databases at geographically dispersed locations need to cooperate to answer queries with distributed data
    Concurrent transaction processing and recovery are still major issues








Parallel DBMS
    Both centralized and distributed databases may use multiple processors to evaluate queries
    Parallel system architecture requires new algorithms for query evaluation and optimization
    Performance concerns include
        Ability to scale up
        Ability to speed up

XML & Semistructured DBMS
    Data in RDB, OODB, & ORDB are structured (with rigid schemas)
    Data on the Web (and other applications) are semistructured
        HTML, XML, Text, …
    Need new concepts and techniques
        Data model, query language
        Query processing & optimization
        Storage management
        Update, transaction processing, CC, …






Data Warehousing & OLAP
    Corporations need to put all available data to use when making vital business decisions
    Need technology to integrate data from all sources and keep it up to date
    Need advanced tools to analyze, summarize, and view data in various ways
    Issues:
        Data cube model
        OLAP operations
        Query processing, indexing, views, …

Data Mining
    Data contains important patterns useful for making sound business decisions
    Databases need tools to discover knowledge embedded in data
        Associations
        Clusters
        Classifications
    Useful for business trend analysis, fraud detection, diagnosis, market prediction, …
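As one concrete reading of "associations" on the Data Mining slide, the sketch below computes the two standard measures of an association rule over market-basket transactions: support (the fraction of transactions containing all the items) and confidence (of the transactions containing the left-hand side, the fraction that also contain the right-hand side). The slides do not define these measures; the class AssociationSketch, its method names, and the sample data are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: support and confidence of an association rule lhs -> rhs.
class AssociationSketch {

    // Convenience helper to build an itemset.
    static Set<String> itemset(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Fraction of transactions that contain every item in 'items'.
    static double support(List<Set<String>> transactions, Set<String> items) {
        int count = 0;
        for (Set<String> t : transactions) {
            if (t.containsAll(items)) count++;
        }
        return (double) count / transactions.size();
    }

    // support(lhs and rhs together) / support(lhs).
    // The result is NaN if no transaction contains lhs.
    static double confidence(List<Set<String>> transactions, Set<String> lhs, Set<String> rhs) {
        Set<String> both = new HashSet<>(lhs);
        both.addAll(rhs);
        return support(transactions, both) / support(transactions, lhs);
    }

    public static void main(String[] args) {
        List<Set<String>> transactions = Arrays.asList(
                itemset("bread", "milk"),
                itemset("bread", "butter"),
                itemset("bread", "milk", "butter"),
                itemset("milk"));
        System.out.println("support({bread, milk})    = "
                + support(transactions, itemset("bread", "milk")));
        System.out.println("confidence(bread -> milk) = "
                + confidence(transactions, itemset("bread"), itemset("milk")));
    }
}
```

In the sample data, bread and milk appear together in 2 of 4 transactions (support 0.5), and 2 of the 3 transactions containing bread also contain milk (confidence about 0.67).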

Topics
    Relational algebra and calculus
    Storage & File Management
        Disk manager, buffer manager
        Indexing, hashing
    Query Evaluation & Optimization
        Access methods, selection, joins, etc.
        Query optimization methods
    Transaction Processing
        Crash Recovery
        Concurrency Control

Topics (cont.)
    Distributed Database Systems
        Database design
        Query processing & optimization
        Concurrency control & recovery
    Parallel Database Systems
    XML databases
    Data Warehousing and OLAP
    Data Mining, …


01 intro

  • 1. Overview We study the internal of DBMSs Principles of relational DBMSs Database Management Systems • Emphasize on query & transaction processing techniques Advanced database systems & applications • OODBMS, XML database, data warehousing, OLAP, data mining Prof. Weining Zhang Course work includes Dept. of Computer Science Homework, 2 midterm exams, no final exam University of Texas at San Antonio Programming assignments in Java W. Zhang Introduction 2 Teaching Staff Communication Instructor: Prof. Weining Zhang Web page: Office: SB 4.01.19 http://www.cs.utsa.edu/~wzhang/cs5443/home Contains everything about the course: syllabus, Phone: 458-5557 announcement, assignments, project, lecture notes, etc. Email: wzhang@cs.utsa.edu Office hour: MW 5:00 – 6:00 pm You should check course web pages regularly. T 4:00-5:00 pm Mailing list: 5443@cs.utsa.edu and by appointment Include your CS email address; you may need to forward emails to your regular email address W. Zhang Introduction 3 W. Zhang Introduction 4 Textbooks Other Textbooks Required textbook: Fundamentals of Database Systems, 5th ed., by Database Management systems, 3rd ed., by Elmasri & Navathe Ramakrishnan & Gehrke Other database books in the Main Library Recommended textbook: Principles of Distributed Database Systems, by M. Ozsu & P. Valduriez Database System: the Complete Book, by Garcia- Molina, Ullman & Widom Database system concepts, 5th ed., by Silberschatz, Korth & Sudarshan W. Zhang Introduction 5 W. Zhang Introduction 6
  • 2. Prerequisite Grading CS3743 or equivalent, or extensive experience with Programming assignments 20% database & DB application Homework 20% Strong Java programming skills Midterm I 25% Data structures, algorithms, OO programming, etc. Midterm II 25% Mathematics including logic, sets, algebra, … Intangibles 10% W. Zhang Introduction 7 W. Zhang Introduction 8 Programming Assignments Introduction to Database Systems Implement several components of a simple DBMS A database system consists of called Minibase (Java version), such as, Database management system: the software Buffer Manager Databases: the data Heap File A DBMS needs to provide Hash-based Index persistent data storage Relational operators declarative query language for efficient data retrieval Query processing shared access to data by different applications Work in groups of 2 data security Programming in Java, on Linux or Windows, data integrity … recommend using Eclipse IDE W. Zhang Introduction 9 W. Zhang Introduction 10 An RDBMS Architecture Storage Management Data is stored on disks, and processed in the main Web forms Application front end SQL interface memory SQL Commands Since disk I/Os are costly, search structures, such as, Query indexes, must be used to achieve efficient data access Parser Optimizer Evaluation Concurrency Plan Executor Operator Evaluator Engine DBMS components that manage different types of Control storage include Xction Man File & Access Methods Recovery Disk Manager: manages pages on disk drive Manager Buffer Manager: manages pages in main memory buffer Lock Man Buffer Manager & Disk Manager DBMS Index files Sys. catalog Data files W. Zhang Introduction 11 W. Zhang Introduction 12
  • 3. File Organization. Data records are logically organized in files and physically stored on disk pages. File organization must consider the format and size of data records. In addition to simple files of raw data, the DBMS also maintains search structures, such as ordering, hashing, and indexing, to reduce access costs.
    Query Processing. The DBMS evaluates declarative queries by executing an optimal query plan that is expressed using relational algebraic operations. A DBMS must evaluate algebraic operations efficiently. The algorithms and the costs of relational algebraic operations, such as selection and join, depend critically on the types of query condition and the specifics of file organizations.
    Query Optimization. For ease of use, query languages are declarative. The system must figure out an efficient evaluation plan. The goal is to answer a query with as few disk I/Os as possible. The system uses statistics of the data & heuristics to decide how to process the query.
    Transaction Processing. A transaction models the execution of a database application, which typically updates the data in databases. Transaction management must deal with concurrent transactions and possible system failures.
    Recovery. The recovery manager protects data integrity in case of a system crash. The system guarantees that either all operations of a transaction are performed or none of them are, and that updates made by completed transactions are persistent.
    Concurrency Control. Concurrent execution of application programs is essential for good DBMS performance: the CPU needs to be kept busy while I/O operations (frequent & relatively slow) are performed. Interleaving actions of different user programs can lead to inconsistency, e.g., a check is cleared while the account balance is being computed. The concurrency control subsystem ensures such problems don't arise: users can pretend they are using a single-user system.
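The claim that the cost of a join depends on how it is evaluated can be illustrated with two standard textbook I/O estimates. This is a sketch under stated assumptions, not material from the slides: the class and method names are hypothetical, costs are counted in page reads only, and the cost of writing the result is ignored. Page-oriented nested loops reads the inner relation once per outer page; block nested loops reads it once per block of outer pages, where a block uses all but two of the available buffer pages.

```java
/** Hypothetical I/O cost estimates for two join algorithms, in page reads. */
class JoinCostSketch {
    /** Page-oriented nested loops: scan the outer once, the inner once per outer page. */
    static long pageNestedLoops(long outerPages, long innerPages) {
        return outerPages + outerPages * innerPages;
    }

    /** Block nested loops: scan the inner once per block of (bufferPages - 2) outer pages. */
    static long blockNestedLoops(long outerPages, long innerPages, long bufferPages) {
        long blockSize = bufferPages - 2;   // one frame reserved for inner input, one for output
        if (blockSize < 1) {
            throw new IllegalArgumentException("need at least 3 buffer pages");
        }
        long blocks = (outerPages + blockSize - 1) / blockSize;   // ceiling division
        return outerPages + blocks * innerPages;
    }

    public static void main(String[] args) {
        long outer = 1000, inner = 500, buffers = 102;   // made-up relation and buffer sizes
        System.out.println(pageNestedLoops(outer, inner));            // prints 501000
        System.out.println(blockNestedLoops(outer, inner, buffers));  // prints 6000
    }
}
```

For the same two relations, the second plan needs roughly 1% of the page reads of the first, which is the kind of difference an optimizer tries to detect from statistics before running the query.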
  • 4. Advanced Hashing & Indexing. Relational DBMSs support hashing & B+ tree indexing. New DBMSs & DB applications need more sophisticated search structures: hashing with a variable-size hash table or multiple keys; indexes for spatial, multidimensional data (common in multimedia DBSs, data warehousing, OLAP, …).
    Distributed DBMS. Modern corporations have data, control, & applications distributed globally. Multiple databases at geographically dispersed locations need to cooperate to answer queries with distributed data. Concurrent transaction processing and recovery are still major issues.
    Parallel DBMS. Both centralized and distributed databases may use multiple processors to evaluate queries. Parallel system architecture requires new algorithms for query evaluation and optimization. Performance concerns include the ability to scale up and the ability to speed up.
    XML & Semistructured DBMS. Data in RDB, OODB, & ORDB are structured (with rigid schemas). Data on the Web (and in other applications) are semistructured: HTML, XML, text, … New concepts and techniques are needed: data model, query language; query processing & optimization; storage management; update, transaction processing, CC, …
    Data Warehousing & OLAP. Corporations need to put all available data into use when making vital business decisions. They need technology to integrate data from all sources and keep them up to date, and advanced tools to analyze, summarize, and view data in various ways. Issues: data cube model; OLAP operations; query processing, indexing, views, …
    Data Mining. Data contains important patterns useful for making sound business decisions. Databases need tools to discover knowledge embedded in data: associations, clusters, classifications. Useful for business trend analysis, fraud detection, diagnosis, market prediction, …
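As a small illustration of what an OLAP operation does, the sketch below performs a roll-up: it summarizes detailed sales facts along one dimension by summing a measure. The `Sale` type, its fields, and the sample data are hypothetical and chosen only for this example; the slides do not define a schema.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical roll-up of sales facts from (region, product) detail to region totals. */
class RollUpSketch {
    /** One fact: two dimension values and one measure. */
    static final class Sale {
        final String region;
        final String product;
        final int amount;

        Sale(String region, String product, int amount) {
            this.region = region;
            this.product = product;
            this.amount = amount;
        }
    }

    /** Drops the product dimension and sums the amount per region. */
    static Map<String, Integer> rollUpToRegion(List<Sale> sales) {
        Map<String, Integer> totals = new TreeMap<>();   // sorted keys for stable output
        for (Sale s : sales) {
            totals.merge(s.region, s.amount, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
                new Sale("East", "pen", 10),
                new Sale("East", "ink", 20),
                new Sale("West", "pen", 5));
        System.out.println(rollUpToRegion(sales));   // prints {East=30, West=5}
    }
}
```

A data warehouse would typically precompute or index such summaries rather than rescan the facts for every request.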
  • 5. Topics. Relational algebra and calculus. Storage & File Management: disk manager, buffer manager, indexing, hashing. Query Evaluation & Optimization: access methods, selection, joins, etc.; query optimization methods. Transaction Processing: crash recovery, concurrency control.
    Topics (cont.). Distributed Database Systems: database design; query processing & optimization; concurrency control & recovery. Parallel Database Systems. XML databases. Data Warehousing and OLAP. Data Mining, …