This presentation discusses the following topics:
Concepts
Component Architecture for a DDBMS
Distributed Processing
Parallel DBMS
Advantages of DDBMSs
Disadvantages of DDBMSs
Homogeneous DDBMS
Heterogeneous DDBMS
Open Database Access and Interoperability
Multidatabase System
Functions of a DDBMS
Reference Architecture for DDBMS
Components of a DDBMS
Fragmentation
Transparencies in a DDBMS
Date’s 12 Rules for a DDBMS
This presentation discusses the following topics:
Introduction
Objectives
What is Data Mining?
Data Mining Applications
Data Warehousing
Advantages and disadvantages
Trends and Current Issues
Future Research Possibilities
This presentation discusses the following topics:
What is File Management System?
What is Database Management System?
File system vs Database Management System
Limitations of File Based System
Advantages of Database Management System
DBMS Environment
Examples of Database Applications
Limitation of Database Management System
This presentation discusses the following topics:
Architecture
Database System Components
Generating Reports
Database Components
Types of Data
The DBMS
Design Tools
DBMS Engine
Database Schema
Components of Applications
Database Development Terminology, followed by a Quiz
This presentation discusses the following topics:
Object Oriented Databases
Object Oriented Data Model (OODM)
Characteristics of Object oriented database
Object, Attributes and Identity
Object oriented methodologies
Benefit of object orientation in programming language
Object oriented model vs Entity Relationship model
Advantages of OODB over RDBMS
This presentation discusses the following topics:
Importance of Data Models
Basic Building Blocks
Business Rules
Translating Business Rules into Data Models
Evolution of Data Models
Hierarchical Data Model
Network Data Model
Relational Data Model
Entity Relationship Model
Object Model
Summary
Followed by a Quiz
This presentation discusses the following topics:
What is Recovery?
Database Recovery techniques
System log
Working of Commit and Roll back
Recovery techniques
Backup techniques
This presentation discusses the following topics:
What is XML?
Syntax of XML Document
DTD (Document Type Definition)
XML Schema
XML Query Language
XML Databases
Oracle JDBC
The following topics are discussed in this presentation:
Data and Information
Database
Database Management System
Objectives
Advantages
Components
Architecture
The document discusses different types of data models including logical, physical, and record-based models. It describes key concepts of data models like entities, attributes, relationships and different relationship types. Specific models covered are hierarchical, network, and relational with details on their structure, advantages and disadvantages.
The document discusses advanced database management systems (ADBMS). It provides background on how databases have become essential in modern society and outlines new applications like multimedia databases, geographic information systems, and data warehouses. The document then covers the history of database applications from early hierarchical and network systems to relational databases and object-oriented databases needed for e-commerce. It also discusses how database capabilities have been extended to support new applications involving scientific data, images, videos, data mining, spatial data, and time series data.
DBMS Architectures and Features - Lecture 7 - Introduction to Databases (1007... Beat Signer
The document discusses database management system (DBMS) architectures and components. It describes the main components of a DBMS, including the DML preprocessor, query compiler, DDL compiler, and catalog manager. It then outlines several common DBMS architectures such as teleprocessing, file-server, two-tier client-server, and three-tier client-server architectures. The three-tier architecture separates the presentation, application, and data tiers for increased scalability and flexibility.
The document discusses database management systems (DBMS) and their advantages over traditional file-based data storage. It notes that a DBMS allows for controlled data access and defines, creates, and maintains databases. It then outlines some problems with traditional file systems like data redundancy, isolation, and lack of atomicity and integrity. The document concludes that a DBMS provides benefits like reducing redundancy, enforcing integrity constraints, improving security, flexibility, and data sharing compared to file systems.
The document provides an overview of database systems, including:
1) Database systems store and manage large amounts of related data and provide efficient access to that data. They solve problems with traditional file-based data storage like redundancy, data integrity, and concurrent access.
2) Databases are made up of structured data models like the relational model and object-oriented models. They include languages for defining, manipulating, and querying data.
3) Database management systems provide storage, query processing, transaction management, and an abstraction of the data through multiple levels including physical, logical and view levels.
This document provides an introduction to database concepts including definitions of data, information, and databases. It discusses the data processing cycle and differences between manual and computerized data processing. It also describes database users like system analysts, application programmers, and end users. Additionally, it covers database features such as redundancy control, data integrity, data sharing, and security. It discusses data abstraction, database models including hierarchical, network and relational models, as well as normalization. Other topics include database architecture, physical and logical data independence, and entity-relationship diagrams.
This presentation discusses the following topics:
Physical Database Design Phase
Steps in Physical Database Design
Files and Records
Index Classification
Hash-Based Indexes
Tree Indexes
B+ Tree
Ordered Indexes
Primary Indexes
Clustering Indexes
Secondary Indexes
Indexed Sequential
This document discusses database systems and file-based systems. It defines key terms like data, information, knowledge, wisdom. It provides examples to illustrate database concepts like maintaining inventory and processing credit card transactions. It describes problems with traditional file-based systems when trying to cross-reference or analyze information across files. Finally, it uses a real estate example to demonstrate how a file-based system with separate departmental files worked and the role of data processing staff.
The document provides an overview of database management systems (DBMS). It discusses DBMS applications, why DBMS are used, different users of databases, data models and languages like SQL. It also summarizes key components of a DBMS including data storage, query processing, transaction management and database architecture.
This document provides an overview of object-oriented programming and Java. It defines object-oriented programming as organizing programs around objects and their interfaces rather than functions. The key concepts of OOP discussed include classes, objects, encapsulation, inheritance, polymorphism, and abstraction. It also provides details on the history and characteristics of Java, which it presents as the most popular language for OOP. The document serves as course material for a programming paradigms class focusing on OOP using Java.
Introduction to Database Management Systems Reem Sherif
The document provides an introduction and overview of database management systems (DBMS) including basic concepts, structured query language (SQL), and non-SQL databases. It outlines a course agenda covering these topics over two days, and then delves into explanations of key concepts such as the file-based system approach and its limitations, definitions of database terminology, database users, database system architecture, data models, and entity relationship modeling. Examples are also provided to illustrate database design and entity relationship diagrams.
The document discusses key aspects of the ETL (extraction, transformation, and loading) process used to update data warehouses. It describes the two main strategies for building a data warehouse - the enterprise-wide top-down approach and the bottom-up data mart approach. The document also outlines the major steps in ETL including data extraction, transformation, data staging, data cleansing, data loading, and managing metadata.
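The ETL steps named in this summary can be illustrated with a minimal sketch. It is not taken from the summarized document: the sample data, the `sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; the second row has an invalid amount.
RAW = "id,name,amount\n1, Ana ,10.5\n2,Bo,not-a-number\n3,Cy,7\n"


def extract(text):
    """Extraction: read the source records as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transformation and cleansing: trim names, drop rows with bad amounts."""
    clean = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # cleansing step: skip rows that cannot be parsed
        clean.append((int(record["id"]), record["name"].strip(), amount))
    return clean


def load(conn, rows):
    """Loading: write the cleaned rows into the (stand-in) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
total = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(total)
```

A real pipeline would usually add a staging area and metadata tracking, which the summary also lists; they are left out here to keep the sketch short.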
Introduction To Database Management System cpjcollege
Database Management System (DBMS)
• Collection of interrelated data
• Set of programs to access the data
• DBMS contains information about a particular enterprise
• DBMS provides an environment that is both convenient and efficient to use
1) The document discusses the importance of databases and the internet for libraries and information centers. It defines databases as organized collections of digital data that can be accessed electronically.
2) Examples of common databases mentioned include DELNET, WorldCat, and MEDLINE. The internet is described as a network of networks that allows users worldwide access to communicate, access information, and more.
3) The document outlines many applications of databases and the internet for library management functions and services, including acquisitions, reference services, document delivery, and more. It argues that internet-based library services are cheaper, faster, remotely accessible 24/7, and allow information manipulation and sharing worldwide.
VTU 7TH SEM CSE DATA WAREHOUSING AND DATA MINING SOLVED PAPERS OF DEC2013 JUN... vtunotesbysree
This document contains solved questions and answers from a past data warehousing and data mining exam. It includes questions on operational data stores, extract transform load (ETL) processes, online transaction processing (OLTP) vs online analytical processing (OLAP), data cubes, and data pre-processing approaches. The responses provide detailed explanations and examples for each topic.
Current trends in database management systems include multimedia databases that store various media formats along with descriptive metadata, distributed databases that allow data to be shared across networked sites, document-oriented databases where records can have varying formats and fields rather than fixed tables, and mobile and embedded databases increasingly used in devices and sensors to configure settings and store operational data. These trends reflect demands for managing diverse data types, enabling data sharing, flexible document structures, and data capture in everyday objects.
This document discusses concepts related to distributed database management systems (DDBMS). It defines a distributed database as a logically interrelated collection of shared data distributed over a computer network. A DDBMS manages the distributed database and makes the distribution transparent to users. The document covers distributed database design topics like fragmentation, allocation, and replication of data across multiple sites. It also discusses various types of transparency that a DDBMS provides, such as distribution, transaction, and performance transparency.
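Fragmentation, allocation, and replication can be made concrete with a small sketch. Everything in it is an assumption for illustration: the `customers` rows, the site names, and the helper functions are hypothetical and do not come from the summarized document.

```python
def horizontal_fragment(rows, predicate):
    """Horizontal fragmentation: keep whole rows that satisfy a predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key="id"):
    """Vertical fragmentation: keep a subset of columns plus the key."""
    return [{name: row[name] for name in (key, *columns)} for row in rows]


def rejoin(left, right, key="id"):
    """Rebuild full rows from two vertical fragments by matching on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Hypothetical relation to be distributed.
customers = [
    {"id": 1, "name": "Ana", "city": "Lima", "balance": 120.0},
    {"id": 2, "name": "Bo", "city": "Oslo", "balance": 75.5},
    {"id": 3, "name": "Cy", "city": "Lima", "balance": 10.0},
]

# Horizontal fragments split the rows by city.
lima = horizontal_fragment(customers, lambda r: r["city"] == "Lima")
others = horizontal_fragment(customers, lambda r: r["city"] != "Lima")

# Vertical fragments split the columns and repeat the key in each fragment.
profile = vertical_fragment(customers, ["name", "city"])
accounts = vertical_fragment(customers, ["balance"])

# Allocation maps each fragment to a site; replication stores a copy at
# more than one site (site names are made up).
allocation = {"lima": ["site_a", "site_b"], "others": ["site_b"]}

print(rejoin(profile, accounts) == customers)
```

Repeating the key in every vertical fragment is what allows the original rows to be reconstructed by a join, while the union of the horizontal fragments gives back the full set of rows.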
Distributed database management systems (DDBMS) allow data to be spread across multiple computer sites connected by a network. A DDBMS provides location transparency so users can access data without knowing its physical location. It also coordinates transactions that involve data stored at multiple sites. DDBMS architectures include transaction managers, data managers, and transaction coordinators to process transactions and subtransactions across distributed data.
The document provides an introduction to database principles. It discusses the limitations of file-based data management systems and how database management systems were developed to overcome these limitations. The key components of a database system are described, including the database, database management system, and application programs. Roles in a database environment like database administrators and end users are defined. Advantages of relational database management systems like controlling redundancy and improving data integrity are outlined. Some disadvantages of database systems like complexity and increased hardware costs are also noted.
This document provides an overview of database systems and concepts. It discusses how a database management system (DBMS) stores and manages data, defines various DBMS functions like security management and query languages, and describes different approaches to database development like the systems development life cycle and prototyping. It also explains the three schema architecture including the external, conceptual, and internal schemas and different levels of data abstraction.
Distributed databases allow data to be shared across a computer network while being stored on multiple machines. A distributed database management system (DDBMS) allows for the management of distributed databases and makes the distribution transparent to users. Key concepts in distributed DBMS design include fragmentation, allocation, and replication of data across multiple sites. Transparency, performance, and handling failures and concurrency are important considerations for DDBMS.
Chapter-6 Distribute Database system (3).ppt latigudata
A distributed database management system (DDBMS) governs logically related data distributed across interconnected computer systems. A DDBMS manages a distributed database while making the distribution transparent to users. Distributed databases provide advantages like improved performance through storing data closer to where it is needed, easier expansion, and increased reliability through redundancy. However, distributed databases also introduce challenges around increased complexity, lack of standards, and security concerns.
The document discusses key concepts related to databases including data, information, database management systems (DBMS), database design, and entity relationship modeling. It defines data as raw unorganized facts and information as organized, meaningful data. A database is a collection of organized data that can be easily accessed, managed and updated. Effective database design involves conceptual, logical and physical data modeling to structure data and relationships. The entity relationship model uses entities, attributes, and relationships to graphically represent data structures and relationships.
CP 121_2.pptx about time to be implement flyinimohamed
The document discusses database concepts and architecture. It covers the three levels of data abstraction - physical, logical, and external levels. It also describes the three schema architecture, including the physical, conceptual, and external schemas. This architecture provides data independence and allows mappings between the different levels. The document also discusses different types of database systems such as single-user, multi-user, centralized, distributed, parallel, and client/server databases.
A distributed database management system (DDBMS) governs the storage and processing of logically related data over interconnected computer systems where both data and processing are distributed among several sites. A DDBMS has functions like application interfaces, validation, transformation, query optimization, mapping, security, backup/recovery, concurrency control, and transaction management to ensure data consistency across database fragments. Components of a DDBMS include workstations or remote devices that form the network, network components in each device, communications media to transfer data, transaction processors at each device, and data processors at each site to store and retrieve local data.
Distributed database design refers to the following problem: given a database and its workload, how should the database be split and allocated to sites so as to optimize a certain objective function (e.g., to minimize the resource consumption in processing the query workload)?
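A toy version of this allocation problem can be sketched as follows. The cost model is an assumption made for illustration: each fragment is stored at exactly one site, an access is free when the fragment is local and costs one unit when it is remote, and each site can hold a limited number of fragments. The function name and the sample workload are hypothetical, and the exhaustive search is only practical for very small inputs.

```python
from itertools import product


def best_allocation(access, capacity):
    """Try every fragment-to-site assignment and keep the cheapest one.

    access[site][fragment] is the number of accesses issued at that site.
    capacity[site] is the maximum number of fragments the site can store.
    """
    sites = list(capacity)
    fragments = sorted({f for counts in access.values() for f in counts})
    best_plan, best_cost = None, None
    for choice in product(sites, repeat=len(fragments)):
        # Skip assignments that overfill any site.
        if any(choice.count(site) > capacity[site] for site in sites):
            continue
        plan = dict(zip(fragments, choice))
        # Objective function: total number of remote accesses.
        cost = sum(
            count
            for site, counts in access.items()
            for fragment, count in counts.items()
            if plan[fragment] != site
        )
        if best_cost is None or cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost


# Hypothetical workload: site A mostly reads F1, site B mostly reads F2.
access = {"A": {"F1": 10, "F2": 1}, "B": {"F1": 2, "F2": 8}}
capacity = {"A": 1, "B": 1}
plan, cost = best_allocation(access, capacity)
print(plan, cost)
```

Practical designs replace the brute-force loop with heuristics or integer programming and use richer cost models (transfer volume, update cost for replicas), but the structure of "candidate allocation, constraint check, objective value" stays the same.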
Distributed database management systems Dhani Ahmad
This chapter discusses distributed database management systems (DDBMS). A DDBMS governs storage and processing of logically related data across interconnected computer systems. The chapter covers DDBMS components, levels of data and process distribution, transaction management, and design considerations like data fragmentation, replication, and allocation. Transparency and optimization techniques aim to make the distributed nature transparent to users.
1. The document discusses key concepts related to database systems including the definition of a database, database management systems (DBMS), data models, database classification, data integrity, query optimization, structured query language (SQL), parallel databases, and object-relational mapping (ORM).
2. It provides details on common data models like hierarchical, network, and relational models. It also describes concepts like database architecture, data definition language, data manipulation language, and distributed databases.
3. Control questions are provided at the end to test understanding of database concepts like the difference between a database and data set, components of a database system, and main elements of a database.
This document discusses distributed database management systems (DDBMS). It explains that a DDBMS governs storage and processing of logically related data across interconnected computer systems. The document outlines different levels of data and process distribution, including single-site processing with single-site data (SPSD), multiple-site processing with single-site data (MPSD), and multiple-site processing with multiple-site data (MPMD). The key components of a DDBMS are transaction processors and data processors that allow for distributed querying and management of data across networked sites.
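The three distribution levels differ only in how many sites do the processing and how many sites hold the data, which a small helper can express. The function name is hypothetical, and the fallback label for a single processing site with multiple data sites is an assumption, since the summary does not name that combination.

```python
def distribution_level(processing_sites: int, data_sites: int) -> str:
    """Map site counts to the level names used in the summary above."""
    if processing_sites < 1 or data_sites < 1:
        raise ValueError("at least one processing site and one data site are needed")
    if processing_sites == 1 and data_sites == 1:
        return "SPSD"  # single-site processing, single-site data
    if processing_sites > 1 and data_sites == 1:
        return "MPSD"  # multiple-site processing, single-site data
    if processing_sites > 1 and data_sites > 1:
        return "MPMD"  # multiple-site processing, multiple-site data
    return "not named in the summary"


print(distribution_level(1, 1))
print(distribution_level(4, 1))
print(distribution_level(4, 3))
```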
Attributes are properties or characteristics that describe entities. In the EMPLOYEE entity example, attributes could include:
- Employee ID
- Name
- Date of birth
- Address
- Salary
These attributes describe and provide information about each employee entity instance. Attributes help define and differentiate entity instances from each other.
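As a minimal sketch, the EMPLOYEE entity and its attributes could be written as a Python dataclass. The field types and the sample values are assumptions; the summary only names the attributes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Employee:
    """One EMPLOYEE entity instance; each field is an attribute."""
    employee_id: int       # identifying attribute
    name: str
    date_of_birth: date
    address: str
    salary: float


# Two instances can share a name and are still told apart by employee_id.
first = Employee(1, "Alice", date(1990, 5, 17), "12 Main St", 52000.0)
second = Employee(2, "Alice", date(1985, 1, 3), "9 Oak Ave", 61000.0)
print(first == second)
```

In a relational design the same attributes would become the columns of an EMPLOYEE table, with Employee ID as the primary key.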
The document discusses concepts, functions, architecture, and design of distributed database management systems (DDBMS). It covers topics such as data allocation strategies, distributed relational database design, levels of transparency provided by DDBMSs, and Date's 12 rules for distributed database management. The overall goal of a DDBMS is to manage distributed databases across a computer network while hiding the distribution from users.
The document provides an overview of databases and data modeling. It introduces key concepts such as database management systems (DBMS), database design phases, database actors and workers, and advantages of the DBMS approach over traditional file processing. Some key points covered include the self-describing nature of databases, program-data independence, multiple views of data, transaction processing, roles of database administrators and designers, and how DBMSs provide controlled data redundancy and security. An example university database structure is also presented.
Database Systems: Design, Implementation, and Management OllieShoresna
Database Systems:
Design, Implementation, and
Management
Tenth Edition
Chapter 12
Distributed Database Management
Systems
The Evolution of Distributed Database
Management Systems
• Distributed database management system
(DDBMS)
– Governs storage and processing of logically related
data over interconnected computer systems
– Both data and processing functions are distributed
among several sites
• 1970s - Centralized database required that
corporate data be stored in a single central site
– Usually a mainframe computer
– Data access via dumb terminals
Database Systems, 10th Edition 2
Database Systems, 10th Edition 3
• Wasn’t responsive to need for faster response times
and quick access to information
• Slow process to approve and develop new application
The Evolution of Distributed Database
Management Systems
Database Systems, 10th Edition 4
• Social and technological changes led to change
• Businesses went global; competition was now in
cyberspace not next door
• Customer demands and market needs required Web-
based services
• rapid development of low-cost, smart mobile devices
increased the demand for complex and fast networks to
interconnect them – cloud based services
• Multiple types of data (voice, image, video, music)
which are geographically distributed must be managed
The Evolution of Distributed Database
Management Systems
• As a result, businesses had to react quickly to
remain competitive. This required:
• Rapid ad hoc data access, which became crucial in
the quick-response decision-making
environment
• Distributed data access to support
geographically dispersed business units
The Evolution of Distributed Database
Management Systems
• The following factors strongly influenced the shape of the
response
• Acceptance of the Internet as the platform for data access
and distribution
• The mobile wireless revolution
• Created high demand for data access
• Use of “applications as a service”
• Company data stored on central servers but applications are
deployed “in the cloud”
• Increased focus on mobile BI
• Use of social networks increases need for on-the-spot
decision making
The Evolution of Distributed Database
Management Systems
• The distributed database is especially desirable because
centralized database management is subject to problems such
as:
• Performance degradation as remote locations and distances
increase
• High cost to maintain and operate
• Reliability issues with a single site and need for data
replication
• Scalability problems due to a single location (space, power
consumption, etc)
• Organizational rigidity imposed by the database – might not
be able to support flexibility and agility required by modern
global organizations
The Evolution of Distributed Database
Management Systems
Distributed Processing and Distributed
Data ...
This document discusses distributed database architecture. It describes the advantages as improved processing power through distributed processing across multiple machines, removal of single point of failure, and easier expandability. The disadvantages include increased complexity, costs, security challenges, and difficulties with integrity control and standardization. Different architectures are shared-nothing, shared-disk, and shared-memory. Distributed databases are well-suited for applications involving multiple sites like manufacturing, military systems, and airline reservations.
1. The document discusses distributed database management systems (DDBMS), which store and manage logically related data over physically independent sites connected by a network.
2. A DDBMS has several advantages over a centralized database like improved performance, reliability, and scalability but also greater complexity, security risks, and training costs.
3. The distribution of data and processing in a DDBMS can take different forms from single-site systems to fully distributed systems across multiple sites and servers. Transaction management and concurrency control are important challenges in distributed databases.
A database is a shared collection of related data used to support organizational activities. A database management system (DBMS) is a computerized data system that allows users to perform operations on a database. DBMSs can be classified based on data model (relational, hierarchical, etc.), number of users supported (single or multi-user), and database distribution (centralized, distributed, homogeneous, heterogeneous). Database users include end users, application users, application programmers, sophisticated users, and database administrators.
Distributed databases
1. Department of Information Technology, Database Technologies (ITB4201)
Distributed Databases
Dr. C.V. Suresh Babu
Professor
Department of IT
Hindustan Institute of Science & Technology
Action Plan
• Concepts
• Component Architecture for a DDBMS
• Distributed Processing
• Parallel DBMS
• Advantages of DDBMSs
• Disadvantages of DDBMSs
• Homogeneous DDBMS
• Heterogeneous DDBMS
• Open Database Access and Interoperability
• Multidatabase System
• Functions of a DDBMS
• Reference Architecture for DDBMS
• Components of a DDBMS
• Fragmentation
• Transparencies in a DDBMS
• Date’s 12 Rules for a DDBMS
• Quiz
Concepts
Distributed Database.
A logically interrelated collection of shared data (and a description
of this data), physically distributed over a computer network.
Distributed DBMS.
Software system that permits the management of the distributed
database and makes the distribution transparent to users.
Concepts
• Collection of logically-related shared data.
• Data split into fragments.
• Fragments may be replicated.
• Fragments/replicas allocated to sites.
• Sites linked by a communications network.
• Data at each site is under control of a DBMS.
• DBMSs handle local applications autonomously.
• Each DBMS participates in at least one global application.
Component Architecture for a DDBMS
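[Figure: component architecture for a DDBMS. Two sites are connected by a computer network; each site is shown with DDBMS, DC, and GDD components, and site 2 also has an LDBMS with its local database (DB).]
LDBMS : Local DBMS component
DC : Data communication component
GDD : Global Data Dictionary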
The Ideal Situation
• A single application should be able to operate transparently on
data that is:
– spread across a variety of different DBMS's
– running on a variety of different machines
– supported by a variety of different operating systems
– connected together by a variety of different communication networks
• The distribution can be geographical or local
Workable definition
A distributed database system consists of a collection of sites
connected together via some kind of communications network, in
which:
– each site is a database system site in its own right;
– the sites agree to work together, so that a user at any
site can access data anywhere in the network exactly
as if the data were all stored at the user's own site
• It is a logical union of real databases
• It can be seen as a kind of partnership among individual local
DBMSs
• It differs from remote access and distributed processing systems
• Temporary assumption: strict homogeneity
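The workable definition above says a user at any site can access data anywhere as if it were stored locally. The following is a minimal sketch of that idea, not an implementation of any real DDBMS. The `Site` and `DistributedDatabase` classes, the table names, and the `remote_reads` counter are all hypothetical, and the sketch assumes each table is stored at exactly one site (no fragmentation or replication).

```python
class Site:
    """One database site with its own local tables (hypothetical model)."""

    def __init__(self, name, tables):
        self.name = name
        self.tables = tables  # table name -> list of rows


class DistributedDatabase:
    """Routes reads to the owning site so callers never name a location."""

    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        # Global directory: which site stores each table.
        # Assumption: every table lives at exactly one site.
        self.directory = {
            table: site.name for site in sites for table in site.tables
        }
        self.remote_reads = 0

    def read(self, at_site, table):
        owner = self.directory[table]
        if owner != at_site:
            # The data is fetched from another site, invisibly to the caller.
            self.remote_reads += 1
        return list(self.sites[owner].tables[table])


db = DistributedDatabase([
    Site("site1", {"Branch": [{"branch_no": "B3", "city": "Glasgow"}]}),
    Site("site2", {"Staff": [{"staff_no": "S1", "branch_no": "B3"}]}),
])

rows_local = db.read("site1", "Branch")   # served by the user's own site
rows_remote = db.read("site2", "Branch")  # served by site1 on site2's behalf
```

Both calls use the same interface and return the same rows; only the internal counter reveals that one of them crossed the network.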
Distributed Processing
• A centralized database that can be accessed over a computer
network.
Parallel DBMS
• A DBMS running across multiple processors and disks designed
to execute operations in parallel, whenever possible, to
improve performance.
• Based on premise that single processor systems can no longer
meet requirements for cost-effective scalability, reliability, and
performance.
• Parallel DBMSs link multiple, smaller machines to achieve same
throughput as single, larger machine, with greater scalability
and reliability.
Parallel DBMS
• Main architectures for parallel DBMSs are:
– a: Shared memory.
– b: Shared disk.
– c: Shared nothing.
Advantages of DDBMSs
• Organizational Structure
• Shareability and Local Autonomy
• Improved Availability
• Improved Reliability
• Improved Performance
• Economics
• Modular Growth
Disadvantages of DDBMSs
• Complexity
• Cost
• Security
• Integrity Control More Difficult
• Lack of Standards
• Lack of Experience
• Database Design More Complex
Types of DDBMS
• Homogeneous DDBMS
• Heterogeneous DDBMS
Homogeneous DDBMS
• All sites use same DBMS product.
• Much easier to design and manage.
• Approach provides incremental growth and allows increased
performance.
Heterogeneous DDBMS
• Sites may run different DBMS products, with possibly different
underlying data models.
• Occurs when sites have implemented their own databases and
integration is considered later.
• Translations required to allow for:
– Different hardware.
– Different DBMS products.
– Different hardware and different DBMS products.
• Typical solution is to use gateways.
Open Database Access and Interoperability
• The Open Group formed a Working Group to provide specifications that create a
database infrastructure environment in which there is:
– A common SQL API that allows client applications to be written without needing to
know the vendor of the DBMS they are accessing.
– A common database protocol that enables a DBMS from one vendor to communicate directly with
a DBMS from another vendor without the need for a gateway.
– A common network protocol that allows communications between different DBMSs.
• The most ambitious goal is to find a way to enable a transaction to span DBMSs from
different vendors without the use of a gateway.
Multidatabase System (MDBS)
• DDBMS in which each site maintains complete autonomy.
• DBMS that resides transparently on top of existing database
and file systems and presents a single database to its users.
• Allows users to access and share data without requiring
physical database integration.
• Two types: non-federated MDBS (no local users) and federated MDBS
(FMDBS).
Functions of a DDBMS
• Expect DDBMS to have at least the functionality of a DBMS.
• Also to have following functionality:
– Extended communication services.
– Extended Data Dictionary.
– Distributed query processing.
– Extended concurrency control.
– Extended recovery services.
Reference Architecture for DDBMS
• Because DDBMSs are so diverse, there is no universally accepted architecture
equivalent to the ANSI/SPARC 3-level architecture.
• A reference architecture consists of:
– Set of global external schemas.
– Global conceptual schema (GCS).
– Fragmentation schema and allocation schema.
– Set of schemas for each local DBMS conforming to the 3-level ANSI/SPARC architecture.
• Some levels may be missing, depending on levels of transparency
supported.
Reference Architecture for DDBMS
Reference Architecture for MDBS
• In DDBMS, GCS is union of all local conceptual schemas.
• In FMDBS, GCS is subset of local conceptual schemas (LCS),
consisting of data that each local system agrees to share.
• GCS of tightly coupled system involves integration of either
parts of LCSs or local external schemas.
• FMDBS with no GCS is called loosely coupled.
Reference Architecture for Tightly-Coupled Federated MDBS
Distributed Database Design
• Three key issues:
– Fragmentation.
– Allocation
– Replication
Distributed Database Design
• Fragmentation
– Relation may be divided into a number of sub-relations, which are
then distributed.
• Allocation
– Each fragment is stored at site with "optimal" distribution.
• Replication
– Copy of fragment may be maintained at several sites.
Fragmentation
• Definition and allocation of fragments carried out strategically
to achieve:
– Locality of Reference
– Improved Reliability and Availability
– Improved Performance
– Balanced Storage Capacities and Costs
– Minimal Communication Costs.
• Involves analyzing most important applications, based on
quantitative/qualitative information.
Fragmentation
• Quantitative information may include:
– frequency with which an application is run;
– site from which an application is run;
– performance criteria for transactions and applications.
• Qualitative information may include transactions that are
executed by application, type of access (read or write), and
predicates of read operations.
Data Allocation
• Four alternative strategies regarding placement of data:
– Centralized
– Partitioned (or Fragmented)
– Complete Replication
– Selective Replication
Data Allocation
• Centralized
– Consists of single database and DBMS stored at one site with users
distributed across the network.
• Partitioned
– Database partitioned into disjoint fragments, each fragment assigned
to one site.
Data Allocation
• Complete Replication
– Consists of maintaining complete copy of database at each site.
• Selective Replication
– Combination of partitioning, replication, and centralization.
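The four placement strategies can be told apart by looking at which sites hold a copy of each fragment. The sketch below illustrates that classification under stated assumptions: the function name `classify_allocation`, the fragment names, and the site names are hypothetical, and the allocation map is assumed to be non-empty with every fragment stored at one or more sites.

```python
def classify_allocation(allocation, all_sites):
    """Name the data allocation strategy implied by an allocation map.

    allocation: dict mapping fragment name -> set of sites holding a copy.
    all_sites:  every site in the network.
    Assumption: allocation is non-empty and no fragment has an empty site set.
    """
    all_sites = set(all_sites)
    used_sites = set().union(*allocation.values())

    if len(used_sites) == 1:
        # Everything is stored at a single site.
        return "centralized"
    if all(sites == all_sites for sites in allocation.values()):
        # Every site holds a copy of every fragment.
        return "complete replication"
    if all(len(sites) == 1 for sites in allocation.values()):
        # Disjoint fragments, each assigned to exactly one site.
        return "partitioned"
    # Some fragments are replicated, others are not.
    return "selective replication"


sites = {"s1", "s2", "s3"}
centralized = classify_allocation({"F1": {"s1"}, "F2": {"s1"}}, sites)
partitioned = classify_allocation({"F1": {"s1"}, "F2": {"s2"}}, sites)
complete = classify_allocation({"F1": sites, "F2": sites}, sites)
selective = classify_allocation({"F1": {"s1", "s2"}, "F2": {"s3"}}, sites)
```

The order of the checks matters: a one-site network would otherwise satisfy several conditions at once, so the sketch treats it as centralized.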
Comparison of Strategies for Data Distribution
Why Fragment?
• Usage
– Applications work with views rather than entire relations.
• Efficiency
– Data is stored close to where it is most frequently used.
– Data that is not needed by local applications is not stored.
Why Fragment?
• Parallelism
– With fragments as unit of distribution, transaction can be divided into
several subqueries that operate on fragments.
• Security
– Data not required by local applications is not stored and so not available to
unauthorized users.
• Disadvantages
– Performance
– Integrity.
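The parallelism point above can be illustrated with a small sketch in which a global query is split into one subquery per fragment and the partial results are combined. This is only an illustration: the fragment contents, the site names, and the functions `local_sum_and_count` and `global_average_salary` are hypothetical, and threads stand in for independent sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of a Staff relation, one per site.
fragments = {
    "site1": [{"staff_no": "S1", "salary": 30000},
              {"staff_no": "S3", "salary": 18000}],
    "site2": [{"staff_no": "S2", "salary": 12000}],
}


def local_sum_and_count(rows):
    """Subquery run against a single fragment."""
    return sum(row["salary"] for row in rows), len(rows)


def global_average_salary(fragments):
    """Run one subquery per fragment in parallel, then merge the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_sum_and_count, fragments.values()))
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


average = global_average_salary(fragments)
```

Each subquery returns a sum and a count rather than a local average, because averaging the local averages would give the wrong answer when fragments differ in size.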
Correctness of Fragmentation
• Three correctness rules:
– Completeness
– Reconstruction
– Disjointness.
Correctness of Fragmentation
• Completeness
– If relation R is decomposed into fragments R1, R2, ... Rn, each data
item that can be found in R must appear in at least one fragment.
• Reconstruction
– Must be possible to define a relational operation that will
reconstruct R from the fragments.
– Reconstruction is the Union operation for horizontal fragmentation
and the Join operation for vertical fragmentation.
• Disjointness
– If data item di appears in fragment Ri, then it should not
appear in any other fragment.
– Exception: vertical fragmentation, where primary key
attributes must be repeated to allow reconstruction.
– For horizontal fragmentation, data item is a tuple.
– For vertical fragmentation, data item is an attribute.
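The three correctness rules can be checked mechanically on a small relation. The sketch below uses a hypothetical Staff relation (the tuples and the branch-based split are assumptions for illustration). It checks horizontal fragments with set operations and reconstructs vertical fragments with a join on the repeated primary key.

```python
# Hypothetical Staff relation: each tuple is (staff_no, name, branch).
STAFF = {
    (1, "Ann", "B003"),
    (2, "Bob", "B005"),
    (3, "Cy", "B003"),
}

# Horizontal fragments: subsets of tuples selected by branch.
H1 = {t for t in STAFF if t[2] == "B003"}
H2 = {t for t in STAFF if t[2] != "B003"}

def is_complete(relation, fragments):
    """Completeness: every tuple appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)

def reconstructs(relation, fragments):
    """Reconstruction (horizontal): the union of fragments gives back R."""
    return set().union(*fragments) == relation

def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    seen = set()
    for f in fragments:
        if seen & f:
            return False
        seen |= f
    return True

# Vertical fragments: projections that both repeat the primary key.
V1 = {(no, name) for no, name, _ in STAFF}
V2 = {(no, branch) for no, _, branch in STAFF}

def join_on_key(left, right):
    """Reconstruction (vertical): join two fragments on their first column."""
    return {(k, a, b) for k, a in left for k2, b in right if k == k2}
```

Note that V1 and V2 share the staff_no column, which is the exception to disjointness that the slide describes.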
Types of Fragmentation
• Four types of fragmentation:
– Horizontal
– Vertical
– Mixed
– Derived.
• Other possibility is no fragmentation:
– If relation is small and not updated frequently, may be better not to
fragment relation.
Horizontal and Vertical Fragmentation
Horizontal Fragmentation
• This strategy is determined by looking at predicates used by
transactions.
• Involves finding set of minimal (complete and relevant)
predicates.
• Set of predicates is complete, if and only if, any two tuples in
same fragment are referenced with same probability by any
application.
• Predicate is relevant if there is at least one application that
accesses the resulting fragments differently.
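One textbook way to turn a set of simple predicates into horizontal fragments is to group tuples by the truth value of every predicate (often called minterm predicates). The slide does not name this technique, so treat it as an assumption. The relation, the predicates, and all names below are hypothetical.

```python
from itertools import product

# Hypothetical PropertyForRent tuples: (property_no, type, city).
PROPERTIES = [
    ("PA14", "House", "Aberdeen"),
    ("PL94", "Flat", "London"),
    ("PG4", "Flat", "Glasgow"),
    ("PG36", "Flat", "Glasgow"),
]

# Simple predicates assumed to be used by the applications.
PREDICATES = {
    "is_flat": lambda t: t[1] == "Flat",
    "in_glasgow": lambda t: t[2] == "Glasgow",
}

def horizontal_fragments(relation, predicates):
    """Group tuples by the truth value of every simple predicate."""
    names = list(predicates)
    fragments = {}
    for truth in product([True, False], repeat=len(names)):
        rows = [
            t for t in relation
            if all(predicates[n](t) == v for n, v in zip(names, truth))
        ]
        if rows:  # drop combinations that select no tuples
            fragments[truth] = rows
    return fragments

FRAGS = horizontal_fragments(PROPERTIES, PREDICATES)
```

Because every tuple satisfies exactly one truth combination, the resulting fragments are complete and disjoint by construction.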
Transparencies in a DDBMS
• Distribution Transparency
– Fragmentation Transparency
– Location Transparency
– Replication Transparency
– Local Mapping Transparency
– Naming Transparency
• Transaction Transparency
– Concurrency Transparency
– Failure Transparency
• Performance Transparency
• DBMS Transparency
Distribution Transparency
• Distribution transparency allows user to perceive database as
single, logical entity.
• If DDBMS exhibits distribution transparency, user does not
need to know:
– that data is fragmented (fragmentation transparency),
– location of data items (location transparency).
• If user must know both the fragmentation and the location of
data items, this is called local mapping transparency.
• With replication transparency, user is unaware of replication of
fragments.
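The levels of distribution transparency differ in how much the caller has to name. A rough sketch, assuming a hypothetical global catalog and in-memory "sites" (none of these names come from a real DDBMS):

```python
# Hypothetical local databases: site -> fragment name -> tuples.
SITES = {
    "site1": {"S1": [(1, "Ann", "B003")]},
    "site2": {
        "S2": [(2, "Bob", "B005")],
        "S1": [(1, "Ann", "B003")],  # replica of fragment S1
    },
}

# Hypothetical global catalog: relation -> fragment -> sites holding a copy.
CATALOG = {"Staff": {"S1": ["site1", "site2"], "S2": ["site2"]}}

def read_fragment_at(site, fragment):
    """Local mapping transparency: caller names both fragment and site."""
    return SITES[site][fragment]

def read_fragment(relation, fragment):
    """Location and replication transparency: caller names only the fragment."""
    site = CATALOG[relation][fragment][0]  # any copy will do for a read
    return read_fragment_at(site, fragment)

def read_relation(relation):
    """Fragmentation transparency: caller names only the global relation."""
    rows = []
    for fragment in CATALOG[relation]:
        rows.extend(read_fragment(relation, fragment))
    return rows
```

A call such as `read_relation("Staff")` hides fragments, copies, and sites, while `read_fragment_at("site2", "S1")` exposes all of them.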
Naming Transparency
• Each item in a DDB must have a unique name.
• DDBMS must ensure that no two sites create a database object
with same name.
• One solution is to create central name server. However, this
results in:
– loss of some local autonomy;
– central site may become a bottleneck;
– low availability; if the central site fails, remaining sites cannot create
any new objects.
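The central name server and its availability problem can be sketched in a few lines. The class and exception names are hypothetical, and a real server would sit behind a network call rather than a local object.

```python
class NameServerDown(Exception):
    """Raised when the central site cannot be reached."""

class CentralNameServer:
    def __init__(self):
        self.names = {}        # object name -> site that created it
        self.available = True  # set to False to simulate a failed central site

    def register(self, name, site):
        """Reserve a globally unique name for an object created at a site."""
        if not self.available:
            raise NameServerDown("central site unreachable")
        if name in self.names:
            raise ValueError(f"{name!r} already exists at {self.names[name]}")
        self.names[name] = site
```

Every site must call `register` before creating an object, which is why the central site can become both a bottleneck and a single point of failure.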
Transaction Transparency
• Ensures that all distributed transactions maintain distributed
database’s integrity and consistency.
• Distributed transaction accesses data stored at more than one
location.
• Each transaction is divided into number of sub-transactions,
one for each site that has to be accessed.
• DDBMS must ensure the indivisibility of both the global
transaction and each of its sub-transactions.
Concurrency Transparency
• All transactions must execute independently and be logically
consistent with results obtained if transactions executed one at
a time, in some arbitrary serial order.
• Same fundamental principles as for centralized DBMS.
• DDBMS must ensure both global and local transactions do not
interfere with each other.
• Similarly, DDBMS must ensure consistency of all sub-
transactions of global transaction.
• Replication makes concurrency more complex.
• If a copy of a replicated data item is updated, update must be
propagated to all copies.
• Could propagate changes as part of original transaction,
making it an atomic operation.
• However, if one site holding copy is not reachable, then
transaction is delayed until site is reachable.
• Could limit update propagation to only those sites currently
available. Remaining sites updated when they become
available again.
• Could allow updates to copies to happen asynchronously,
sometime after the original update. Delay in regaining
consistency may range from a few seconds to several hours.
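The option of updating only the currently available copies and catching up the rest later can be sketched as follows. The `Replica` class and the pending-update queue are assumptions for illustration; a real system would also need ordering and conflict handling across concurrent writers.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.reachable = True

def propagate(replicas, value, pending):
    """Update reachable copies now; queue the update for the others."""
    for r in replicas:
        if r.reachable:
            r.value = value
        else:
            pending.setdefault(r.name, []).append(value)

def recover(replica, pending):
    """Apply queued updates, in order, when a site becomes available again."""
    replica.reachable = True
    for value in pending.pop(replica.name, []):
        replica.value = value
```

Between `propagate` and `recover`, the unreachable copy is stale, which is the window of inconsistency the slide describes.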
Failure Transparency
• DDBMS must ensure atomicity and durability of global
transaction.
• Means ensuring that sub-transactions of global transaction
either all commit or all abort.
• Thus, DDBMS must synchronize global transaction to ensure
that all sub-transactions have completed successfully before
recording a final COMMIT for global transaction.
• Must do this in presence of site and network failures.
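One common way to get the "all commit or all abort" behaviour is a two-phase commit: first collect a vote from every site, then record the global decision. The slide does not name a protocol, so this is an assumed illustration. The sketch leaves out timeouts, logging, and recovery, which a real protocol needs in order to cope with site and network failures.

```python
class Participant:
    """Hypothetical site running one sub-transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether the local sub-transaction can commit."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def run_global_transaction(participants):
    """Phase 2: commit everywhere only if every site voted yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"
```

A single "no" vote aborts every sub-transaction, so no site is left committed while another has aborted.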
Performance Transparency
• DDBMS must perform as if it were a centralized DBMS.
– DDBMS should not suffer any performance degradation due to
distributed architecture.
– DDBMS should determine most cost-effective strategy to execute a
request.
• Distributed Query Processor (DQP) maps data request into
ordered sequence of operations on local databases.
• Must consider fragmentation, replication, and allocation
schemas.
• DQP has to decide:
– which fragment to access;
– which copy of a fragment to use;
– which location to use.
• DQP produces execution strategy optimized with respect to
some cost function.
• Typically, costs associated with a distributed request include:
– I/O cost;
– CPU cost;
– communication cost.
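Choosing an execution strategy by cost can be sketched as picking the candidate with the lowest sum of the three cost components. The strategy names and cost figures below are made up for illustration, and a plain unweighted sum is an assumption; real optimizers typically weight each component.

```python
from dataclasses import dataclass

@dataclass
class Strategy:
    name: str
    io_cost: float
    cpu_cost: float
    comm_cost: float

    def total_cost(self):
        """Unweighted sum of I/O, CPU, and communication cost."""
        return self.io_cost + self.cpu_cost + self.comm_cost

def cheapest(strategies):
    """Return the candidate strategy with the lowest total cost."""
    return min(strategies, key=Strategy.total_cost)

# Hypothetical candidates for the same distributed request.
CANDIDATES = [
    Strategy("ship_whole_relation", io_cost=10, cpu_cost=2, comm_cost=100),
    Strategy("ship_needed_fragment", io_cost=15, cpu_cost=5, comm_cost=20),
]
```

With these figures, `cheapest(CANDIDATES)` prefers shipping only the needed fragment, because its communication cost dominates the comparison.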
Date’s 12 Rules for a DDBMS
• 0. Fundamental Principle
– To the user, a distributed system should look exactly like a non-
distributed system.
• 1. Local Autonomy
• 2. No Reliance on a Central Site
• 3. Continuous Operation
• 4. Location Independence
• 5. Fragmentation Independence
• 6. Replication Independence
• 7. Distributed Query Processing
• 8. Distributed Transaction Processing
• 9. Hardware Independence
• 10. Operating System Independence
• 11. Network Independence
• 12. Database Independence
• Last four rules are ideals.
Test Yourself
1. A distributed database has which of the following advantages over a centralized database?
A. Software cost
B. Software complexity
C. Slow Response
D. Modular growth
2. An autonomous homogeneous environment is which of the following?
A. The same DBMS is at each node and each DBMS works independently.
B. The same DBMS is at each node and a central DBMS coordinates database access.
C. A different DBMS is at each node and each DBMS works independently.
D. A different DBMS is at each node and a central DBMS coordinates database access.
3. A transaction manager is which of the following?
A. Maintains a log of transactions
B. Maintains before and after database images
C. Maintains appropriate concurrency control
D. All of the above
4. Location transparency allows for which of the following?
A. Users to treat the data as if it is at one location
B. Programmers to treat the data as if it is at one location
C. Managers to treat the data as if it is at one location
D. All of the above
5. A heterogeneous distributed database is which of the following?
A. The same DBMS is used at each location and data are not distributed across all nodes.
B. The same DBMS is used at each location and data are distributed across all nodes.
C. A different DBMS is used at each location and data are not distributed across all nodes.
D. A different DBMS is used at each location and data are distributed across all nodes.
Answers