Decision Tree Knowledge Discovery Through Neural Networks
Structure of decision trees and neural networks
How they work
Models
working
knowledge discovery
clustering
In this PPT, you will learn:
• The difference between data and information
• What a database is, the various types of databases, and why they are valuable assets for
decision making
• The importance of database design
• How modern databases evolved from file systems
• About flaws in file system data management
• The main components of the database system
• The main functions of a database management system (DBMS)
A database management system (DBMS) consists of an interrelated set of data and programs to access that data. The DBMS provides several levels of abstraction to simplify interaction between users and the stored data. It defines data structures to store information and mechanisms to manipulate the data while ensuring data safety, integrity, and security. The DBMS is controlled by a database administrator and provides advantages like reduced data redundancy, data sharing, and integrity. It uses data models and definition/manipulation languages to define, retrieve, modify, and maintain the stored data.
The document discusses parallel databases and their architectures. It introduces parallel databases as systems that seek to improve performance through parallelizing operations like loading data, building indexes, and evaluating queries using multiple CPUs and disks. It describes three main architectures for parallel databases: shared memory, shared disk, and shared nothing. The shared nothing architecture provides linear scale-up and speed-up but is more difficult to program. The document also discusses measuring performance improvements from parallelization through speed-up and scale-up.
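The two measures named above can be sketched as simple ratios. This is a minimal illustration using the usual textbook definitions; the function names and the timings are hypothetical and do not come from the summarized document.

```python
def speed_up(time_small_system: float, time_large_system: float) -> float:
    """Same workload on both systems: how many times faster the larger one is."""
    return time_small_system / time_large_system


def scale_up(time_small_job_small_system: float,
             time_large_job_large_system: float) -> float:
    """Workload and resources grow by the same factor; 1.0 means linear scale-up."""
    return time_small_job_small_system / time_large_job_large_system


# Hypothetical timings in seconds, invented for illustration only.
print(speed_up(120.0, 32.0))   # 3.75 on 4x the hardware: below linear speed-up
print(scale_up(120.0, 125.0))  # 0.96: close to linear scale-up
```

With N times the resources, linear speed-up would give a ratio of N, and linear scale-up would keep the ratio at 1.0.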
The document discusses deductive databases and how they differ from conventional databases. Deductive databases contain facts and rules that allow implicit facts to be deduced from the stored information. This reduces the amount of storage needed compared to explicitly storing all facts. Deductive databases use logic programming through languages like Datalog to specify rules that define virtual relations. The rules allow new facts to be inferred through an inference engine even if they are not explicitly represented.
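To make the idea of deducing implicit facts concrete, here is a small sketch in Python rather than Datalog. The parent facts, the ancestor rules, and the function name are assumptions chosen for illustration, and the loop is a naive bottom-up evaluation, not how a production inference engine necessarily works.

```python
# Stored facts: (parent, child) pairs. The names are invented for illustration.
parent = {("ann", "bob"), ("bob", "cid"), ("cid", "dee")}


def derive_ancestors(parent_facts):
    """Naive bottom-up evaluation of the two rules
         ancestor(X, Y) :- parent(X, Y).
         ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
    Repeats until no new fact appears (a fixpoint)."""
    ancestor = set(parent_facts)
    while True:
        new_facts = {(x, z)
                     for (x, y) in parent_facts
                     for (y2, z) in ancestor
                     if y == y2} - ancestor
        if not new_facts:
            return ancestor
        ancestor |= new_facts


ancestors = derive_ancestors(parent)
print(("ann", "dee") in ancestors)  # True, although this fact was never stored
print(len(parent), len(ancestors))  # 3 stored facts, 6 facts in the virtual relation
```

Only the three parent facts are stored; the ancestor relation is virtual and is recomputed from the rules on demand.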
Data Models In Database Management System, by Amad Ahmad
This document discusses different types of data models used in database management systems (DBMS), including record-based, relational, network, hierarchical, and entity-relationship (ER) models. It provides an overview of key concepts like data, information, databases, and data models. For each model type, it describes how data is organized and represented. For example, it explains that the relational model organizes data into two-dimensional tables with attributes and tuples, while the hierarchical model structures data in a tree configuration. The ER model views data as entities and relationships between entities.
The document discusses sequential covering algorithms for learning rule sets from data. It describes how sequential covering algorithms work by iteratively learning one rule at a time to cover examples, removing covered examples, and repeating until all examples are covered. It also discusses variations of this approach, including using a general-to-specific beam search to learn each rule and alternatives like the AQ algorithm that learn rules to cover specific target values. Finally, it describes how first-order logic can be used to learn more general rules than propositional logic by representing relationships between attributes.
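The loop described above (learn one rule, remove what it covers, repeat) can be sketched as follows. This is a simplified propositional version under stated assumptions: rules are conjunctions of attribute = value tests, each rule is grown greedily (a beam of width 1) using precision as the score, and the toy data set is invented. It is not the specific algorithm from the summarized document.

```python
def covers(rule, example):
    """A rule is a dict of attribute -> required value (a conjunction of tests)."""
    return all(example[attr] == value for attr, value in rule.items())


def precision(rule, examples):
    """Fraction of the covered examples that are positive."""
    covered = [e for e in examples if covers(rule, e)]
    if not covered:
        return 0.0
    return sum(e["label"] for e in covered) / len(covered)


def learn_one_rule(examples, attributes):
    """Greedy general-to-specific search: start empty, add the best test each step."""
    rule = {}
    while precision(rule, examples) < 1.0:
        candidates = [{**rule, attr: e[attr]}
                      for attr in attributes if attr not in rule
                      for e in examples if e["label"] and covers(rule, e)]
        if not candidates:
            break
        best = max(candidates, key=lambda r: precision(r, examples))
        if precision(best, examples) <= precision(rule, examples):
            break  # no test improves the rule any further
        rule = best
    return rule


def sequential_covering(examples, attributes):
    """Learn rules one at a time until no positive example is left uncovered."""
    rules, remaining = [], list(examples)
    while any(e["label"] for e in remaining):
        rule = learn_one_rule(remaining, attributes)
        rules.append(rule)
        remaining = [e for e in remaining if not covers(rule, e)]
    return rules


# Toy data, invented for illustration.
examples = [
    {"outlook": "sunny", "windy": False, "label": True},
    {"outlook": "sunny", "windy": True, "label": True},
    {"outlook": "rainy", "windy": False, "label": True},
    {"outlook": "rainy", "windy": True, "label": False},
    {"outlook": "overcast", "windy": True, "label": False},
]
rules = sequential_covering(examples, ["outlook", "windy"])
print(rules)  # [{'outlook': 'sunny'}, {'windy': False}]
```

Each learned rule covers at least one remaining positive example, so the outer loop always terminates. An empty rule acts as a default that covers everything still left.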
This document discusses database security issues and threats. It outlines major vulnerabilities like unpatched software, improper configurations, and default passwords. Two major threats are application vulnerabilities and internal employees exploiting systems. The document recommends mitigation strategies like locking default usernames and passwords, enforcing strong password policies, auditing privileges, and following the principle of least privilege. It also provides examples of SQL injection attacks and recommends error handling and use of bind variables as solutions.
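The difference between concatenating user input into SQL and using bind variables can be shown with Python's built-in sqlite3 module. The table, the rows, and the malicious input below are hypothetical; the summarized document's own examples are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "s3cret"), ("bob", "hunter2")])

# Hypothetical attacker-controlled input.
malicious_name = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the SQL text and rewrites the WHERE clause.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious_name + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the input is passed as a bound parameter and treated purely as data.
safe_rows = conn.execute("SELECT name FROM users WHERE name = ?",
                         (malicious_name,)).fetchall()

print(len(unsafe_rows))  # 2: every row is returned
print(len(safe_rows))    # 0: no user has that literal name
```

Bound parameters keep the query structure fixed no matter what the input contains, which is why they are the recommended fix rather than manual escaping.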
1. The document provides an overview of key concepts in data science and machine learning including the data science process, types of data, machine learning techniques, and Python tools used for machine learning.
2. It describes the typical six-step data science process: setting goals, data retrieval, data preparation, exploration, modeling, and presentation.
3. Different types of data are discussed including structured, unstructured, machine-generated, graph-based, and audio/video data.
4. Machine learning techniques can be supervised, unsupervised, or semi-supervised depending on whether labeled data is used.
This document discusses text and web mining. It defines text mining as analyzing huge amounts of text data to extract information. It discusses measures for text retrieval like precision and recall. It also covers text retrieval and indexing methods like inverted indices and signature files. Query processing techniques and ways to reduce dimensionality like latent semantic indexing are explained. The document also discusses challenges in mining the world wide web due to its size and dynamic nature. It defines web usage mining as collecting web access information to analyze paths to accessed web pages.
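As an illustration of the inverted-index idea mentioned above, the following sketch maps each term to the set of documents that contain it and answers a query by intersecting those sets. The three tiny documents, the whitespace tokenization, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical document collection: id -> text.
docs = {
    1: "parallel database query",
    2: "text mining and web mining",
    3: "web database query processing",
}


def build_inverted_index(documents):
    """Map every term to the set of document ids in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(docs)
print(sorted(search_all(index, "database query")))  # [1, 3]
print(sorted(search_all(index, "web mining")))      # [2]
```

The lookup touches only the posting sets of the query terms, not the full text of every document, which is the main reason the structure is used for retrieval.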
This document provides an overview of databases and SQL. It defines a database as an organized collection of logically related data. It discusses different types of data and how data is transformed into information. The document also outlines the major components of SQL, including DDL, DML, DCL, and TCL statements. DDL is used to define the database structure, DML manages data, DCL controls privileges, and TCL manages transactions. Common SQL commands like SELECT, INSERT, UPDATE, DELETE are also highlighted.
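The statement categories can be seen side by side in a short sqlite3 session. This is only a sketch: the student table is hypothetical, and DCL statements such as GRANT and REVOKE are left out because SQLite does not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: add data, then make it permanent with a TCL commit.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO student (id, name) VALUES (2, 'Bob')")
conn.commit()

# More DML, this time undone with a TCL rollback.
conn.execute("UPDATE student SET name = 'Robert' WHERE id = 2")
conn.execute("DELETE FROM student WHERE id = 1")
conn.rollback()

rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
print(rows)  # [(1, 'Ann'), (2, 'Bob')]: the rolled-back changes are gone
```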
This document discusses distributed query processing. It begins by defining what a query and a query processor are. It then outlines the main problems in query processing, the characteristics of query processors, and the layers of query processing. The key layers are query decomposition, data localization, global query optimization, and distributed execution. Query decomposition takes a calculus query expressed on global relations and rewrites it as an algebraic query on the same global relations.
Data mining primitives include task-relevant data, the kind of knowledge to be mined, background knowledge such as concept hierarchies, interestingness measures, and methods for presenting discovered patterns. A data mining query specifies these primitives to guide the knowledge discovery process. Background knowledge like concept hierarchies allow mining patterns at different levels of abstraction. Interestingness measures estimate pattern simplicity, certainty, utility, and novelty to filter uninteresting results. Discovered patterns can be presented through various visualizations including rules, tables, charts, and decision trees.
Adbms 11 object structure and type constructor, by Vaibhav Khanna
Unique Identity:
An OO database system provides a unique identity to each independent object stored in the database.
This unique identity is typically implemented via a unique, system-generated object identifier, or OID.
The main property required of an OID is that it be immutable.
Specifically, the OID value of a particular object should not change.
This preserves the identity of the real-world object being represented.
Type Constructors:
In OO databases, the state (current value) of a complex object may be constructed from other objects (or other values) by using certain type constructors.
The three most basic constructors are atom, tuple, and set.
Other commonly used constructors include list, bag, and array.
This document provides an overview of database management systems and related concepts. It discusses data hierarchy, traditional file processing, the database approach to data management, features and capabilities of database management systems, database schemas, components of database management systems, common data models including hierarchical, network, and relational models, and the process of data normalization.
This presentation covers database keys and their types, as well as database relationships and their types, with references. It explains what keys and database relationships are.
This document provides an overview of data warehousing. It defines a data warehouse as a central database that includes information from several different sources and keeps both current and historical data to support management decision making. The document describes key characteristics of a data warehouse including being subject-oriented, integrated, time-variant, and non-volatile. It also discusses common data warehouse architectures and applications.
The document discusses different database concepts:
1) A database is a collection of organized data that can be easily retrieved, inserted, and deleted. Database management systems (DBMS) like MySQL and Oracle are software used to manage databases.
2) The two main data models are the relational model, which organizes data into tables and relations, and the object-oriented model, which represents data as objects with properties and methods.
3) DBMS provide advantages like data sharing, backup/recovery, security, and independence between data and applications. However, they also have disadvantages such as higher costs and complexity.
Clustering: Large Databases in data mining, by ZHAO Sam
The document discusses different approaches for clustering large databases, including divide-and-conquer, incremental, and parallel clustering. It describes three major scalable clustering algorithms: BIRCH, which incrementally clusters incoming records and organizes clusters in a tree structure; CURE, which uses a divide-and-conquer approach to partition data and cluster subsets independently; and DBSCAN, a density-based algorithm that groups together densely populated areas of points.
A database is a collection of data that can be used alone or combined to answer users' questions. A database management system (DBMS) provides programs to manage databases, control data access, and include a query language. When designing a database, it is important to structure the data so that specific records can be easily accessed, the database can respond to different questions, minimal storage is used, and redundant data is avoided. Key concepts in database design include entities, attributes, records, primary keys, foreign keys, and relationships between tables.
The document discusses different database models including hierarchical, network, relational, entity-relationship, object-oriented, object-relational, and semi-structured models. It provides details on the characteristics, structures, advantages and disadvantages of each model. It also includes examples and diagrams to illustrate concepts like hierarchical structure, network structure, relational schema, entity relationship diagrams, object oriented diagrams, and XML schema. The document appears to be teaching materials for a database management course that provides an overview of various database models.
This document discusses evaluation in information retrieval. It describes standard test collections, which consist of a document collection, queries on the collection, and relevance judgments. It also discusses evaluation measures used in information retrieval, such as precision, recall, F-measure, and mean average precision, as well as the kappa statistic, which measures the reliability of relevance judgments. R-precision and normalized discounted cumulative gain are also summarized as important single-number evaluation measures.
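Several of these measures are short formulas, sketched below for a single query. The ranking and the relevance judgments are invented for illustration, and the F-measure shown is the balanced F1 variant.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears,
    divided by the total number of relevant documents."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranked result list and relevance judgments for one query.
ranking = ["d3", "d7", "d1", "d9"]
relevant = {"d3", "d1", "d5", "d8"}

print(precision_recall_f1(ranking, relevant))         # (0.5, 0.5, 0.5)
print(round(average_precision(ranking, relevant), 3)) # 0.417
```

Mean average precision is the mean of average_precision over all queries in the test collection.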
The document discusses database normalization and functional dependencies. It defines normalization as imposing rules on database tables to eliminate anomalies during data manipulation. A functional dependency is defined as a relationship in which one set of attributes determines another. The properties of functional dependencies (reflexivity, augmentation, transitivity, union, and decomposition) are explained with examples. Normalization and an understanding of functional dependencies help in designing high-quality databases without redundancies or anomalies.
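One practical use of these properties is computing the closure of an attribute set, that is, everything it functionally determines. The sketch below assumes dependencies are given as (left-hand side, right-hand side) pairs of sets; the enrollment-style attribute names are hypothetical.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs of sets, each read as lhs -> rhs.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we may add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies for an enrollment table.
fds = [
    ({"student_id"}, {"name"}),
    ({"course_id"}, {"title"}),
    ({"student_id", "course_id"}, {"grade"}),
]

print(sorted(closure({"student_id"}, fds)))
# ['name', 'student_id']
print(sorted(closure({"student_id", "course_id"}, fds)))
# ['course_id', 'grade', 'name', 'student_id', 'title']
```

Because the closure of {student_id, course_id} contains every attribute, that pair is a candidate key here, while name depends on only part of it. That kind of partial dependency is what normalization removes.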
This document discusses software services and cloud computing architectures. It begins by providing context on the growing service economy and how businesses are increasingly offering services rather than products. It then defines software-as-a-service (SaaS) and describes how SaaS delivers software applications over the internet, with updates and management occurring remotely. Finally, the document discusses service-oriented architectures and how they support the development and delivery of software services.
The document discusses architecture-centric software development processes. It describes traditional waterfall and iterative development models, and notes that iterative models allow for more flexibility to changing requirements. Agile development methods like eXtreme Programming (XP) are discussed, which emphasize iterative development, collaboration, and rapid delivery of working software. Key practices of XP are outlined, including user stories, testing, pair programming, refactoring, and continuous integration. The role of architecture in agile processes is also addressed.
This document discusses software product lines and product line architectures. It defines a software product line as a set of software systems that share a common set of features addressing a particular market segment. Product lines are developed from a common set of core assets in a prescribed way to reduce costs and increase reuse. A product line architecture is a common framework that standardizes components and maximizes reuse potential. It specifies common functionality and identifies variation points across related products. Variability management is important for providing flexibility without compromising commonality.
6 - Architetture Software - Model transformation, by Majong DevJfu
This document discusses model transformations in Model-Driven Architecture (MDA). It defines computation independent models (CIMs), platform independent models (PIMs), and platform specific models (PSMs). It explains that model transformations are used to map between these different abstraction levels and ensure consistency. It also discusses model mappings, approaches to transformations, and tools like EMF and ATL that support transformations in Eclipse.
5 - Architetture Software - Metamodelling and the Model Driven Architecture, by Majong DevJfu
The document discusses metamodeling and the Model Driven Architecture (MDA). It covers topics such as model driven engineering, metamodeling, metamodeling in UML, and the OMG technologies that support MDA. Metamodeling involves modeling modeling elements and their relationships. Metamodels define the structure of models, while models are instances that conform to metamodels. The MDA uses metamodels and models to develop and transform systems.
This document provides an overview of software architectures by presenting examples of architectures from various software systems. It begins with an introduction to software architecture and what it entails. It then shows numerous diagrams and visualizations of architectures for different types of systems, such as editors, compilers, operating systems, middleware, and web applications. These examples are intended to demonstrate common architectural patterns and styles. The document discusses analyzing and comparing the architectures visually and recognizing patterns within them.
The document discusses architectural styles and decomposition techniques for software systems. It describes layering and tiering as basic decomposition approaches, with layers representing different levels of abstraction and tiers representing peer modules within the same layer. Several common architectural styles are then introduced, including pipes and filters, repository, client/server, model-view-controller, service-oriented, and peer-to-peer. Closed and open layered architectures are contrasted, and examples of layered systems like virtual machines and the OSI model are provided. Finally, the document notes that complete decompositions often involve both layering and partitioning techniques.
The document discusses key concepts in software architecture, including:
1) Software architecture establishes the overall structure and organization of a system, including its components and relationships.
2) Architectural design involves decomposing a system into subsystems or modules to improve modifiability, reusability, and portability.
3) Key principles for architectural design include simplicity, modularity, low coupling, separation of concerns, abstraction, and postponing decisions.
1 - Architetture Software - Software as a product, by Majong DevJfu
This document discusses software as a product and industry. It covers how software is a key component in modern technologies and industries. The software industry has grown significantly in recent decades. The document discusses different types of software such as embedded software, middleware, and software as a service. It also covers topics like software architecture, engineering, components, ecosystems, and the challenges in developing software. Overall, the document provides an overview of software as an industrial product and the software development industry.
10 - Architetture Software - More architectural styles, by Majong DevJfu
The Microkernel pattern partitions an operating system into isolated, minimal components that communicate through a small, fixed message-passing interface, allowing components to be developed and upgraded independently while maintaining overall system stability and security.
The document discusses architectural UML and provides information on:
1) The elements of a software architecture including views, models, and diagrams.
2) How UML can be used to represent different architectural views including design, process, development, and physical views.
3) An example of using UML models and diagrams to represent different views of a chess game architecture.
UML allows for extending diagrams and modeling elements through three main techniques:
1. Stereotypes allow applying tags to existing modeling elements like classes, associations, etc. to add domain-specific meaning.
2. Profiles extend UML with new modeling elements tailored for specific domains or platforms.
3. Extension mechanisms allow precisely defining new constructs that integrate with the UML metamodel. Together these techniques make UML extensible for multiple domains.
The document discusses metamodeling and the Model Driven Architecture (MDA). It provides an overview of model driven engineering and metamodeling. Specifically, it discusses how metamodels define the structure of models through concepts like classes and relationships. The Model Driven Architecture uses metamodels and modeling to develop software systems from models.
Here are the key differences between objects and classes in UML:
- Classes define the general characteristics (attributes and operations) that objects of that class will have. Objects are specific instances of a class.
- Classes are static definitions, while objects are dynamic instances of classes that exist at run time.
- In class diagrams, classes are represented as boxes containing the attributes and operations. Objects are represented as boxes with the underlined object name followed by a colon and the class name (e.g. john:Person).
- Classes define the common properties for a set of objects, while each object is a unique instance of a class with its own identity and particular values for attributes.
- Classes are abstractions,
The document discusses architectural styles and decomposition techniques. It describes layers as hierarchical sets of subsystems that provide related services by utilizing underlying layers. Tiers partition a system into peer subsystems responsible for classes of services. Common architectural styles include pipes and filters, repository, client/server, model-view-controller, service-oriented, and peer-to-peer. Layers and tiers are often combined for complete decomposition, with subsystems divided into tiers and each tier organized into layers. The pipe and filter style focuses on dynamic interaction by processing data streams through filters connected by pipes.
- Reference architectures that provide templates for common system types
- Design patterns that capture successful solutions to recurring problems
- Architectural patterns that describe best practices for system organization
- Legacy applications that can be analyzed for reusable architectural elements
This document discusses software as a product and the software industry. It covers why software is important, emerging technologies according to Gartner's hype cycle from 2005-2010, software as an industrial product, the size of the worldwide software industry, and different types of software, including embedded software and software as a service. It also discusses software components, software architecture and engineering issues, why producing software is difficult (complexity and low productivity in the industry), the software development process, process models, lifecycle differences around the world, development activities, and software standards.
This short document contains 5 headings but provides no other details or context. It lists the headings "My Heading Here First Second Third Fourth Fifth" but does not elaborate on or explain these headings.
This document provides an overview of various types of architectural standards, including conceptual standards like IEEE 1471 and DoDAF that define viewpoints and views, notational standards like UML and SysML, and process standards like TOGAF and RUP. It discusses the benefits of standards in promoting interoperability and network effects, while also noting drawbacks like limiting flexibility. The document advises deciding when to adopt a standard based on whether the project is in an early or late phase.
Databases (Basi di Dati) - C1 - The relational data model
1. Databases
The relational data model
2. Databases - Where are we?
A) Introduction
B) Conceptual design (ER)
C) Relational model, relational algebra, SQL
D) Logical design and normalization
E) DBMS technology
F) Database programming
3. Chronology of data representation models
Hierarchical model (1960s)
Network model (1970s)
Relational model (1980s)
Object model (1990s)
6. Chronology of the relational model
Invented by Codd in 1970 (IBM Research, Santa Teresa, Calif.)
Early projects: SYSTEM R (IBM), Ingres (Berkeley Univ.)
First version of the SQL language (then called SEQUEL): 1974
First commercial systems: early 1980s (Oracle, IBM SQL/DS and DB2, Ingres, Informix, Sybase)
Commercial success: from 1985 onward.
7. System R: trivia
Irv Traiger: Leonard had ordered all of us to pick a name for this project. We just sort of shrugged off, “It’s not important.” He said, “It’s important in terms of recognition to have a name.” We would make attempts at coming up with a name over weeks. One was Rufus, which was Franco’s dog.
Franco Putzolu: Rufus would have been a better name. It stands for Relational User Friendly Universal System.
Mike Blasgen: It would have been a better name.
8. Informal definition
A table has a schema, given by its column headings, and an instance, given by its rows.
Table: studente
Schema (columns): MATR, NOME, CITTA', C-DIP
Instance (rows):
MATR  NOME      CITTA'   C-DIP
123   Carlo     Bologna  Inf
307   Giovanni  Milano   Log
415   Paola     Torino   Inf
702   Antonio   Roma     Log
9. Formal definition
Domain D: any set of values.
Cartesian product of n domains (not necessarily distinct), D1 x D2 x ... x Dn: the set of all n-tuples < d1, d2, ..., dn > with di ∈ Di, 1 ≤ i ≤ n.
Relation R on D1, D2, ..., Dn: any subset of D1 x D2 x ... x Dn.
10. Example
D1 = (a,b)
D2 = (1,2,3)
D1 x D2 = ( <a,1>, <b,1>, <a,2>, <b,2>, <a,3>, <b,3> )
R1 = ( <a,1>, <b,3> )
R2 = ( <a,2>, <b,1>, <b,3> )
R3 = ( )
R4 = ( <a,1>, <b,1>, <a,2>, <b,2>, <a,3>, <b,3> )
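The formal definition can be checked mechanically on the example above. The following is a minimal Python sketch, not part of the original slides: it builds D1 x D2 with `itertools.product` and tests that each of R1..R4 is a subset of it. The helper name `is_relation` is hypothetical.

```python
from itertools import product

# Domains from the example slide.
D1 = {"a", "b"}
D2 = {1, 2, 3}

# Cartesian product D1 x D2: every pair <d1, d2> with d1 in D1 and d2 in D2.
cartesian = set(product(D1, D2))

# A relation on D1, D2 is any subset of the Cartesian product.
R1 = {("a", 1), ("b", 3)}
R2 = {("a", 2), ("b", 1), ("b", 3)}
R3 = set()            # the empty relation
R4 = set(cartesian)   # the full product is itself a relation


def is_relation(r, domains_product):
    """Return True if r is a subset of the given Cartesian product."""
    return r <= domains_product


for name, r in [("R1", R1), ("R2", R2), ("R3", R3), ("R4", R4)]:
    print(name, "is a relation:", is_relation(r, cartesian), "- tuples:", len(r))
```

A set such as `{("c", 1)}` would fail the check, because `c` does not belong to D1.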
11. Properties
Degree of the relation: number of domains (n)
Cardinality of the relation: number of tuples
Attribute: name given to a domain in a relation [attribute names within a relation must all be distinct]
12. Properties
Schema of a relation: table(attr1, ..., attrN) [attribute names within a schema must all be distinct]
Instance of the relation: a set of tuples over (attr1, ..., attrN)
R1(A,B) R2(C,D)
A B C D
a 1 c 1
b 3 b 3
a 2
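Schema, instance, degree and cardinality can be tied together in a few lines. The sketch below is illustrative only: the `Relation` class and its field names are assumptions, not something defined in the slides. It rejects schemas with repeated attribute names and uses the relation R1 = ( <a,1>, <b,3> ) from the earlier example as the instance.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    """Hypothetical container: a schema plus one instance of it."""
    name: str
    attributes: tuple                          # the schema: attribute names
    tuples: set = field(default_factory=set)   # the instance: a set of tuples

    def __post_init__(self):
        # Attribute names within a schema must all be distinct.
        if len(set(self.attributes)) != len(self.attributes):
            raise ValueError("duplicate attribute name in schema")
        # Every tuple must supply exactly one value per attribute.
        for t in self.tuples:
            if len(t) != len(self.attributes):
                raise ValueError(f"tuple {t} does not match the schema")

    @property
    def degree(self):
        """Number of attributes (domains) of the relation."""
        return len(self.attributes)

    @property
    def cardinality(self):
        """Number of tuples in the current instance."""
        return len(self.tuples)


r1 = Relation("R1", ("A", "B"), {("a", 1), ("b", 3)})
print(r1.degree, r1.cardinality)  # 2 2
```

The degree depends only on the schema, while the cardinality changes whenever the instance changes.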
13. Comparison of terminology
FORMAL DEFINITION    INFORMAL DEFINITION
relation             table
attribute            column
tuple, n-tuple       row
domain               data type
cardinality          number of rows
degree               number of columns
A significant difference:
FORMAL DEFINITION: absence of duplicates
INFORMAL DEFINITION: possible duplicates
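The difference about duplicates can be observed directly. This is a small sketch under stated assumptions: a Python `set` stands in for the formal relation, and an SQLite table without any declared key (table name `r1` and column names `a`, `b` are made up for the illustration) stands in for the informal table.

```python
import sqlite3

# Formal view: a relation is a set, so adding the same tuple twice changes nothing.
relation = set()
relation.add(("a", 1))
relation.add(("a", 1))

# Informal view: a table with no declared key accepts duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r1 (a TEXT, b INTEGER)")
conn.executemany("INSERT INTO r1 VALUES (?, ?)", [("a", 1), ("a", 1)])

row_count = conn.execute("SELECT COUNT(*) FROM r1").fetchone()[0]
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT a, b FROM r1)"
).fetchone()[0]

print(len(relation), row_count, distinct_count)  # 1 2 1
```

`SELECT DISTINCT` recovers the set-like behaviour of the formal definition on demand.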
14. Database
Database schema: a set of relation schemas [all relation names in the database must be different]
Database instance: a set of relation instances
R1(A,B) R2(C,D)
A B C D
a 1 c 1
b 3 b 3
a 2
18. Example: managing university exams
(MATR = student ID number, NOME = name, CITTA' = city, COD-CORSO = course code, DATA = date, VOTO = grade, TITOLO = title, DOCENTE = lecturer)
studente
MATR  NOME     CITTA'   C-DIP
123   Carlo    Bologna  Inf
415   Paola    Torino   Inf
702   Antonio  Roma     Log
esame
MATR  COD-CORSO  DATA    VOTO
123   1          7-9-04  30
123   2          8-1-05  28
702   2          7-9-04  20
corso
COD-CORSO  TITOLO       DOCENTE
1          matematica   Barozzi
2          informatica  Meo
19. Queries
Which professors have examined Carlo?
(Tables studente, esame and corso as in slide 18.)
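One way to answer this query is to follow the shared values from table to table: MATR links studente to esame, and COD-CORSO links esame to corso. The sketch below expresses that with Python's built-in `sqlite3` module. It is an illustration rather than the lecture's own solution; the lower-case column names with underscores (for example `cod_corso` for COD-CORSO) are a renaming assumed here so that the names are plain SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studente (matr INTEGER, nome TEXT, citta TEXT, c_dip TEXT);
CREATE TABLE esame (matr INTEGER, cod_corso INTEGER, data TEXT, voto INTEGER);
CREATE TABLE corso (cod_corso INTEGER, titolo TEXT, docente TEXT);
INSERT INTO studente VALUES (123, 'Carlo', 'Bologna', 'Inf'),
                            (415, 'Paola', 'Torino', 'Inf'),
                            (702, 'Antonio', 'Roma', 'Log');
INSERT INTO esame VALUES (123, 1, '7-9-04', 30),
                         (123, 2, '8-1-05', 28),
                         (702, 2, '7-9-04', 20);
INSERT INTO corso VALUES (1, 'matematica', 'Barozzi'),
                         (2, 'informatica', 'Meo');
""")

# Follow matr from studente to esame, then cod_corso from esame to corso.
rows = conn.execute("""
    SELECT DISTINCT c.docente
    FROM studente AS s
    JOIN esame AS e ON e.matr = s.matr
    JOIN corso AS c ON c.cod_corso = e.cod_corso
    WHERE s.nome = 'Carlo'
    ORDER BY c.docente
""").fetchall()

professors = [r[0] for r in rows]
print(professors)  # ['Barozzi', 'Meo']
```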
20. Queries
Which students scored 30 in mathematics (matematica)?
(Tables studente, esame and corso as in slide 18.)
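The same kind of navigation can be written without a DBMS, which makes the individual steps visible. The following plain-Python sketch (variable names are chosen for the illustration) selects the course titled matematica, matches exams on the course code and the grade 30, then matches students on the student ID number.

```python
# Rows copied from the exam-management example.
studente = [
    (123, "Carlo", "Bologna", "Inf"),
    (415, "Paola", "Torino", "Inf"),
    (702, "Antonio", "Roma", "Log"),
]
esame = [
    (123, 1, "7-9-04", 30),
    (123, 2, "8-1-05", 28),
    (702, 2, "7-9-04", 20),
]
corso = [
    (1, "matematica", "Barozzi"),
    (2, "informatica", "Meo"),
]

# Selection on corso, join with esame on the course code plus selection
# on the grade, then join with studente on the student ID number.
names = sorted({
    s_nome
    for (c_cod, titolo, _docente) in corso if titolo == "matematica"
    for (e_matr, e_cod, _data, voto) in esame if e_cod == c_cod and voto == 30
    for (s_matr, s_nome, _citta, _c_dip) in studente if s_matr == e_matr
})
print(names)  # ['Carlo']
```

Collecting the names in a set mirrors the formal definition, where a result relation contains no duplicates.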
21. Example: personnel management
impiegato
MATR  NOME      DATA-ASS  SALARIO  MATR-MGR
1     Piero     1-1-02    1500 €   2
2     Giorgio   1-1-04    2000 €   null
3     Giovanni  1-7-03    1000 €   2
assegnamento
MATR  NUM-PROG  PERC
1     3         50
1     4         50
2     3         100
3     4         100
progetto
NUM-PROG  TITOLO  TIPO
3         Idea    Esprit
4         Wide    Esprit
22. Queries
Who is Piero's manager?
(Tables impiegato, assegnamento and progetto as in slide 21.)
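This query relates the impiegato table to itself: the MATR-MGR value of Piero's row must be matched against the MATR of another row. A possible sketch with `sqlite3` follows; the underscore column names (`data_ass`, `matr_mgr`) and the storage of salaries as plain integers are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE impiegato (matr INTEGER, nome TEXT, data_ass TEXT,
                        salario INTEGER, matr_mgr INTEGER);
INSERT INTO impiegato VALUES (1, 'Piero', '1-1-02', 1500, 2),
                             (2, 'Giorgio', '1-1-04', 2000, NULL),
                             (3, 'Giovanni', '1-7-03', 1000, 2);
""")

# Self-join: alias e is the employee, alias m is the row of the manager.
manager = conn.execute("""
    SELECT m.nome
    FROM impiegato AS e
    JOIN impiegato AS m ON m.matr = e.matr_mgr
    WHERE e.nome = 'Piero'
""").fetchone()[0]

print(manager)  # Giorgio
```

Giorgio's own MATR-MGR is null, so the same query for Giorgio would return no row.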
23. Queries
Which types of projects does Giovanni work on?
(Tables impiegato, assegnamento and progetto as in slide 21.)
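Here the path goes through the assegnamento table, which links employees to projects. The sketch below is one possible formulation with `sqlite3`; to keep it short, impiegato is reduced to the two columns the query needs, and the underscore column names are an assumed renaming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE impiegato (matr INTEGER, nome TEXT);
CREATE TABLE assegnamento (matr INTEGER, num_prog INTEGER, perc INTEGER);
CREATE TABLE progetto (num_prog INTEGER, titolo TEXT, tipo TEXT);
INSERT INTO impiegato VALUES (1, 'Piero'), (2, 'Giorgio'), (3, 'Giovanni');
INSERT INTO assegnamento VALUES (1, 3, 50), (1, 4, 50), (2, 3, 100), (3, 4, 100);
INSERT INTO progetto VALUES (3, 'Idea', 'Esprit'), (4, 'Wide', 'Esprit');
""")

# impiegato -> assegnamento on matr, assegnamento -> progetto on num_prog.
types = [row[0] for row in conn.execute("""
    SELECT DISTINCT p.tipo
    FROM impiegato AS i
    JOIN assegnamento AS a ON a.matr = i.matr
    JOIN progetto AS p ON p.num_prog = a.num_prog
    WHERE i.nome = 'Giovanni'
""")]

print(types)  # ['Esprit']
```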
24. Example: order management
cliente(COD-CLI, INDIRIZZO, P-IVA)
ordine(COD-ORD, COD-CLI, DATA, IMPORTO)
dettaglio(COD-ORD, COD-PROD, QTA)
prodotto(COD-PROD, NOME, PREZZO)
25. Queries
Which orders has Paolo placed?
How many orders has Paolo placed?
How many candles were ordered on 5/7/00?
Compute, for each customer, the sum of the amounts of all their orders.
Extract the order with the highest amount.
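The last two queries involve only the ordine table and can be sketched with aggregate functions. The slides give no data for this example, so the three rows below are hypothetical and invented purely for illustration; the underscore column names are likewise an assumed renaming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ordine (cod_ord INTEGER, cod_cli INTEGER, data TEXT, importo INTEGER);
-- Hypothetical rows, invented for illustration only.
INSERT INTO ordine VALUES (1, 10, '5/7/00', 100),
                          (2, 10, '6/7/00', 250),
                          (3, 20, '5/7/00', 400);
""")

# Sum of the order amounts for each customer.
totals = conn.execute("""
    SELECT cod_cli, SUM(importo)
    FROM ordine
    GROUP BY cod_cli
    ORDER BY cod_cli
""").fetchall()

# Order (or orders, in case of ties) with the highest amount.
top_orders = conn.execute("""
    SELECT cod_ord, importo
    FROM ordine
    WHERE importo = (SELECT MAX(importo) FROM ordine)
""").fetchall()

print(totals)      # [(10, 350), (20, 400)]
print(top_orders)  # [(3, 400)]
```

The subquery formulation returns every order that reaches the maximum, which avoids picking one arbitrarily when two orders tie.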
26. Reflections
Difference between schema and instance
Two very different activities:
schema design
instance management
Moving from data to information:
query language
27. How can the schema be enriched?
INTEGRITY CONSTRAINTS:
they rule out some instances because these do not correctly represent the application domain
KEYS
CONSTRAINTS ON NULL VALUES (later)
REFERENTIAL INTEGRITY (later)
GENERIC CONSTRAINTS (later)
28. Notion of KEY
A subset of the attributes of the schema that has the properties of uniqueness and minimality
uniqueness:
no two tuples have the same key value
minimality:
removing any attribute from the key loses the uniqueness property
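Both properties can be tested on a given instance. The sketch below is a hedged illustration: the function names `is_unique` and `is_key` are hypothetical, and the rows are those of the assegnamento table from the personnel example. Note that a key is a property of the schema, that is, of every admissible instance; a check on one instance can show that a set of attributes is not a key, but cannot prove that it is one.

```python
def is_unique(rows, attrs):
    """True if no two rows agree on all the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """True if attrs is unique and dropping any one attribute breaks uniqueness."""
    if not is_unique(rows, attrs):
        return False
    return all(
        not is_unique(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )


assegnamento = [
    {"MATR": 1, "NUM_PROG": 3, "PERC": 50},
    {"MATR": 1, "NUM_PROG": 4, "PERC": 50},
    {"MATR": 2, "NUM_PROG": 3, "PERC": 100},
    {"MATR": 3, "NUM_PROG": 4, "PERC": 100},
]

print(is_key(assegnamento, ["MATR", "NUM_PROG"]))          # True
print(is_key(assegnamento, ["MATR"]))                      # False: not unique
print(is_key(assegnamento, ["MATR", "NUM_PROG", "PERC"]))  # False: not minimal
```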
30. Keys in the example: personnel management
impiegato(MATR, NOME, DATA-ASS, SALARIO, MATR-MIL)
assegnamento(MATR, NUM-PROG, PERC)
progetto(NUM-PROG, NOME, PREZZO)
31. Keys in the example: order management
cliente(COD-CLI, INDIRIZZO, P-IVA)
ordine(COD-ORD, COD-CLI, DATA, IMPORTO)
dettaglio(COD-ORD, COD-PROD, QTA)
prodotto(COD-PROD, NOME, PREZZO)
32. With multiple keys:
One is designated the PRIMARY KEY
the remaining keys are SECONDARY
CLIENTE(COD-CLIENTE, INDIRIZZO, P-IVA)
Primary key: COD-CLIENTE
Secondary key: P-IVA
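In SQL, a primary key and a secondary key are commonly declared with `PRIMARY KEY` and `UNIQUE`. The sketch below shows this for the CLIENTE schema using `sqlite3`. It is an illustration, not the lecture's own code: the underscore column names, the helper `try_insert` and all inserted rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cliente (
        cod_cliente INTEGER PRIMARY KEY,  -- primary key
        indirizzo   TEXT,
        p_iva       TEXT UNIQUE           -- secondary key
    )
""")
# Hypothetical row, invented for illustration only.
conn.execute("INSERT INTO cliente VALUES (1, 'Via Roma 1', 'IT001')")


def try_insert(row):
    """Insert a row; return False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO cliente VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert((1, "Via Milano 2", "IT002")))  # False: COD-CLIENTE repeated
print(try_insert((2, "Via Milano 2", "IT001")))  # False: P-IVA repeated
print(try_insert((2, "Via Milano 2", "IT002")))  # True: both keys are new
```

Both declarations enforce uniqueness; the choice of which key is primary is a design decision.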