I- Extended Databases


Published in: Technology
  1. I-Extended Databases
     Key words: Knowledge Discovery in Databases (KDD), Data Mining (DM), Data Warehousing (DW), Query Optimization (QO).
  2. By Dr. Zakaria Suliman Zubi, Assistant Professor, Computer Science Department, Faculty of Science, Al-Tahadi University, P.O. Box 727, Sirt, Libya.
  3. I-Extended Databases
     • Abstract.
     • Introduction to Inductive Databases.
     • I-Extended Databases (IE) motivation.
     • I-Extended Databases (IE) and KDD processes.
     • Example.
     • Conclusions and Remarks.
     • Questions.
  4. Abstract (1)
     • How can we handle generalizations in a very large database using Association Rules (AR), inclusion dependencies, and Functional Dependencies (FD)?
     • The answer is an inductive database.
     • An I-Extended database has properties similar to those of an inductive database.
     • An I-Extended database contains intensionally defined generalizations about the data.
  5. Abstract (2)
     • It can be used in the process of Data Mining.
     • It was proposed in the ODBC_KDD(2) model.
     • Queries use normal database terminology.
     • The main aim of the I-Extended database is to interact with a special Data Mining query language called Knowledge Discovery Query Language (KDQL), described in [22].
     • KDQL was introduced and demonstrated as a query language in the ODBC_KDD(2) model in [22].
  6. Introduction to Inductive Databases
     • The KDD process contains several steps: understanding the domain, preparing the data set, discovering patterns (i.e., computing a theory), post-processing the discovered patterns, and putting the results into use.
     • For KDD, we need a query language that enables the user not only to select subsets of the data, but also to specify DM tasks and select patterns from the corresponding theories.
     • We consider the KDQL RULES operator, described in [21], as a possible query language for mining association rules in an i-extended database.
     • The result of a query should be an object of the same type as its arguments.
  7. I-Extended Databases Motivation
     The model was introduced at the Institute of Mathematics and Informatics at Debrecen University, Debrecen, Hungary, in 2002.
     [Figure label: Gateway]
  8. I-Extended Databases Motivation (continued)
     • An I-Extended database schema is a pair ℛ = (R, (P_R, e, V)), where:
       – R is a database schema.
       – P_R is a collection of patterns.
       – V is a set of result values.
       – e is the evaluation function that defines pattern semantics.
     • This function maps each pair (r, θ_i) to an element of V, where r is a database over R and θ_i ∈ P_R is a pattern.
     • An instance of the schema, an i-extended database (r, s) over the schema ℛ, consists of a database r over the schema R and a subset s ⊆ P_R.
  9. I-Extended Databases Motivation (continued)
     • Example: If the patterns are Boolean formulae about the database, then V is {true, false}, and the evaluation function e(r, θ) has the value true iff the formula θ is true about r. In practice, a user might be interested in selecting, from the intensionally defined collection of all Boolean formulae, the formulae that are true or the formulae that are false.
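
The definition and the Boolean example above can be sketched in Python. This is only an illustration under stated assumptions: the three formulae, their names, and the sample rows are hypothetical and do not come from the slides; patterns are identified by name, and the evaluation function e simply looks up and applies the named formula.

```python
from typing import Callable, Dict, List, Set

Row = Dict[str, int]
Database = List[Row]          # r: a database over the schema R
Pattern = str                 # assumption: a pattern is identified by its name

# Hypothetical pattern class P_R: Boolean formulae about the database.
FORMULAE: Dict[Pattern, Callable[[Database], bool]] = {
    "nonempty": lambda r: len(r) > 0,
    "all_rows_have_A": lambda r: all(row["A"] == 1 for row in r),
    "some_row_has_B": lambda r: any(row["B"] == 1 for row in r),
}


def e(r: Database, theta: Pattern) -> bool:
    """Evaluation function: maps (r, theta) to an element of V = {True, False}."""
    return FORMULAE[theta](r)


def select(r: Database, s: Set[Pattern], value: bool) -> Set[Pattern]:
    """Select the patterns in s whose evaluation on r equals the given value."""
    return {theta for theta in s if e(r, theta) is value}


# An instance (r, s): a database r plus a subset s of the pattern class.
r = [{"A": 1, "B": 0}, {"A": 1, "B": 1}, {"A": 0, "B": 0}]
s = set(FORMULAE)

true_patterns = select(r, s, True)
false_patterns = select(r, s, False)
```

Here `true_patterns` holds the formulae that are true about r and `false_patterns` the ones that are false, which mirrors the two selections mentioned in the example.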
  10. I-Extended Databases Motivation (continued)
     • I-Extended Database: a database that, in addition to data, also contains intensionally defined generalizations about the data. First we illustrate Association Rules, and then we generalize the approach and point out key issues for query evaluation in general.
     • An I-Extended database has properties similar to those of an inductive database; owing to the closure property of the framework, it can be used throughout the whole DM process.
  11. I-Extended Databases Motivation (continued)
     • The aims of the I-Extended Database are as follows:
       – An i-extended database consists of a normal database associated with a subset of patterns from a class of patterns, and an evaluation function that tells how the patterns occur in the data.
       – An i-extended database can be queried (in principle) just by using normal relational algebra or SQL, with the added property of being able to refer to the values of the evaluation function on the patterns.
       – Modeling KDD processes as a sequence of queries on an i-extended database creates opportunities for reasoning about and optimizing these processes.
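
As a rough illustration of querying patterns with ordinary SQL, the sketch below stores association rules in a relational table and filters them by the values of the evaluation function. Everything here is an assumption made for the example: the table name `rules`, its columns, the thresholds, and the numeric values are invented, and materializing the evaluation results as columns is just one possible design, not the ODBC_KDD(2) implementation. KDQL syntax is not shown because the slides do not give it.

```python
import sqlite3

# In-memory database; the "rules" table is a hypothetical materialization of
# patterns together with their evaluation-function values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rules (body TEXT, head TEXT, frequency REAL, confidence REAL)"
)
conn.executemany(
    "INSERT INTO rules VALUES (?, ?, ?, ?)",
    [
        ("A", "B", 0.50, 0.67),    # illustrative numbers only
        ("B", "A", 0.50, 1.00),
        ("A,B", "C", 0.25, 0.50),
    ],
)

# A plain SQL query that refers to the evaluation values of the patterns.
selected = conn.execute(
    "SELECT body, head FROM rules "
    "WHERE frequency >= ? AND confidence >= ? ORDER BY body",
    (0.4, 0.9),
).fetchall()
```

With these made-up values, only the rule B ⇒ A passes both thresholds. The point of the sketch is that no new query concepts are needed once the evaluation values are addressable as ordinary attributes.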
  12. I-Extended Databases (IE) and KDD processes
     • KDD consists of several steps; one of these steps is Data Mining.
     • In the Data Mining process we are concerned with a unique class of patterns for real-life mining processes, which take place in a dynamic knowledge-acquisition scenario.
     • These interesting patterns are presented in I-Extended Databases based on their captured frequency, confidence and support values.
     • The knowledge gathered often affects the search process, giving rise to new goals in addition to the original ones.
  13. I-Extended Databases (IE) and KDD processes (continued)
     • KDD processes can be described by sequences of operations, i.e., queries over the relevant i-extended database.
     • Sequences of queries are abstract and concise descriptions of DM processes.
     • These descriptions can even be annotated with statistical information about the size of the selected dataset, the size of the intermediate collection of patterns, etc.
     • This provides knowledge for further use of these sequences.
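
The idea of a KDD process as an annotated sequence of queries can be sketched as follows. This is a minimal illustration, assuming that an instance is a pair (r, s) and that every query maps an instance to an instance (the closure property); the two query functions, the rule strings, and the sample data are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Dict[str, int]
Instance = Tuple[List[Row], Set[str]]      # (r, s): data and selected patterns
Query = Callable[[Instance], Instance]     # closure: instance in, instance out


def select_rows_with_A(inst: Instance) -> Instance:
    """Data selection: keep only the rows that have a 1 in column A."""
    r, s = inst
    return [row for row in r if row["A"] == 1], s


def keep_patterns_about_B(inst: Instance) -> Instance:
    """Pattern selection: keep only the rules that mention attribute B."""
    r, s = inst
    return r, {p for p in s if "B" in p}


def run(process: List[Tuple[str, Query]], inst: Instance):
    """Apply the queries in order and record the sizes after each step."""
    trace = []
    for name, query in process:
        inst = query(inst)
        trace.append((name, len(inst[0]), len(inst[1])))
    return inst, trace


r = [{"A": 1, "B": 1}, {"A": 1, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 1}]
s = {"A=>B", "B=>A", "A=>C"}
process = [
    ("select_rows_with_A", select_rows_with_A),
    ("keep_patterns_about_B", keep_patterns_about_B),
]
final, trace = run(process, (r, s))
```

Each entry of `trace` records a query name with the number of rows and patterns that remain after it, which is the kind of statistical annotation the slide refers to.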
  14. Example: Patterns in three instances of I-Extended Database
     • Schema R = {A1, ..., An} of attributes with domain {0, 1}.
     • Given a relation r over R, an association rule about r is an expression of the form X ⇒ B, where X ⊆ R and B ∈ R \ X.
     • The intuitive meaning of the rule is that if a row of the matrix r has a 1 in each column of X, then the row tends to have a 1 also in column B.
     • This semantics is captured by frequency and confidence values. Given W ⊆ R, support(W, r) denotes the fraction of rows of r that have a 1 in each column of W.
     • The frequency of X ⇒ B in r is defined to be support(X ∪ {B}, r), while its confidence is support(X ∪ {B}, r) / support(X, r). Typically, we are interested in association rules for which the frequency and the confidence are greater than given thresholds.
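
The support, frequency and confidence definitions above translate directly into code. The following is a small sketch; the function names follow the slide, while the sample relation, the use of exact fractions, and the choice to return None when support(X, r) is zero are assumptions made for the example.

```python
from fractions import Fraction
from typing import Dict, Iterable, List, Optional

Row = Dict[str, int]


def support(W: Iterable[str], r: List[Row]) -> Fraction:
    """Fraction of rows of r that have a 1 in each column of W."""
    cols = set(W)
    if not r:
        return Fraction(0)
    hits = sum(1 for row in r if all(row[a] == 1 for a in cols))
    return Fraction(hits, len(r))


def frequency(X: Iterable[str], B: str, r: List[Row]) -> Fraction:
    """Frequency of X => B in r: support(X ∪ {B}, r)."""
    return support(set(X) | {B}, r)


def confidence(X: Iterable[str], B: str, r: List[Row]) -> Optional[Fraction]:
    """Confidence of X => B in r: support(X ∪ {B}, r) / support(X, r)."""
    denom = support(X, r)
    if denom == 0:
        return None  # assumption: confidence is undefined when no row matches X
    return support(set(X) | {B}, r) / denom


# Hypothetical relation over R = {A, B, C} with domain {0, 1}.
r = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
]

freq_ab = frequency({"A"}, "B", r)
conf_ab = confidence({"A"}, "B", r)
```

For this sample relation, three of the four rows have A = 1 and two of those also have B = 1, so the rule A ⇒ B has frequency 1/2 and confidence 2/3.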
  15. Conclusions and Remarks
     • I-Extended Databases enable the definition of a mining process as a sequence of queries by using a closure property.
     • I-Extended Databases are a necessary step towards general-purpose query languages for KDD applications.
     • I-Extended Databases support pattern generation, pattern filtering and pattern combining operations.
     • I-Extended Databases can use standard database terminology to express significant patterns without introducing any additional concepts.
  16. Important References
     • [20] T. Imielinski and H. Mannila. A database perspective on knowledge discovery. Communications of the ACM, 39:58-64, 1996.
     • [21] Zakaria S. Zubi. Knowledge Discovery in Remote Access Database, Ch. 9. PhD dissertation, Debrecen University, Hungary, 2002.
     • [22] Zakaria S. Zubi and Gábor Fazekas. On ODBC_KDD models. 5th International Conference on Applied Informatics, 28 January - 3 February 2001, Eger, Hungary, 2001.
  17. Thank you!!!