Inconsistencies of Connection for Heterogeneity and a New Relation Discovery Method
Published in: Technology


Transcript

  • 1. Inconsistencies of Connection for Heterogeneity and a New Relation Discovery Method that Solved Them
      Takafumi NAKANISHI, Kiyotaka UCHIMOTO, Yutaka KIDAWARA
      National Institute of Information and Communications Technology (NICT), Japan
  • 2. What's Big Data?
      • Speeding up? Processing a lot of data?
          – What are the differences between a VLDB (Very Large Database) and Big Data?
      • Fragmentary data exist
          – Until now, scientists have worked with such data for simulation.
      • Heterogeneous database integration (cross-database search)
          – Still under consideration?
  • 3. Purposes of this presentation
      • We should consider the paradigm shift in computer science.
          – From the closed assumption to the open assumption
          – What problems arise?
              • Businesspeople require not only an EDW (Enterprise Data Warehouse) but also other analysis methods.
              • Discovering relations between heterogeneous concepts, datasets, etc.
      • The three evils of the open assumption
  • 4. True Problem Definitions of Big Data
      [Diagram] Big data divides into two problem areas:
      • Speeding up, promotion of streamlining, and increasing data volume for processing
          – Distributed parallel processing, high-performance computing (HPC), network delay, etc.
          – Construction of the big data environment (hardware and middleware research)
      • Schemaless data and new data processing methods
          – Relation discovery in heterogeneity
          – Big data analytics (software research)
          – Closed assumption system → open assumption system
  • 5. [Figure] AI Community (researchers a1 to a9) and DB Community (researchers b1 to b10); someone adds a relationship between a3 and b4.
      Relationships among persons in the AI and DB communities. ai and bj are researchers. When someone adds a symmetric and transitive relationship between a3 and b4, it is true that a1 is related to b5, because a1 is related to a3, a3 is related to b4, and b4 is related to b5.
  • 6. [Figure] Office Community (a1 to a9) and Music Community (b1 to b10); someone adds a relationship between a3 and b4.
      Relationships among persons in workplace and music communities. ai are co-workers, and bj are musicians. When someone adds a symmetric and transitive relationship between a3 and b4, it is actually not true that a1 is related to b5. In the graph structure, a1 is related to b5. Realistically, however, a1 and b5 share no common ground unless other definitions or analysis are supplied.
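The two community examples can be sketched in code. The following is a minimal illustration, not the authors' method: it treats "related" as a symmetric, transitive relation and tracks it with a small union-find structure. The class name `DisjointSet` and the choice of edges (only those named in the slide text: a1–a3, b4–b5, and the added a3–b4) are assumptions made for the example.

```python
class DisjointSet:
    """Tracks a symmetric, transitive 'related' relation (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen members as their own representative.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps lookups short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

    def related(self, x, y):
        return self.find(x) == self.find(y)


links = DisjointSet()
links.union("a1", "a3")   # relation inside the first community
links.union("b4", "b5")   # relation inside the second community

before = links.related("a1", "b5")
links.union("a3", "b4")   # someone adds the cross-community link
after = links.related("a1", "b5")

print(before, after)
```

The structure gives the same answer whether the communities are AI/DB or Office/Music. It has no way to tell that the inferred a1–b5 relation is meaningful in one case and not in the other, which is the problem slides 5 and 6 describe.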
  • 7. Difference between the two examples
      • "AI Community" ∩ "DB Community" ≠ ∅ → closed assumption
          – Relations can be represented by previous methods such as OWL, RDF, etc.
      • "Office Community" ∩ "Music Community" = ∅ → open assumption
          – Relations cannot be represented by the previous methods.
  • 8. Proof of inconsistency of the order relation between two certain sets [1/2]
      • A = {a1, a2, …, an}, B = {b1, b2, …, bm}
      • A ∩ B = ∅.
      • Sets A and B may each define their order relation differently.
      • We prove that, when we are given a relationship f between a1 ∈ A and b1 ∈ B (b1 = f(a1)), we still cannot discover the relationship between sets A and B, or any other relationships.
  • 9. Proof of inconsistency of the order relation between two certain sets [2/2]
      • We show, following the steps of an induction, that bi = f(ai) does not hold in general.
          – When i = 1, b1 = f(a1) is true by the above condition.
          – Assume that bk = f(ak) is true when i = k.
          – When i = k + 1, bk+1 = f(ak+1) is not guaranteed to be true.
      • Set A has an order relation. Set B has another order relation.
          – Even if ak ≤ ak+1 is true, bk ≤ bk+1 may not be true, and vice versa. Furthermore, both ak ≤ ak+1 and bk ≤ bk+1 may fail to be true.
      • Although b1 = f(a1) is true, bi = f(ai) in general is not.
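A small concrete case may help here. The sketch below is an illustration under stated assumptions, not part of the slides: the rank tables `rank_a` and `rank_b` and the helper `order_violations` are invented for the example. Elements are paired by index (ai with bi), and the code reports every step k where ak ≤ ak+1 holds in A's order but bk ≤ bk+1 fails in B's order.

```python
# Hypothetical orders: each set ranks its own members independently.
rank_a = {"a1": 1, "a2": 2, "a3": 3}   # A's order: a1 <= a2 <= a3
rank_b = {"b1": 1, "b2": 3, "b3": 2}   # B's order: b1 <= b3 <= b2

pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]  # a_i paired with b_i


def order_violations(pairs, rank_a, rank_b):
    """Return the 1-based steps k where A's order does not carry over to B."""
    bad = []
    for k in range(len(pairs) - 1):
        (a_k, b_k), (a_next, b_next) = pairs[k], pairs[k + 1]
        a_ok = rank_a[a_k] <= rank_a[a_next]
        b_ok = rank_b[b_k] <= rank_b[b_next]
        if a_ok and not b_ok:
            bad.append(k + 1)
    return bad


violations = order_violations(pairs, rank_a, rank_b)
print(violations)
```

The link at i = 1 is consistent, yet the step from k = 2 to k = 3 already breaks: a2 ≤ a3 in A while b2 ≤ b3 fails in B. Knowing one cross-set link says nothing about the remaining ones.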
  • 10. Proof of inconsistency of the transitive relation between two certain sets [1/2]
      • A = {a1, a2, …, an}, B = {b1, b2, …, bm}
      • A ∩ B = ∅.
      • Set B has the order relation b1 ≤ b2 ≤ b3 ≤ b4 …
          – Transitive relation
          – If b1 ≤ b2 and b2 ≤ b3 are true, b1 ≤ b3 is true.
      • Set A has its own order relation.
  • 11. Proof of inconsistency of the transitive relation between two certain sets [2/2]
      • Assume a1 = (1, 5), b1 = (2, 1), b2 = (3, 2), b3 = (4, 3).
      • We examine whether a1 ≤ b3 is true when we are given the relation a1 ≤ b1.
      • To state the conclusion first: a1 ≤ b3 may not hold.
      • The relationship between a1 and b1 compares the first elements. Then a1 ≤ b1 is true.
      • The order relation of set B compares the second elements. Then b1 ≤ b2 ≤ b3, and since b1 ≤ b2 and b2 ≤ b3 are true, b1 ≤ b3 is true.
      • However, a1 ≤ b3 is not true in the order of set B.
      • Once a relation such as the one between a1 and b1 is added, an inconsistency occurs: the order and transitive relations of set B are no longer guaranteed.
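The counterexample on this slide can be checked directly. The sketch below uses the slide's own values; the function names `leq_link` and `leq_b` are assumptions introduced for the example.

```python
a1 = (1, 5)
b1, b2, b3 = (2, 1), (3, 2), (4, 3)


def leq_link(x, y):
    """The added cross-set relation: compares first elements."""
    return x[0] <= y[0]


def leq_b(x, y):
    """Set B's own order: compares second elements."""
    return x[1] <= y[1]


link_holds = leq_link(a1, b1)                    # 1 <= 2
chain_holds = leq_b(b1, b2) and leq_b(b2, b3)    # 1 <= 2 and 2 <= 3
naive_inference = link_holds and chain_holds     # "so a1 <= b3"
actual = leq_b(a1, b3)                           # 5 <= 3 in B's order

print(link_holds, chain_holds, naive_inference, actual)
# prints: True True True False
```

Chaining the link with B's transitive order suggests a1 ≤ b3, but evaluating a1 under B's own comparison gives the opposite answer. The two relations use different criteria, so transitivity across them is not sound.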
  • 12. Inconsistencies: the three evils of the open assumption
      • A relation that holds now is not guaranteed to hold in the future.
      • No transitive relation is guaranteed to hold when anyone connects links across heterogeneous fields.
      • Relations across heterogeneous fields cannot be discovered within set theory.
  • 13. Misconceptions about future information systems
      • A user does not want to retrieve data; a user needs solutions.
          – A system derives clues for a user from data by relative comparison.
          – It is important to compare data relatively.
      • We cannot write down relationships any more.
          – They change dynamically depending on the user, the situation, etc.
          – When the data change, the relationships change.
              • We cannot create indexes.
      • We cannot discover anything without writing relationships.
          – However, a system can compare data on a chosen basis.
  • 14. [Diagram] Incomplete Mutual Map Transformation Framework between set theory and the Cartesian system of coordinates
      Components shown: Functional Predicate, Set Theory, Coordinates System
      Properties and evaluations listed in the diagram:
      • commutative property, associative property, distributive property
      • reflexive relation, antisymmetric relation, transitive relation
      • axis adaptability evaluation, uniqueness evaluation, certainty evaluation, predicate satisfaction evaluation
      Mutual mapping is done by mathematical rules, formulas, etc. (because mathematical rules and formulas hold under the closed assumption).
  • 15. Overview of our method
      1. Sampling data
          • A query is given by a user.
          • The dataset is sampled depending on the query.
      2. Selection of basis
          • The system selects a basis for solving the query.
          • Order relationships? Continuous or equal-interval sampling?
      3. Mapping from set theory to the Cartesian system of coordinates
          • Mathematical rule/formula → closed assumption
          • The transformation operation is created manually under the closed assumption.
      4. Discovery of relationships on the Cartesian system of coordinates
          • Predefinition of functional predicates
          • Testing whether each functional predicate is satisfied
      5. Re-mapping to set theory
          • Representation of predicates as predicate functions
          • Representation of reasons in terms of the basis
  • 16. Example: Creating a functional predicate – dependOn
      • "dependOn" means that set A relies on set X.
          – The value of element ai of set A should change only with the variation of the value of element xj of set X.
      • "dependOn" is written {A}(X) when set A depends on set X.
  • 17. Example Dataset

      MONTHLY AVERAGE TEMPERATURE IN GUMMA PREFECTURE, JAPAN

      Year   Jan.  Feb.  Mar.  Apr.   May  Jun.  Jul.  Aug.  Sep.  Oct.  Nov.  Dec.  Ave.
      2007    4.9   6.1   8.2  12.3  18.7  22.7  23.5  28    24.1  17.1  11     6.5  15.3
      2008    3.6   2.9   9    13.6  18    21.1  26.3  25.8  22.9  17.6  10.7   6.9  14.9
      2009    4.3   5.5   7.6  14.1  19.4  22.2  25.4  25.8  22    16.8  11.4   6.7  15.1
      2010    4.3   4.8   7.2  11.2  18.1  23.5  27    29    24.2  17.7  11.2   7.2  15.5
      2011    2.4   4.9   6.1  12.6  17.8  22.9  27.1  26.6  23.9  17.1  12.3   4.8  14.9

      ANNUAL AVERAGE PRICE (Yen) OF CUCUMBERS (5 kg) AND CABBAGE (10 kg)

      Year   Cucumber  Cabbage
      2007       1168      604
      2008       1226      594
      2009       1102      662
      2010       1231      739
      2011       1179      573
  • 18. Result

      Cucumber
             Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov      Dec
      AAE    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
      UE     0.031    0.394    0.028    0.345    0.707    0.002    0.207    0.188    0.355    0.924    0.090    0.043
      CE     1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
      BV    -9.590  -27.269    8.075  -27.022  -67.471    2.254   16.039   16.006   32.882  132.937  -25.899   11.466

      Cabbage
             Jan      Feb      Mar      Apr      May      Jun      Jul      Aug      Sep      Oct      Nov      Dec
      AAE    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
      UE     0.243    0.024    0.007    0.199    0.052    0.255    0.045    0.330    0.003    0.114    0.048    0.436
      CE     1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000    1.000
      BV    34.617    8.705   -5.190  -26.357   23.588   37.635    9.576   27.231    3.696   59.930  -24.346   47.057

      • AAE: axis adaptability evaluation
      • UE: uniqueness evaluation
      • CE: certainty evaluation
      • BV: predicate satisfaction evaluation

      Discovered dependOn relations:
      • {Cucumber Price}(May temperature)
      • {Cucumber Price}(Oct temperature)
      • {Cabbage Price}(Dec temperature)
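The slides do not say how UE and BV are computed. As an observation only, for cucumber prices against the January and October temperatures from slide 17, the ordinary least-squares slope and the squared Pearson correlation reproduce the reported BV and UE figures to three decimals. The sketch below performs that check; the function name `slope_and_r2` is an assumption, and the match is not confirmation that this is the authors' definition.

```python
from statistics import mean

# Values copied from the slide-17 tables, years 2007 to 2011.
jan_temp = [4.9, 3.6, 4.3, 4.3, 2.4]
oct_temp = [17.1, 17.6, 16.8, 17.7, 17.1]
cucumber = [1168, 1226, 1102, 1231, 1179]


def slope_and_r2(x, y):
    """Least-squares slope of y on x and squared Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


jan_slope, jan_r2 = slope_and_r2(jan_temp, cucumber)
oct_slope, oct_r2 = slope_and_r2(oct_temp, cucumber)

print(f"Jan: slope={jan_slope:.3f} r2={jan_r2:.3f}")
print(f"Oct: slope={oct_slope:.3f} r2={oct_r2:.3f}")
# prints:
# Jan: slope=-9.590 r2=0.031
# Oct: slope=132.937 r2=0.924
```

With only five yearly samples per month, a high squared correlation such as October's is weak evidence on its own, which fits the slides' description of the evaluation as preliminary.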
  • 19. Conclusion
      • The three evils of the open assumption
          – We presented the inconsistencies in past research on interconnecting heterogeneous fields, such as Linked Data, including our own past research.
      • Map transformation framework from set theory to the Cartesian system of coordinates
          – Defines predicate functions such as disjoint, meet, overlap, coveredBy, covers, equal, contain, inside, correlate, moreThan, lessThan, alongWith, join, etc.
      • A preliminary evaluation of the predicate function "dependOn"
  • 20. Thank you