OPAL Presentation

Published in: Technology, Education
  1. OPAL – Real Understanding of Real Estate Forms
     Xiaonan Guo, Oxford University Computing Laboratory, DIADEM group
  2. Diversity in Web Form Design
     May 27, 2011. OPAL - ontology-based web pattern analysis with logic
  5. Diversity in Web Form Design (continued)
  7. Outline
     OPAL – Ontology-based web Pattern Analysis with Logic
     • Overview
     • Data models and Mappings
       • Browser, Segmentation, Annotation, and Domain Model
       • Segmentation and Phenomenological Mapping
     • Analysis and Evaluation
     • Future Work
  8. Overview
     Diagram labels: Domain-independent hierarchical modeling + Domain knowledge; Declarative Definition; Domain-aware Form Analysis; OPAL
  9. Overview
  10. Browser Model
  13. Browser Model
      html_element(e_320_input,320,321,319,input,d1).
      html_attr(e_320_input_name,e_320_input,name,"location",d1).
      html_attr(e_320_input_type,e_320_input,type,"radio",d1).
      ...
      box(e_320_input,106,253,120,267,14,14).
      css_attr(e_320_input,bottom,"auto").
      css_attr(e_320_input,clear,"none").
      css_attr(e_320_input,color,"rgb(0,0,0)").
      ...
  14. Segmentation Model
      • Grouping of related form elements, e.g. fields and labels
      • Achieved via Segmentation Mapping, which
        • groups form fields
        • assigns labels to fields and groups
  15. Segmentation Model – groups
      Form elements are grouped if
      • they occur in sequence
      • they have similarities in attribute values or appearances
      • their least common ancestor contains no other elements
      group(Es) :- similarFieldSequence(Es), leastCommonAncestor(A,Es),
          not hasAdditionalField(A,Es).
      leastCommonAncestor(A,Es) :- commonAncestor(A,Es),
          not ( child(C,A), commonAncestor(C,Es) ).
      partOf(E,A) :-
          group(Es), member(E,Es), leastCommonAncestor(A,Es).
  16. Segmentation Model – groups
      group([e_320_input,e_326_input,
          e_332_input,e_338_input,e_344_input]).
      leastCommonAncestor(
          e_319_td,[e_320_input,e_326_input,...,e_344_input]).
      partOf(e_320_input,e_319_td).
      partOf(e_326_input,e_319_td).
      partOf(e_332_input,e_319_td).
      partOf(e_338_input,e_319_td).
      partOf(e_344_input,e_319_td).
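The grouping condition on the two slides above can be sketched in Python. This is only an illustration under stated assumptions, not OPAL's implementation: the parent map, the row element `e_318_tr`, and the placement of `e_358_select` are hypothetical, and the `similarFieldSequence` check is left out. It shows the least-common-ancestor test and the "no additional field" test only.

```python
# Hypothetical parent map (child -> parent). Element IDs follow the slides,
# but the tree shape, e_318_tr and the position of e_358_select are assumed.
PARENT = {
    "e_319_td": "e_318_tr",
    "e_320_input": "e_319_td",
    "e_326_input": "e_319_td",
    "e_332_input": "e_319_td",
    "e_338_input": "e_319_td",
    "e_344_input": "e_319_td",
    "e_358_select": "e_318_tr",
}
# Form fields are recognised here by their ID suffix (an assumption).
FIELDS = {e for e in PARENT if e.endswith(("_input", "_select"))}
RADIOS = ["e_320_input", "e_326_input", "e_332_input",
          "e_338_input", "e_344_input"]


def ancestors(node):
    """Return the proper ancestors of node, nearest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def least_common_ancestor(elements):
    """Nearest node that is an ancestor of every element, or None."""
    first, *rest = elements
    for candidate in ancestors(first):
        if all(candidate in ancestors(e) for e in rest):
            return candidate
    return None


def is_group(elements):
    """True if the least common ancestor contains no field outside elements."""
    lca = least_common_ancestor(elements)
    if lca is None:
        return False
    fields_under_lca = {f for f in FIELDS if lca in ancestors(f)}
    return fields_under_lca == set(elements)
```

With this toy tree, `least_common_ancestor(RADIOS)` yields `e_319_td` and `is_group(RADIOS)` holds, while a radio button paired with `e_358_select` is rejected because their common ancestor also contains the other radio buttons.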
  17. Segmentation Model – labels
      Texts are assigned to fields and groups using
      • Field: HTML <label>, Greatest unique ancestor
      • Segment: Text-field alignment in groups
      • Page: Visual alignment
  18. Segmentation Model – labels (Field: HTML <label>)
      hasBasicLabel(E,L,T) :-
          html_element(E,_,_,_,input,_), html_attr(_,E,id,ID,_),
          html_element(N,_,_,_,label,_), html_attr(_,N,for,ID,_),
          child(L,N), html_text(L,T,_).
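The `<label for>` rule on the slide above pairs an input whose `id` equals the `for` attribute of a label with that label's text. A minimal Python sketch of the same idea, using the standard library's `html.parser` instead of OPAL's browser facts, might look as follows. The class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LabelForCollector(HTMLParser):
    """Collect input ids and the text of <label for=...> elements."""

    def __init__(self):
        super().__init__()
        self.input_ids = []
        self.label_text = {}      # target id -> accumulated label text
        self._current_for = None  # 'for' value of the label being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and "id" in attrs:
            self.input_ids.append(attrs["id"])
        elif tag == "label" and "for" in attrs:
            self._current_for = attrs["for"]
            self.label_text[self._current_for] = ""

    def handle_data(self, data):
        # Only text inside an open <label for=...> counts as label text.
        if self._current_for is not None:
            self.label_text[self._current_for] += data

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None


def basic_labels(html):
    """Map each input id to the text of the label whose 'for' matches it."""
    parser = LabelForCollector()
    parser.feed(html)
    parser.close()
    return {i: parser.label_text[i].strip()
            for i in parser.input_ids if i in parser.label_text}


# Made-up markup: the first radio button has an explicit label, the second
# only has adjacent text, which this strategy does not pick up.
SAMPLE = ('<label for="buy">To Buy:</label><input type="radio" id="buy">'
          '<input type="radio" id="rent"> To Rent:')
```

Here `basic_labels(SAMPLE)` associates only the `buy` input with "To Buy:". The unlabeled `rent` input is the kind of case the other strategies (greatest unique ancestor, alignment within groups, visual alignment) are meant to cover.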
  19. Segmentation Model – labels (Field: Greatest unique ancestor)
      greatestUniqueAncestor(A,E) :- uniqueDescendant(E,A),
          not ( parent(P,A), uniqueDescendant(E,P) ).
      hasBasicLabel(E,L,T) :-
          group(E), greatestUniqueAncestor(A,E), descendant(L,A), html_text(L,T,_).
  20. Segmentation Model – labels (Segment: Text-field alignment in groups)
      hasLabel(E,L,T) :-
          partOf(E,G), leastCommonAncestor(G,Es), group(Es), hasNoLabel(Es), textLists(LLs,G),
          sameLength(Es,LLs),
          labelOneToOne(E,Ls,Es,LLs),
          member(L,Ls),
          html_text(L,T,_).
  21. Segmentation Model – labels (Page: Visual alignment)
      hasLabel(E,L,LText) :-
          hasTextBox(E,B),
          descendantTextOf(L,B),
          html_text(L,_,_,_,_),
          node_text(L,LText,true).
  23. Segmentation Model – labels
      hasLabel(e_320_input,t_322,"Nailsea / Backwell").
      hasLabel(e_358_select,t_354,"Min. beds").
  24. Segmentation Model – labels
      hasLabel(e_319_td,t_316,"Area: ").
  25. Segmentation Model – labels
      hasLabel(e_304_input,t_302,"To Buy:").
      hasLabel(e_308_input,t_306,"To Rent:").
      hasLabel(e_320_input,t_322,"Nailsea / Backwell").
      hasLabel(e_326_input,t_328,"Portishead / Pill").
      hasLabel(e_338_input,t_340,"Yatton / Congresbury").
      hasLabel(e_332_input,t_334,"Clevedon").
      hasLabel(e_344_input,t_346,"Bristol / Weston-super-mare").
      hasBasicLabel(e_358_select,t_354,"Min Beds").
      hasBasicLabel(e_515_select,t_400,"Min Price").
      hasBasicLabel(e_705_select,t_594,"Max Price").
      hasBasicLabel(e_788_select,t_784,"View Order").
      hasLabel(e_319_td,t_316,"Area").
      hasLabel(e_297_tbody,t_290,"Find a property to buy or rent…").
  26. Overview
  27. Annotation Model
      • Obtained from Browser model
      • Relying on domain-specific knowledge, represents
        • linguistic annotations
        • machine learning based classifications
  28. Annotation Model
      annotation(attrclob_d1_5560,attrclob_d1,80,87,"Nailsea").
      annotationFeature(attrclob_d1_5560,"majorType","location").
      annotationFeature(attrclob_d1_5560,"minorType","district_county_etc").
      annotation(elclob_d1_1707,elclob_d1,2028,2038,"Min. price").
      annotationFeature(elclob_d1_1707,"modifier","min").
      annotationFeature(elclob_d1_1707,"minorType","price").
      annotationFeature(elclob_d1_1707,"majorType","reform.label").
  29. Domain Model
      • Describes conceptual entities on forms as in domain ontology
      • Achieved via phenomenological mapping, which
        • correlates labels with annotations
        • classifies form elements
  30. Domain Model – Ontology
      purpose={buy, rent, combined}
  31. Domain Model – Ontology
      priceType={min, max, approximate, range}
  32. Domain Model – Classification
      Form elements are annotated as follows
      hasAnnotation(E,A) :-
          hasLabel(E,_,T),
          annotation(Aid,_,_,_,T),
          annotationFeature(Aid,_,A).
      Form elements are classified with Concept C and Facets C_F
      C(X) :- leafSegment(X), hasAnnotation(X,A), Clabel(A).
      C_F(X,F) :- C(X), hasAnnotation(X,F), C_FLabel(F).
      C_F(X,F) :- C(X), hasValueAnnotation(X,F), C_FValue(F).
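The classification rules above amount to a join of label facts with annotation facts, followed by a lookup in the ontology. The following Python sketch illustrates that join for the price concept only. It is an assumption-laden illustration: the tuples are made up to match each other exactly (on the slides the label reads "Min Price" while the annotated text reads "Min. price"), the `leafSegment` condition is omitted, and the function names are hypothetical.

```python
# Facts in the shape used on the slides; the concrete tuples are illustrative.
HAS_LABEL = [("e_515_select", "t_400", "Min. price")]
ANNOTATION = [("elclob_d1_1707", "elclob_d1", 2028, 2038, "Min. price")]
ANNOTATION_FEATURE = [
    ("elclob_d1_1707", "modifier", "min"),
    ("elclob_d1_1707", "minorType", "price"),
    ("elclob_d1_1707", "majorType", "reform.label"),
]
# Facet values taken from the ontology slide: priceType={min, max, ...}.
PRICE_TYPES = {"min", "max", "approximate", "range"}


def has_annotation():
    """Join labels with annotations on identical text (hasAnnotation(E,A))."""
    return {
        (element, value)
        for (element, _label, text) in HAS_LABEL
        for (aid, _doc, _start, _end, annotated) in ANNOTATION
        if annotated == text
        for (feature_of, _key, value) in ANNOTATION_FEATURE
        if feature_of == aid
    }


def classify(pairs):
    """Derive price elements (concept) and their price type (facet)."""
    price_elements = {e for (e, a) in pairs if a == "price"}
    price_type = {(e, a) for (e, a) in pairs
                  if e in price_elements and a in PRICE_TYPES}
    return price_elements, price_type


pairs = has_annotation()
elements, types = classify(pairs)
```

Under these assumed facts, `e_515_select` is classified as a price element with price type `min`, in line with the result shown on the next slide.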
  33. Domain Model – Classification
      priceElement(e_515_select,e_286_form).
      priceType(e_515_select,"min").
  34. Domain Model (diagram, built up step by step over slides 34 to 40)
      Five AreaBranch Elements form an Area-Branch Segment inside a Geographic Segment; Price Element(min) and Price Element(max) form a Price Segment; all segments belong to the Real-Estate Form.
  41. Overview
  42. Analysis and Evaluation
      • UK Real Estate Domain
      • 50 domain web forms (sampled from over 2800)
      • Tested Domain-independent and Domain-aware
  43. Analysis and Evaluation – Real-Estate
      Chart: Domain independent
  44. Analysis and Evaluation – Real-Estate
      Chart: Domain independent, Domain aware
  45. Conclusion and Future Work
      • Improve structural segmentation
        • Visual segmentation and labeling
        • Ontology guided segmentation
      • Accelerate domain adaptation
        • Calling for machine learning for ontology creation
      • Enhance ambiguity resolution
        • Necessitating probabilistic logic in the future
      • Interactive form filling / probing
  46. Thank you very much!
