OPAL Presentation

Published in: Technology, Education

Transcript

  • 1. OPAL – Real Understanding of Real Estate Forms
    Xiaonan Guo, Oxford University Computing Laboratory, DIADEM group
  • 2. Diversity in Web Form Design
    May 27, 2011
    OPAL - ontology-based web pattern analysis with logic
  • 7. OPAL – Ontology-based web Pattern Analysis with Logic: Outline
    Overview
    Data models and Mappings
    Browser, Segmentation, Annotation, and Domain Model
    Segmentation and Phenomenological Mapping
    Analysis and Evaluation
    Future Work
  • 8. Overview
    Domain-independent
    Hierarchical modeling
    Domain knowledge
    +
    Declarative Definition
    Domain-aware
    Form Analysis
    OPAL
  • 9. Overview
  • 10. Browser Model
  • 13. Browser Model
    html_element(e_320_input,320,321,319,input,d1).
    html_attr(e_320_input_name,e_320_input,name,"location",d1).
    html_attr(e_320_input_type,e_320_input,type,"radio",d1).
    ...
    box(e_320_input,106,253,120,267,14,14).
    css_attr(e_320_input,bottom,"auto").
    css_attr(e_320_input,clear,"none").
    css_attr(e_320_input,color,"rgb(0,0,0)").
    ...
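
The facts above can be generated mechanically from a rendered page. The following is a minimal sketch, not OPAL's actual exporter: it assumes the seven arguments of `box` are the element id, left, top, right, bottom, width and height (which is consistent with the values 106, 253, 120, 267, 14, 14 shown on the slide), and that the id of an `html_attr` fact is the element id followed by the attribute name. The helper names `Box`, `box_fact` and `attr_fact` are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Bounding box of a rendered element, in pixels (assumed layout)."""
    left: int
    top: int
    right: int
    bottom: int


def box_fact(element_id: str, box: Box) -> str:
    """Render a box fact: id, the four edges, then the derived width and height."""
    width = box.right - box.left
    height = box.bottom - box.top
    return (f"box({element_id},{box.left},{box.top},"
            f"{box.right},{box.bottom},{width},{height}).")


def attr_fact(element_id: str, name: str, value: str, doc: str = "d1") -> str:
    """Render an html_attr fact; the fact id is assumed to be '<element>_<name>'."""
    return f'html_attr({element_id}_{name},{element_id},{name},"{value}",{doc}).'


if __name__ == "__main__":
    print(box_fact("e_320_input", Box(106, 253, 120, 267)))
    print(attr_fact("e_320_input", "type", "radio"))
```

Run as a script, this prints the same `box` and `html_attr(...,type,"radio",d1)` facts that appear on the slide.
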
  • 14. Segmentation Model
    grouping of related form elements, e.g. fields, labels
    achieved via Segmentation Mapping, which
    groups form fields
    assigns labels to fields and groups
  • 15. Segmentation Model – groups
    Form elements are grouped if
    they occur in sequence
    they have similarities in attribute values or appearances
    their least common ancestor contains no other elements
    group(Es) :- similarFieldSequence(Es), leastCommonAncestor(A,Es),
        not hasAdditionalField(A,Es).
    leastCommonAncestor(A,Es) :- commonAncestor(A,Es),
        not ( child(C,A), commonAncestor(C,Es) ).
    partOf(E,A) :- group(Es), member(E,Es), leastCommonAncestor(A,Es).
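
The grouping rules can be read procedurally. The sketch below is an illustration in Python rather than the logic program itself: the toy DOM in `PARENT`, the set `FIELDS` and all function names are hypothetical, and the `similarFieldSequence` test is assumed to have already selected the candidate list.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy DOM: node -> parent (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "form": None,
    "tr_area": "form",
    "td_area": "tr_area",
    "radio_1": "td_area",
    "radio_2": "td_area",
    "radio_3": "td_area",
    "tr_price": "form",
    "select_min": "tr_price",
}
FIELDS: Set[str] = {"radio_1", "radio_2", "radio_3", "select_min"}


def ancestors(node: str) -> List[str]:
    """Proper ancestors of a node, nearest first."""
    out: List[str] = []
    cur = PARENT[node]
    while cur is not None:
        out.append(cur)
        cur = PARENT[cur]
    return out


def least_common_ancestor(elements: List[str]) -> Optional[str]:
    """Deepest node that is an ancestor of every element."""
    common = set(ancestors(elements[0]))
    for element in elements[1:]:
        common &= set(ancestors(element))
    # Ancestors are listed nearest first, so the first common one is the deepest.
    for candidate in ancestors(elements[0]):
        if candidate in common:
            return candidate
    return None


def has_additional_field(ancestor: str, elements: List[str]) -> bool:
    """True if some field outside the candidate list also sits below the ancestor."""
    return any(f not in elements and ancestor in ancestors(f) for f in FIELDS)


def group(elements: List[str]) -> Optional[str]:
    """Return the least common ancestor if the elements form a group, else None."""
    lca = least_common_ancestor(elements)
    if lca is None or has_additional_field(lca, elements):
        return None
    return lca


def part_of(elements: List[str]) -> List[tuple]:
    """partOf(E, A) pairs for a valid group, empty otherwise."""
    lca = group(elements)
    return [] if lca is None else [(e, lca) for e in elements]
```

With this toy DOM, the three radio buttons form a group under `td_area`, while any two of them do not, because the third radio button is an additional field below the same ancestor.
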
  • 16. Segmentation Model – groups
    group([e_320_input,e_326_input,
    e_332_input,e_338_input,e_344_input]).
    leastCommonAncestor(
    e_319_td,[e_320_input,e_326_input,...,e_344_input]).
    partOf(e_320_input,e_319_td).
    partOf(e_326_input,e_319_td).
    partOf(e_332_input,e_319_td).
    partOf(e_338_input,e_319_td).
    partOf(e_344_input,e_319_td).
  • 17. Segmentation Model – labels
    Texts are assigned to fields and groups using
    Field: HTML <label>, Greatest unique ancestor
    Segment: Text-field alignment in groups
    Page: Visual alignment
  • 18. Segmentation Model – labels
    Field-level labels via HTML <label>:
    hasBasicLabel(E,L,T) :-
    html_element(E,_,_,_,input,_), html_attr(_,E,id,ID,_),
    html_element(N,_,_,_,label,_), html_attr(_,N,for,ID,_),
    child(L,N), html_text(L,T,_).
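
The same `for`/`id` join can be tried outside the logic engine. This is a minimal sketch using Python's standard `html.parser`; the class `LabelCollector` and the function `basic_labels` are hypothetical names, and, like the rule above, the sketch only considers `input` elements.

```python
from html.parser import HTMLParser
from typing import Dict, Optional, Set


class LabelCollector(HTMLParser):
    """Collects input ids and the text of <label for=...> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.field_ids: Set[str] = set()
        self.labels: Dict[str, str] = {}      # value of 'for' -> label text
        self._current_for: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "input" and attr_map.get("id"):
            self.field_ids.add(attr_map["id"])
        elif tag == "label" and attr_map.get("for"):
            self._current_for = attr_map["for"]
            self.labels.setdefault(self._current_for, "")

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._current_for is not None:
            self.labels[self._current_for] += data


def basic_labels(html: str) -> Dict[str, str]:
    """Map each input id to the text of the label whose 'for' matches it."""
    parser = LabelCollector()
    parser.feed(html)
    parser.close()
    return {fid: parser.labels[fid].strip()
            for fid in parser.field_ids if fid in parser.labels}
```

For example, `basic_labels('<label for="minbeds">Min. beds</label><input id="minbeds">')` pairs the field `minbeds` with the text `Min. beds`; an input without a matching label is simply left out.
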
  • 19. Segmentation Model – labels
    Field-level labels via greatest unique ancestor:
    greatestUniqueAncestor(A,E) :- uniqueDescendant(E,A),
    not ( parent(P,A), uniqueDescendant(E,P) ).
    hasBasicLabel(E,L,T) :-
    group(E), greatestUniqueAncestor(A,E), descendant(L,A), html_text(L,T,_).
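
One way to read the greatest-unique-ancestor rule: climb from a field towards the root for as long as the field remains the only form field in the subtree, then take the texts found in that subtree as label candidates. The sketch below illustrates this on a single field under stated assumptions; the toy DOM (`PARENT`, `FIELDS`, `TEXT`) and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy DOM: node -> parent (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "form": None,
    "p_min": "form",
    "span_min": "p_min",
    "text_min": "span_min",
    "select_min": "p_min",
    "p_max": "form",
    "text_max": "p_max",
    "select_max": "p_max",
}
FIELDS: Set[str] = {"select_min", "select_max"}
TEXT: Dict[str, str] = {"text_min": "Min price", "text_max": "Max price"}


def descendants(node: str) -> List[str]:
    """All nodes strictly below the given node, in document order."""
    out: List[str] = []
    for child, parent in PARENT.items():
        if parent == node:
            out.append(child)
            out.extend(descendants(child))
    return out


def unique_descendant(field: str, ancestor: str) -> bool:
    """True if the field is the only form field below the ancestor."""
    return {d for d in descendants(ancestor) if d in FIELDS} == {field}


def greatest_unique_ancestor(field: str) -> Optional[str]:
    """Highest ancestor whose subtree contains no field other than this one."""
    best: Optional[str] = None
    cur = PARENT[field]
    while cur is not None and unique_descendant(field, cur):
        best = cur
        cur = PARENT[cur]
    return best


def basic_labels(field: str) -> List[Tuple[str, str]]:
    """Text nodes below the greatest unique ancestor, as (node, text) pairs."""
    ancestor = greatest_unique_ancestor(field)
    if ancestor is None:
        return []
    return [(n, TEXT[n]) for n in descendants(ancestor) if n in TEXT]
```

Here `select_min` climbs to `p_min` but not to `form`, since `form` also contains `select_max`, so its only label candidate is the text "Min price".
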
  • 20. Segmentation Model – labels
    Segment-level labels via text-field alignment in groups:
    hasLabel(E,L,T) :-
    partOf(E,G), leastCommonAncestor(G,Es), group(Es), hasNoLabel(Es), textLists(LLs,G),
    sameLength(Es,LLs),
    labelOneToOne(E,Ls,Es,LLs),
    member(L,Ls),
    html_text(L,T,_).
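
The core of this rule is a positional pairing: when a group has no labels yet and its container holds exactly as many text lists as fields, the i-th text list is assigned to the i-th field. A minimal sketch follows, assuming `hasNoLabel` means that none of the group's fields has a label so far; the function name `align_one_to_one` is hypothetical, and the sample texts are taken from the label facts shown on later slides.

```python
from typing import Dict, List, Set


def align_one_to_one(fields: List[str],
                     text_lists: List[List[str]],
                     labelled: Set[str]) -> Dict[str, List[str]]:
    """Pair fields with text lists by position, or return {} if the rule does not apply."""
    if any(f in labelled for f in fields):
        return {}          # the group already has labels
    if len(fields) != len(text_lists):
        return {}          # no one-to-one correspondence
    return dict(zip(fields, text_lists))


if __name__ == "__main__":
    fields = ["e_320_input", "e_326_input", "e_332_input"]
    texts = [["Nailsea / Backwell"], ["Portishead / Pill"], ["Clevedon"]]
    for field, labels in align_one_to_one(fields, texts, set()).items():
        print(field, labels)
```

Both lists are expected in document order; the sketch does not check visual proximity, which the full system may well take into account.
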
  • 21. Segmentation Model – labels
    Page-level labels via visual alignment:
    hasLabel(E,L,LText) :-
    hasTextBox(E,B),
    descendantTextOf(L,B),
    html_text(L,_,_,_,_),
    node_text(L,LText,true).
  • 23. Segmentation Model – labels
    hasLabel(e_320_input,t_322,"Nailsea / Backwell").
    hasLabel(e_358_select,t_354,"Min. beds").
  • 24. Segmentation Model – labels
    hasLabel(e_319_td,t_316,"Area: ").
  • 25. Segmentation Model – labels
    hasLabel(e_304_input,t_302,"To Buy:").
    hasLabel(e_308_input,t_306,"To Rent:").
    hasLabel(e_320_input,t_322,"Nailsea / Backwell").
    hasLabel(e_326_input,t_328,"Portishead / Pill").
    hasLabel(e_338_input,t_340,"Yatton / Congresbury").
    hasLabel(e_332_input,t_334,"Clevedon").
    hasLabel(e_344_input,t_346,"Bristol / Weston-super-mare").
    hasBasicLabel(e_358_select,t_354,"Min Beds").
    hasBasicLabel(e_515_select,t_400,"Min Price").
    hasBasicLabel(e_705_select,t_594,"Max Price").
    hasBasicLabel(e_788_select,t_784,"View Order").
    hasLabel(e_319_td,t_316,"Area").
    hasLabel(e_297_tbody,t_290,"Find a property to buy or rent…").
  • 26. Overview
  • 27. Annotation Model
    Obtained from Browser model
    Relying on domain-specific knowledge, represents
    linguistic annotations
    machine learning based classifications
  • 28. Annotation Model
    annotation(attrclob_d1_5560,attrclob_d1,80,87,"Nailsea").
    annotationFeature(attrclob_d1_5560,"majorType","location").
    annotationFeature(attrclob_d1_5560,"minorType","district_county_etc").
    annotation(elclob_d1_1707,elclob_d1,2028,2038,"Min. price").
    annotationFeature(elclob_d1_1707,"modifier","min").
    annotationFeature(elclob_d1_1707,"minorType","price").
    annotationFeature(elclob_d1_1707,"majorType","reform.label").
  • 29. Domain Model
    Describes conceptual entities on forms as in domain ontology
    Achieved via phenomenological mapping, which
    correlates labels with annotations
    classifies form elements
  • 30. Domain Model – Ontology
    purpose={buy, rent, combined}
  • 31. Domain Model – Ontology
    priceType={min, max, approximate, range}
  • 32. Domain Model – Classification
    Form elements are annotated as follows
    hasAnnotation(E,A) :-
        hasLabel(E,_,T),
        annotation(Aid,_,_,_,T),
        annotationFeature(Aid,_,A).
    Form elements are classified with Concept C and Facets C_F
    C(X) :- leafSegment(X), hasAnnotation(X,A), Clabel(A).
    C_F(X,F) :- C(X), hasAnnotation(X,F), C_FLabel(F).
    C_F(X,F) :- C(X), hasValueAnnotation(X,F), C_FValue(F).
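
The classification step is essentially two joins: labels are joined with annotations on their text, and the resulting annotation values decide the concept and its facets. The Python sketch below instantiates the rule templates for a single concept (price) with a single facet (price type). It is an illustration under assumptions: the short annotation ids `a1` and `a2`, the "Max. price" annotation, and the constants `CONCEPT_LABEL` and `FACET_LABELS` are hypothetical, every labelled element is treated as a leaf segment, and the value-annotation variant of the facet rule is omitted.

```python
from typing import List, Set, Tuple

# hasLabel(Element, TextNode, Text); texts are chosen to match the annotations exactly.
HAS_LABEL: List[Tuple[str, str, str]] = [
    ("e_515_select", "t_400", "Min. price"),
    ("e_705_select", "t_594", "Max. price"),
]
# annotation(Id, Text), reduced to the two columns the join needs.
ANNOTATION: List[Tuple[str, str]] = [("a1", "Min. price"), ("a2", "Max. price")]
# annotationFeature(Id, Feature, Value).
ANNOTATION_FEATURE: List[Tuple[str, str, str]] = [
    ("a1", "minorType", "price"), ("a1", "modifier", "min"),
    ("a2", "minorType", "price"), ("a2", "modifier", "max"),
]

CONCEPT_LABEL = "price"        # plays the role of Clabel for the price concept
FACET_LABELS = {"min", "max"}  # plays the role of C_FLabel for the price type


def has_annotation() -> Set[Tuple[str, str]]:
    """(element, annotation value) pairs, joined on the label text."""
    pairs: Set[Tuple[str, str]] = set()
    for element, _node, text in HAS_LABEL:
        for aid, annotated_text in ANNOTATION:
            if annotated_text != text:
                continue
            for feature_aid, _feature, value in ANNOTATION_FEATURE:
                if feature_aid == aid:
                    pairs.add((element, value))
    return pairs


def classify() -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return the price elements and their (element, priceType) facets."""
    annotated = has_annotation()
    price_elements = {e for e, value in annotated if value == CONCEPT_LABEL}
    price_types = {(e, value) for e, value in annotated
                   if e in price_elements and value in FACET_LABELS}
    return price_elements, price_types
```

For `e_515_select` this yields a price element with price type `min`. Note that the join is on exact text, so a label spelled "Min Price" would not match an annotation on "Min. price" in this sketch.
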
  • 33. Domain Model – Classification
    priceElement(e_515_select,e_286_form).
    priceType(e_515_select,"min").
  • 34. Domain Model
    [Diagram built up over slides 34 to 40: classified form elements are grouped into segments, and the segments into the form]
    Real-Estate Form
      ...
      Geographic Segment
        Area-Branch Segment
          AreaBranch Element (five elements)
      ...
      Price Segment
        Price Element(min)
        Price Element(max)
      ...
  • 41. Overview
  • 42. Analysis and Evaluation
    UK Real Estate Domain
    50 domain web forms (sampled from over 2800)
    Tested Domain-independent and Domain-aware
  • 43. Analysis and Evaluation - Real-Estate
    Domain independent
  • 44. Analysis and Evaluation - Real-Estate
    Domain independent
    Domain aware
  • 45. Conclusion and Future Work
    Improve structural segmentation
    Visual segmentation and labeling
    Ontology guided segmentation
    Accelerate domain adaptation
    Calling for machine learning for ontology creation
    Enhance ambiguity resolution
    Necessitating probabilistic logic in the future
    Interactive form filling / probing
  • 46. Thank you very much!