Semantic Analysis of User Browsing Patterns in the Web of Data @USEWOD, WWW2012

Enabling Semantic Analysis of User Browsing Patterns in the Web of Data
USEWOD Workshop, @WWW2012

A useful step towards better interpretation and analysis of usage patterns is to formalize the semantics of the resources that users access on the Web. We focus on this problem and present an approach for the semantic formalization of usage logs, which lays the basis for effective techniques for querying expressive usage patterns.
We also present a query answering approach that finds expressive patterns of usage behavior in the logs by formulating semantic and temporal constraints.
We have processed over 30 thousand user browsing sessions extracted from usage logs of DBPedia and Semantic Web Dog Food. The logs are semantically formalized using the respective domain ontologies and RDF representations of the Web resources being accessed. We show the effectiveness of our approach through experimental results, providing an exploratory analysis of the way users browse the Web of Data.

Published in: Technology, Education

Transcript

  1. Enabling Semantic Analysis of User Browsing Patterns in the Web of Data
     M.Sc. Julia Hoxha
     Institute of Applied Informatics and Formal Description Methods (AIFB), Karlsruhe Institute of Technology
     USEWOD Workshop @WWW2012, Lyon, France
     KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association, www.kit.edu
  2. Paper
     • Hoxha, J., Junghans, M., and Agarwal, S. (2012). Enabling Semantic Analysis of User Browsing Patterns in the Web of Data. In 2nd International Workshop on Usage Analysis and the Web of Data (USEWOD), 21st International World Wide Web Conference (WWW2012), Lyon, France. CoRR abs/1204.2713.
     • http://arxiv.org/abs/1204.2713
  3. Outline
     • Introduction
     • Framework for Behavior Analysis
     • Semantic Modeling of Cross-site Browsing Behavior
     • Web Browsing Activity Model (WAM)
     • Formalization Approach
     • Querying Behavioral Patterns
     • Evaluation
     • Conclusions
     J. Hoxha – USEWOD Workshop, Lyon, 2012
  4. Introduction
     • Understanding user behavior in accessing Web resources helps site providers/domain experts:
       - Discover user preferences or detect bottlenecks
       - Build adaptive Web sites
       - Make appropriate recommendations to users, etc.
     • How to facilitate the analysis of usage patterns?
       - Provide formal, semantic description of usage logs
       - Offer techniques to expressively query patterns
     • [Figure: HTTP Requests of Usage Logs]
       ID  Time           User Action
       1   [17:11:49:21]  http://www.google.de/search?q=Lyon+www2012
       1   [17:11:49:33]  http://dbpedia.org/page/Lyon
       1   [17:11:49:39]  http://data.semanticweb.org/conference/www/2011/demo/a-demo-search-engine-for-products
     • [Figure: SWDF Domain Ontology. Node labels include swrc:Publication, swrc:Proceedings, swrc:Conference, InProceedings, Event, foaf:Person, dbpedia:PopulatedPlace, literal; edge labels include isA, ns2:relatedToEvent, dc:creator, ns3:based_near, ns1:name.]
  5. Modeling and Analysis Framework
     • [Figure: layered framework. Monitoring layer: Web Browsing Behavior Monitoring System, Monitoring Cross-site Browsing Activities (User 1 ... User n). Semantic Formalization layer: Selection, Preprocessing, Transformation, Browsing Activity Formalization, with Target Data, Preprocessed Data, Transformed Data, Annotation with Domain Ontology, Semantic Activity Models, and a repository of Domain Ontologies. Analysis layer: Querying Capabilities, Pattern Mining.]
     • User session of browsing events: s: <l1, l2, l3, ..., ln>
     • Event e1 = (A1, I1, t1): type Ai = {content, function}, input I1 = {i1, ..., ik}, URL l1, time t1
     • Event en = (An, In, tn): type An, input In = {i1, ..., ik}, URL ln, time tn
  6. Definitions
     • Event [formula missing in transcript]
       - l: full URL invoked, T: types, P: parameters, t: timestamp
     • Event types
       - Tc: content type of an event
       - Tf: function type of an event
     • Session
       - s is an ordered sequence of events
       - [formula missing in transcript], s.t. i is the event order in s
       - Ts: start time and Te: end time, s.t. [constraint missing in transcript]
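The definitions above can be sketched as plain data structures. This is only an illustration: the field names are invented here, and because the slide's formulas did not survive in the transcript, the sketch assumes that session order follows the timestamps and that Ts and Te are the times of the first and last event.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, FrozenSet, List


@dataclass
class Event:
    url: str                        # l: full URL invoked
    content_types: FrozenSet[str]   # Tc: content types of the event
    function_types: FrozenSet[str]  # Tf: function types of the event
    params: Dict[str, str]          # P: request parameters
    time: datetime                  # t: timestamp


@dataclass
class Session:
    events: List[Event] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: the session order follows the event timestamps.
        self.events.sort(key=lambda e: e.time)

    @property
    def start_time(self) -> datetime:
        # Ts, assumed here to be the time of the first event
        return self.events[0].time

    @property
    def end_time(self) -> datetime:
        # Te, assumed here to be the time of the last event
        return self.events[-1].time

    def order_of(self, event: Event) -> int:
        # i: 1-based position of the event in the session
        return self.events.index(event) + 1


# Hypothetical data modelled on the log excerpt of the Introduction slide;
# the date and the type labels are made up for illustration.
search = Event("http://www.google.de/search?q=Lyon+www2012", frozenset(),
               frozenset({"SearchEngine"}), {"q": "Lyon www2012"},
               datetime(2012, 4, 17, 11, 49, 21))
page = Event("http://dbpedia.org/page/Lyon", frozenset({"PopulatedPlace"}),
             frozenset(), {}, datetime(2012, 4, 17, 11, 49, 33))
session = Session([page, search])
```

Sorting once in the constructor keeps the order index and the start and end times consistent with each other, at the cost of assuming that timestamps alone define the order.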
  7. Web Browsing Activity Model (WAM)
     • [Figure: WAM ontology graph; only the labels are recoverable from the transcript.]
     • Classes: wam:Session, wam:User, wam:Event, wam:StartEvent, wam:EndEvent, wam:EventType, wam:FunctionType, wam:ContentType, wam:EventURL, wam:BaseURL, wam:Parameter, owa:Parameter Name, wam:InputVariable, wam:OutputVariable, time:TemporalEntity, time:Instant, time:Interval, event:Event, Literal
     • Properties: wam:hasUser, wam:userID, wam:userIP, wam:hasEvent, wam:hasStartEvent, wam:hasEndEvent, wam:order, wam:hasTime, wam:inInterval, wam:eventURL, wam:fullURL, wam:baseURL, wam:functionType, wam:contentType, wam:hasParameter, wam:hasInput, wam:hasName, wam:hasValue, rdfs:subClassOf
     • Annotations: "Domain Ontology used for semantic enrichment"; "Based on function and content"
     • Example URLs:
       http://www.avis.com/car-rental/reservation/start-reservation.ac?resForm.pickUpLocation=Lyon
       http://data.semanticweb.org/person/julia-hoxha
     • Namespaces:
       wam: <http://greenlinkeddata.org/wam.owl#>
       time: <http://www.w3.org/2006/time#>
       event: <http://purl.org/NET/c4dm/event.owl#>
       rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
       rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  8. Formalization Approach
     • Formalization based on WAM ontology
       - Step 1. Semantic Enrichment
       - Step 2. Extend Knowledge Base (ABox assertions for events & domain ontology)
       - Step 3. RDF Serialization
     • Semantic Enrichment
       - For each link in logs, find URI of Web resource
       - Find RDF representation of the resource (via a Mapping Template)
       - Extract ontology classes to which it belongs, used as ContentType of event (Person, ResearchGroup, Publication, MusicGroup, etc.)
       - e.g. SWDF:
         http://data.semanticweb.org/person/julia-hoxha/html (HTML)
         http://data.semanticweb.org/person/julia-hoxha (URI)
         http://data.semanticweb.org/person/julia-hoxha/rdf (RDF/XML)
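A minimal sketch of the Semantic Enrichment step, assuming the mapping template for SWDF amounts to the suffix rule visible in the example URLs above (`/html` and `/rdf` appended to the resource URI). The function names and the `TYPE_INDEX` lookup are hypothetical; the lookup stands in for fetching the RDF document and reading the classes the resource belongs to.

```python
from typing import Dict, Set

# Assumed from the single SWDF example on the slide; other sites would
# need their own mapping template.
SUFFIXES = ("/html", "/rdf")


def resource_uri(logged_url: str) -> str:
    """Strip a representation suffix to recover the resource URI."""
    for suffix in SUFFIXES:
        if logged_url.endswith(suffix):
            return logged_url[: -len(suffix)]
    return logged_url


def rdf_url(logged_url: str) -> str:
    """Address of the RDF/XML representation of the logged resource."""
    return resource_uri(logged_url) + "/rdf"


# Hypothetical stand-in for parsing the fetched RDF and collecting the
# ontology classes of the resource.
TYPE_INDEX: Dict[str, Set[str]] = {
    "http://data.semanticweb.org/person/julia-hoxha": {"Person"},
}


def content_types(logged_url: str) -> Set[str]:
    """Ontology classes used as the ContentType of the event."""
    return TYPE_INDEX.get(resource_uri(logged_url), set())


logged = "http://data.semanticweb.org/person/julia-hoxha/html"
print(resource_uri(logged))
print(rdf_url(logged))
print(content_types(logged))
```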
  9. Semantic Analysis
     • Querying with semantic constraints
       - Example: In how many sessions within Mar-Apr 2011 did users search in Google and afterwards visit a page in SWDF?
       - Various levels of abstraction: e.g. instead of Google -> any search engine; instead of any page -> WWW2011 page; or even higher abstraction -> Conference page ("WWW2011" isA "Conference")
       - [Figure: session s: <e1, ..., e2, ef> with event attributes e1.time, e1.urlBase, e1.type]
     • Address also temporal constraints regarding the dynamics of user browsing behavior
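The effect of querying at different levels of abstraction can be illustrated with a toy isA hierarchy. Everything in this sketch is hypothetical: the `IS_A` table, the labels, and the sessions are invented, and a real system would obtain subsumption from an ontology reasoner rather than from a dictionary.

```python
from typing import Dict, List

# Toy isA hierarchy (assumption, for illustration only).
IS_A: Dict[str, str] = {
    "Google": "SearchEngine",
    "Bing": "SearchEngine",
    "WWW2011": "Conference",
}


def generalizations(label: str) -> List[str]:
    """The label itself followed by all of its ancestors in IS_A."""
    chain = [label]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def matches(label: str, query_term: str) -> bool:
    return query_term in generalizations(label)


def count_sessions(sessions: List[List[str]], first: str, then: str) -> int:
    """Count sessions where an event matching `first` is later followed by one matching `then`."""
    count = 0
    for labels in sessions:
        for i, label in enumerate(labels):
            if matches(label, first) and any(matches(b, then) for b in labels[i + 1:]):
                count += 1
                break
    return count


# Each session is reduced to the ordered type labels of its events.
sessions = [["Google", "WWW2011"], ["Bing", "WWW2011"], ["WWW2011", "Google"]]
concrete = count_sessions(sessions, "Google", "WWW2011")
abstract = count_sessions(sessions, "SearchEngine", "Conference")
```

With these toy sessions the concrete query matches only the first session, while the abstract one also matches the session that started at Bing.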
  10. Temporal Constraints
      • Consider real time (timestamps) and abstract time (order of events) to query usage patterns
        - Q: find sessions with start time Ts and end time Te containing an event e1 with URL www.ex1.org, eventually succeeded by another event e2 in the session with URL www.ex2.org
      • We address temporal logics capable of ontological reasoning
        - apply temporal operators, e.g. next, eventually, always (based on Linear Temporal Logic, LTL)
        - query formulated as LTL formula extended with DL axioms (LTL + DL)
        - Proposition A as a set of ABox assertions, e.g. [example missing in transcript]
      • [Figure: LTL formula in a state transition system. Next (X A): A is true at the next state. Eventually: A is true at some state on the path. Always: A is true at all states after the initial state s1 along the path.]
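A small sketch of how the operators next, eventually, and always can be evaluated over a finite sequence of states. It uses a common finite-trace reading of LTL (next is false at the last state; eventually and always include the current state), which is an assumption and not necessarily the semantics used in the paper. Propositions are plain strings here instead of sets of ABox assertions.

```python
from typing import Callable, List, Set

State = Set[str]                      # propositions holding in one state
Formula = Callable[[List[State], int], bool]


def atom(name: str) -> Formula:
    return lambda trace, i: name in trace[i]


def and_(p: Formula, q: Formula) -> Formula:
    return lambda trace, i: p(trace, i) and q(trace, i)


def next_(p: Formula) -> Formula:
    # False at the last state of a finite trace (strong next).
    return lambda trace, i: i + 1 < len(trace) and p(trace, i + 1)


def eventually(p: Formula) -> Formula:
    return lambda trace, i: any(p(trace, j) for j in range(i, len(trace)))


def always(p: Formula) -> Formula:
    return lambda trace, i: all(p(trace, j) for j in range(i, len(trace)))


# Query Q from the slide: an event with URL www.ex1.org that is eventually
# succeeded by another event with URL www.ex2.org.
query = eventually(and_(atom("www.ex1.org"),
                        next_(eventually(atom("www.ex2.org")))))

# Hypothetical sessions, one state per event, labelled by base URL.
forward = [{"www.ex0.org"}, {"www.ex1.org"}, {"www.ex3.org"}, {"www.ex2.org"}]
backward = [{"www.ex2.org"}, {"www.ex1.org"}]

holds_forward = query(forward, 0)
holds_backward = query(backward, 0)
```

Wrapping the second `eventually` in `next_` makes "succeeded by" strict, so a single event cannot satisfy both parts of the pattern.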
  11. DL-LTL Query Formulation
      • Queries formulate
        - 1) certain conditions on the session itself
        - 2) temporal patterns in the events within the session
      • Query Q(s): find sessions with start time Ts and end time Te containing an event e1 with content type "publication", eventually succeeded by another event e2 with function type "search engine"
        - 1) Conditions on the session itself
        - 2) Temporal patterns within a session, expressed as a DL-LTL formula, e.g. [formula missing in transcript]
  12. Query Answering Approach
      • Step 1. Check constraints on the session itself
      • Step 2. Verify temporal constraints applying model checking technique
        - Iterate over sessions S = {S1, S2, ..., Sn}
        - (a) build a finite state automaton (FSA) for each Si
        - (b) verification of DL-LTL formula: iterate over the states of the FSA to determine whether a condition holds in the respective state
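The two steps can be sketched for one fixed pattern ("an event of type A, later followed by an event of type B"). This is a simplification under stated assumptions: the slides describe building an FSA per session and verifying a DL-LTL formula with OWL reasoning, whereas this sketch replaces both with a hand-written two-state scan over events whose types are already attached. All names and data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set


@dataclass
class Event:
    time: datetime
    types: Set[str]    # content and function types, assumed precomputed


@dataclass
class Session:
    events: List[Event]   # assumed ordered by time


def in_window(session: Session, ts: datetime, te: datetime) -> bool:
    """Step 1: constraint on the session itself (time window)."""
    if not session.events:
        return False
    return ts <= session.events[0].time and session.events[-1].time <= te


def has_pattern(session: Session, first_type: str, later_type: str) -> bool:
    """Step 2: scan the events once, tracking a two-state automaton."""
    seen_first = False
    for event in session.events:
        if seen_first and later_type in event.types:
            return True
        if not seen_first and first_type in event.types:
            seen_first = True
    return False


def answer(sessions: List[Session], ts: datetime, te: datetime,
           first_type: str, later_type: str) -> List[Session]:
    return [s for s in sessions
            if in_window(s, ts, te) and has_pattern(s, first_type, later_type)]


def ev(day: int, minute: int, *types: str) -> Event:
    return Event(datetime(2009, 7, day, 12, minute), set(types))


s1 = Session([ev(1, 0, "Publication"), ev(1, 5, "SearchEngine")])
s2 = Session([ev(2, 0, "SearchEngine"), ev(2, 5, "Publication")])
s3 = Session([ev(20, 0, "Publication"), ev(20, 5, "SearchEngine")])
result = answer([s1, s2, s3], datetime(2009, 7, 1), datetime(2009, 7, 13),
                "Publication", "SearchEngine")
```

Running the cheap session-level check first means the temporal scan only touches sessions that can still match.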
  13. Evaluation
      • Validate feasibility of the formalization approach
      • Show feasibility of the query answering approach
        - Query sessions with different patterns
        - Measure performance
      • Formalization

                                SWDF 2009               DBPedia 3-3
        Monitoring period       01.Jul.09 - 12.Jul.09   01.Jul.09 - 12.Jul.09
        Avg. #sessions/day      235.9                   2899
        #sessions               2831                    31893

        - Only 1.46% of daily sessions containing SPARQL queries
      • [Figures: "SWDF 2009: % of sessions initiated in the domain" and "DBPedia 2009: % of sessions initiated in the domain"; recoverable chart labels: Bing 2.7%, Google 97%.]
  14. Evaluation (II)
      • Querying
        - answering time varies slightly for the queries (~0.15 seconds)
        - for up to 1000 sessions, answering time stays below 1.4 seconds
        - model checking time is small
        - OWL reasoning takes ~94% of the overall answering time
      • [Figure: answering time (sec) vs. nr. sessions, query Q1]
  15. Conclusions
      • Propose a framework for behavior modeling and analysis:
        - Approach for semantic formalization of logs
        - Techniques of querying patterns with temporal and semantic constraints
      • Challenges and Future Work
        - Find datasets of client-side navigation logs at multiple sites
        - Domain Ontology acquisition
        - Classification Techniques to find FunctionType
        - Optimization of Query Answering
