Defensa.V11

Web User Behavior Analysis Doctorado en Sistemas de Ingeniería, Universidad de Chile. Prof. Guía: Juan D. Velásquez Pablo E. Román [email_address]

Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Motivation, Hypothesis, Achievement

[object Object],[object Object],[object Object],[object Object],[object Object],Adaptive Web Sites

Why we study the web user browsing behavior? ,[object Object],[object Object],[object Object],[object Object],[object Object]

The main Problem ,[object Object],[object Object]

Research hypothesis ,[object Object]

The Thesis Proposal Web Intelligence A.I. in the Web Web Mining Knowledge Representation Advanced Inf Tech. in the Web Agent Ubiquitous Sys. Wireless Sys. Grid & Cloud Sys. Social Network Web Structure Mining Web Content Mining Web Usage Mining Web user neurocomputing Neurophysiology model for the analysis of the behavior discovering pattern of web user navigational behavior from the set of user’ trails

Web user neurocomputing in Brief ,[object Object],[object Object]

Machine learning vs. First principle model ,[object Object],[object Object],[object Object],[object Object],[object Object],Proposed Solution

Thesis dissertation: Main Contributions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Web data: Content (text, object,..) You can put anything you want on a Web page, from family to business info…. Hyperlink structure

Proposal: Data sources ,[object Object],[object Object]

Problem: Web data pre-processing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Traditional approach for sessionization ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Traditional heuristic for sessionization How to identify individual web users? Filtering: IP+Browser(Agent) Timeout of 30 minute Path completion: shortest path backward

Sessionization: The proposal ,[object Object]

Integer Programming for sessionization ( WI-IAT08 R. Dell, P.Roman, J. Velasquez ) ,[object Object],[object Object],Log register Sessions

Integer Program ~ Maximize the number of sessions. ( WI-IAT08, KES09 P. Roman et al ) Register used once One register on o Structure and time

Network model: Minimize number of session. (IDA10 P. Roman et al) ,[object Object],[object Object],[object Object],Source Sink … Z=3 1 0 0 (1,1) 0 0 0 1 : flow of a session 1 (1,1) 1’ 2 (1,1) 2’ 3 (1,1) 3’ 1 0 Now is feasible N N’ (1,1) 4 4’ 1 1 1 0 0

Experiment: Large scale (15 month) DII departmental web site. ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://www.dii.uchile.cl/

A large scale experiment evaluation: F-Score over cookie retrieved sessions. (IDA10 P. Roman et al) ,[object Object],[object Object],Traditional sessionization Both proposal

A large scale experiment evaluation: F-Score over cookie retrieved sessions. Compared with 15 months of cookie retrieval Method Precision Recall F-Score Time Sessionization Integer Programming (SIP) 0.7788 0.6696 0.7201 6 Hour Network Flow (BCM) 0.7777 0.6671 0.7182 4 Min Canonical Sessionization 0.5091 0.6996 0.5993 1 Min

Summary: Pre-processing ,[object Object],[object Object],[object Object],[object Object],[object Object]

Strong Regularities: Distribution of sessions (WI-IAT08 P. Roman et al) ,[object Object],[object Object],[object Object]

Regularity  presence of internal rule (2008, CLAIO P. Roman et al.) ,[object Object],[object Object],[object Object],What we need is a theory for explaining such regularities!

Proposal: To adapt Psychological theory to web navigation, using web data. ,[object Object],[object Object],[object Object],[object Object],[object Object]

Biological experiment (1970-2005) ,[object Object],[object Object],[object Object],[object Object]

Neurophysiology of decision making: First Principles ,[object Object],[object Object],X 1 X 2 It decides option 1. Two options 0

LCA Model (Leacky Competing Accumulator) [M. Usher et al, 2001] ,[object Object],[object Object],[object Object],[object Object],[object Object],Important parameter!! X I

Application: The browsing process ,[object Object],[object Object],[object Object]

Modeling the likelihood of choosing each option (vector I ) ,[object Object],[object Object],[object Object]

Likelihood of a decision and web user utility ,[object Object],[object Object],[object Object],[object Object]

Assumption & Approximation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Adaptation of the LCA model (WI-IAT10, P. Roman et al) ,[object Object],[object Object],[object Object],Evidence Inhibition Dissipation Noise

The Fokker-Planck equation: probability density of not reaching a decision (AWIC09 P. Roman et al) . Never reach a decision in t’<t Neural activity is positive Neural activity is initially near to 0 Probability density

The probability of reaching a decision in time t. The probability of deciding option “j” in time “t”

Unconstrained exact solution Hermite Polinomials Exact solution (Ornstein-Uhlenbeck)

Exact unconstrained solution evolution ,[object Object],[object Object],[object Object],[object Object],(Ornstein-Uhlenbeck)

This approach is threefold ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Simulation: Monte Carlo simulation ,[object Object],[object Object]

Simulation algorithm: Deciding which link to follows.

Results: Simulated session length distribution ( BICA08, AWIC09, BAO10 ). ,[object Object],[object Object],[object Object],[object Object],[object Object]

Results: Number of visits per page. ,[object Object]

Adjustment of distribution of time used per session. Simulated session ,[object Object],[object Object],[object Object]

In Summary ,[object Object],[object Object]

Calibration ( WI-IAT 2010, P. Roman et al ) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Parameter Inference ,[object Object],[object Object],[object Object],[object Object],[object Object],Maximum log-likelihood: ,[object Object],[object Object],j i

Curse of dimensionality ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Distribution of number of links per page.

Proposal (1): To use symbolic processing ,[object Object],[object Object],[object Object],[object Object],[object Object],/ 1 - 1 ^ X 2

Proposal (2): Use the time propagator of the Cauchi problem ,[object Object],[object Object]

Proposal (3): Penalization method for ensuring border condition ,[object Object],F P (x) =(1-x) 2n +x 2n

Approximating the probability distribution Φ ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Experiment: DII departmental web site. ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Calibration of parameter ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],λ κ σ

Simulation of in the DII site ,[object Object],[object Object]

Comparing with ML approach ,[object Object],[object Object],[object Object]

Situation after 1 month: stability of the calibration. ,[object Object],[object Object],[object Object],[object Object]

In Summary ,[object Object],[object Object],[object Object]

Conclusion (1) ,[object Object],[object Object],[object Object],[object Object]

Conclusions (2) ,[object Object],[object Object],[object Object]

Conclusion (3) ,[object Object],[object Object]

Future Work ,[object Object],[object Object],[object Object],[object Object],[object Object]

Publications: Book Chapters ,[object Object],[object Object],Publications: International Journal ,[object Object],[object Object]

International Conferences ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Publications: National Conferences ,[object Object],[object Object],[object Object],[object Object],Publications: National review ,[object Object]

Thanks you for your attention.

Defensa.V11

Recommended

Recommended

More Related Content

What's hot

What's hot (9)

Viewers also liked

Viewers also liked (8)

Similar to Defensa.V11

Similar to Defensa.V11 (20)

Recently uploaded

Recently uploaded (20)

Defensa.V11

Editor's Notes