KDD for Personalization
tutorial on KDD for personalization

    • KDD for Personalization
        PKDD 2001 Tutorial, September 6, 2001
        Bamshad Mobasher - DePaul University, Chicago
        Bettina Berendt - Humboldt University Berlin
        Myra Spiliopoulou - Leipzig Graduate School of Management
    • Web Personalization
        • The Problem
            – dynamically serve customized content (pages, products, recommendations, etc.) to users based on their profiles, preferences, or expected interests
        • Personalization v. Customization
            – In customization, the user controls and customizes the site or the product based on his/her preferences
            – usually manual, but sometimes semi-automatic based on a given user profile
            – Personalization is done automatically based on the user’s actions, the user’s profile, and (possibly) the profiles of others with “similar” profiles
        PKDD 2001 Tutorial: “KDD for Personalization” [I-2]
    • Customization Example: my.yahoo.com
    • Personalization Example: amazon.com
    • A simplified scheme for personalization
        • The user selects information object(s)
            – what kind? document etc.; query
            – how? request, specification; rating
        • These are related to other information object(s)
            – why? similarity (syntactic/semantic); co-occurrence in other users' navigation histories; co-occurrence in the user's other navigation histories
        • The system recommends the other information object(s)
    • Know Thy Customer
        Customer Knowledge is Power
        “Relationships based on customer insight propel an organization from simply treating customers efficiently to treating them relative to their needs, preferences, and value potential. ... Knowing the customer is paramount in today's marketplace where the customer has more options, greater flexibility and higher expectations. ...”
        John [surname illegible in the transcript] (Accenture)
    • Customer knowledge implies:
        1.) Acquisition of customer data
        2.) Analysis of customer data
        3.) Action in accordance with the gained insights
    • Acquisition of customer data
        Customer data are recordings of:
        • preferences
        • transactions
        • pre-sales contacts
        • after-sales support
        • demographic information
        Some of these data:
            – may be purchased from third parties
            – may lie in multiple separate databases that serve completely different purposes
            – are of varying quality with respect to error rates, reliability, coverage, representativeness
        ⇒ Data Preparation
    • Analysis of customer data
        Data analysis should address questions like:
        • Which users will become customers?
        • Which customers will return again?
        • Who is more likely to respond to a promotion action?
        • Who would be interested in cross-sell/up-sell suggestions?
        closely related to questions like:
        • Is the Web site appropriately designed to serve the organisation's goals?
        • Are the customers satisfied?
        • Are the customers satisfied enough to come again?
        • Are the customers satisfied enough to become promoters of the site?
        ⇒ Data Mining
    • Action in accordance with the gained insights
        • Alignment of the marketing policy
        • Alignment of the supply chain, including after-sales support
        • Adjustment of the web site
            – Static site re-design
            – Browsing/Navigation suggestions
            – Recommendations on the page
            – Intelligent assistance
            – Personalized layout and content
        The time lag between insight and action should be minimized.
    • The action should create value
        • for the customer
        • for the organisation
    • A short excursion on value creation
        In B2C e-commerce, it is not sufficient to:
        • offer an existing product through the Internet
        • digitize part/all of the merchandizing chain
        • introduce a brilliant new product in the market
        The product must bring value to:
        • win the customer (Customer Conversion)
        • retain the customer (Customer Retention)
    • The model of Kuhlen considers the following types of value [32]:
        (1) comparative
        (2) improving efficiency
        (3) improving effectivity
        (4) integrative
        (5) organisational
        (6) strategic
        (7) innovative
    • From acquisition to action
        • There is no lack of data.
            – Clickstream data accumulate at a tremendous pace.
            – Demographic data can be acquired.
            – Customer profiles are available or can be acquired.
        • There is no lack of methodologies for data analysis.
        • The ability to exploit the data increases at a much slower pace and the number of personalized Web sites is not really large.
        • The tolerable elapse time between acquisition and action is below 1 [unit illegible in the transcript].
    • Personalization: An HCI perspective
        = does personalization increase usability?
        A Web site’s usability is high if users
            - achieve their goals / perform their tasks in little time,
            - do so with a low error rate,
            - experience high subjective satisfaction.
        Usability testing:
            - qualitative and quantitative methods
            - experts and "normal" users
            - questionnaires and experiments
        Usability is a special concern on the Web because, unlike with other products / software, "users experience usability first and pay later". (Nielsen [49] [B12])
    • Data Preparation for Personalization
    • Web Usage Mining
        • Discovery of meaningful patterns from data generated by client-server transactions on one or more Web servers
        • Typical Sources of Data
            – automatically generated data stored in server access logs, referrer logs, agent logs, and client-side cookies
            – e-commerce and product-oriented user events (e.g., shopping cart changes, ad or product click-throughs, etc.)
            – user profiles and/or user ratings
            – meta-data, page attributes, page content, site structure
    • What’s in a Typical Server Log?
        <ip_addr> <base_url> - <date> <method> <file> <protocol> <code> <bytes> <referrer> <user_agent>
        203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:21 -0600] "GET /Calls/OWOM.html HTTP/1.0" 200 3942 "http://www.lycos.com/cgi-bin/pursuit?query=advertising+psychology&maxhits=20&cat=dir" "Mozilla/4.5 [en] (Win98; I)"
        203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:23 -0600] "GET /Calls/Images/earthani.gif HTTP/1.0" 200 10689 "http://www.acr-news.org/Calls/OWOM.html" "Mozilla/4.5 [en] (Win98; I)"
        203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:24 -0600] "GET /Calls/Images/line.gif HTTP/1.0" 200 190 "http://www.acr-news.org/Calls/OWOM.html" "Mozilla/4.5 [en] (Win98; I)"
        203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:25 -0600] "GET /Calls/Images/red.gif HTTP/1.0" 200 104 "http://www.acr-news.org/Calls/OWOM.html" "Mozilla/4.5 [en] (Win98; I)"
        203.252.234.33 www.acr-news.org - [01/Jun/1999:03:32:31 -0600] "GET / HTTP/1.0" 200 4980 "" "Mozilla/4.06 [en] (Win95; I)"
        203.252.234.33 www.acr-news.org - [01/Jun/1999:03:32:35 -0600] "GET /Images/line.gif HTTP/1.0" 200 190 "http://www.acr-news.org/" "Mozilla/4.06 [en] (Win95; I)"
        203.252.234.33 www.acr-news.org - [01/Jun/1999:03:32:35 -0600] "GET /Images/red.gif HTTP/1.0" 200 104 "http://www.acr-news.org/" "Mozilla/4.06 [en] (Win95; I)"
        203.252.234.33 www.acr-news.org - [01/Jun/1999:03:32:35 -0600] "GET /Images/earthani.gif HTTP/1.0" 200 10689 "http://www.acr-news.org/" "Mozilla/4.06 [en] (Win95; I)"
        203.252.234.33 www.acr-news.org - [01/Jun/1999:03:33:11 -0600] "GET /CP.html HTTP/1.0" 200 3218 "http://www.acr-news.org/" "Mozilla/4.06 [en] (Win95; I)"
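The log layout shown above can be split into named fields with a regular expression. The following is only a sketch under assumptions: the names `LOG_PATTERN`, `parse_log_line`, `is_page_request` and the list of ignored file suffixes are illustrative and do not come from the tutorial, and real logs may deviate from this layout.

```python
import re
from datetime import datetime

# Field layout as on the slide: ip, host, "-", [date], "method file protocol",
# status code, bytes, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<host>\S+) \S+ '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<file>\S+) (?P<protocol>[^"]+)" '
    r'(?P<code>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: embedded objects with these suffixes are not pageviews themselves.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js")


def parse_log_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["code"] = int(entry["code"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


def is_page_request(entry):
    """Simple cleaning rule: keep successful requests that are not images etc."""
    return entry["code"] == 200 and not entry["file"].lower().endswith(IGNORED_SUFFIXES)


SAMPLE_LINE = (
    '203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:21 -0600] '
    '"GET /Calls/OWOM.html HTTP/1.0" 200 3942 '
    '"http://www.lycos.com/cgi-bin/pursuit?query=advertising+psychology&maxhits=20&cat=dir" '
    '"Mozilla/4.5 [en] (Win98; I)"'
)
IMAGE_LINE = (
    '203.30.5.145 www.acr-news.org - [01/Jun/1999:03:09:23 -0600] '
    '"GET /Calls/Images/earthani.gif HTTP/1.0" 200 10689 '
    '"http://www.acr-news.org/Calls/OWOM.html" "Mozilla/4.5 [en] (Win98; I)"'
)
```

Filtering the parsed entries with `is_page_request` corresponds to the "remove irrelevant references" step of data cleaning; which suffixes count as irrelevant depends on the site.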
    • The Web Usage Mining Process
        [Figure: Raw Usage Data, together with Content and Structure Data => Preprocessing => Preprocessed Clickstream Data => Pattern Discovery => Rules, Patterns, and Statistics => Pattern Analysis => "Interesting" Rules, Patterns, and Statistics]
    • Usage Data Preprocessing
        [Figure: Raw Usage Data => Data Cleaning => User/Session Identification => Page View Identification => Path Completion => Server Session File => Episode Identification => Episode File; Site Structure and Content feed into the steps, and Usage Statistics are a further output]
    • Data Preprocessing for Web Usage Mining
        • Data cleaning
            – remove irrelevant references and fields in server logs
            – remove references due to spider navigation
            – remove erroneous references
            – add missing references due to caching (done after sessionization)
        • Data integration
            – synchronize data from multiple server logs
            – integrate e-commerce and application server data
            – integrate meta-data (e.g., content labels)
            – integrate demographic / registration data
    • Data Preparation for Web Usage Mining (Cooley, Mobasher, Srivastava, 1999 [15])
        • Data Transformation
            – user identification
            – sessionization / episode identification
            – pageview identification
                • a pageview is a set of page files and associated objects that contribute to a single display in a Web browser
        • Data Reduction
            – sampling and dimensionality reduction (ignoring certain pageviews / items)
        • Identifying User Transactions (i.e., sets or sequences of pageviews, possibly with associated weights)
    • User and Session Identification: Need for Reliable Usage Data
        • Validity of results in Web usage mining is affected by the ability to:
            – distinguish among different users of a site
            – reconstruct the activities of the users within the site
        • Difficulties in obtaining reliable usage data
            – proxy servers and anonymizers
            – rotating IP addresses for connections through ISPs
            – missing references due to caching
            – inability of servers to distinguish among different visits
    • Identifying Users and Sessions
        • Server log L is a list of log entries, each containing timestamp, host identifier, URL request (including URL stem and query), referrer, agent, cookie, etc.
        • User identification and sessionization
            – a user activity log is a sequence of log entries in L belonging to the same user
            – user identification is the process of partitioning L into a set of user activity logs
            – the goal of sessionization is to further partition each user activity log into sequences of entries corresponding to each user visit
    • Sessionization Heuristics
        • Real v. Constructed Sessions
            – Conceptually, the log L is partitioned into an ordered collection of “real” sessions R
            – Each heuristic h partitions L into an ordered collection of “constructed sessions” Ch
            – The ideal heuristic h*: Ch* = R
        • Two Basic Types of Sessionization Heuristics
            – Time-oriented heuristics
            – Navigation-oriented heuristics
    • Time-Oriented Heuristics
        • Consider boundaries on time spent on individual pages or in the entire site during a single visit
            – Boundaries can be based on a maximum session length or a maximum time allowable for each pageview
            – Additional granularity can be obtained by using different boundaries for different (types of) pageviews
        h1: Given t0, the timestamp of the first request in a constructed session S, and a threshold θ, the request with timestamp t is assigned to S iff t - t0 ≤ θ.
        h2: Given t1, the timestamp of a request in constructed session S, and a threshold δ, the next request with timestamp t2 is assigned to S iff t2 - t1 ≤ δ.
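The two time-oriented heuristics can be sketched as follows. This is a minimal illustration, assuming the requests of one user are already given as an ascending list of numeric timestamps; the function names are hypothetical.

```python
def sessionize_h1(timestamps, theta):
    """h1: a request joins the current session iff t - t0 <= theta,
    where t0 is the timestamp of the first request of that session."""
    sessions = []
    for t in timestamps:
        if sessions and t - sessions[-1][0] <= theta:
            sessions[-1].append(t)
        else:
            sessions.append([t])  # start a new constructed session
    return sessions


def sessionize_h2(timestamps, delta):
    """h2: a request joins the current session iff the gap to the
    previous request of that session is at most delta."""
    sessions = []
    for t in timestamps:
        if sessions and t - sessions[-1][-1] <= delta:
            sessions[-1].append(t)
        else:
            sessions.append([t])  # start a new constructed session
    return sessions


# Timestamps in minutes for one user (illustrative data).
requests = [0, 8, 16, 24, 32, 50]
by_total_duration = sessionize_h1(requests, theta=30)
by_page_stay = sessionize_h2(requests, delta=10)
```

With these illustrative timestamps, h1 cuts the visit once 30 minutes have passed since its first request, while h2 only cuts at the 18-minute gap before the last request.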
    • Navigation-Oriented Heuristics
        • Take the linkage between pages into account
            – “linkage” can be based on site topology (e.g., split a session at a request that could not have been reached from previous requests in the session)
            – or can be usage-based (using referrers in log entries)
                • usually more restrictive than topology-based heuristics and more difficult to implement in frame-based sites
        href: Given two consecutive requests p and q, with p belonging to constructed session S, q is assigned to S if the referrer for q was previously invoked in S, or if the referrer for q is “undefined” and tq - tp ≤ ∆ (the time delay ∆ is to allow for proper loading of frameset pages).
    • Measures for Sessionization Accuracy (Berendt, Mobasher, Spiliopoulou, 2001 [7])
        • A heuristic h maps entries in the log L into elements of constructed sessions, such that:
            – (a) each entry in L is mapped to exactly one element of a constructed session
            – (b) the mapping is order-preserving
        • Measures quantify the successful mappings of real sessions to constructed sessions
            – a measure M evaluates a heuristic h based on the differences between Ch and R
            – each measure assigns to h a value M(h) ∈ [0,1] so that M(h*) = 1
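The referrer-based heuristic href defined above can be sketched as follows. The sketch assumes the requests of one user are given in time order as (timestamp, url, referrer) tuples, with `None` standing for an "undefined" referrer; this representation and the function name are assumptions made for illustration.

```python
def sessionize_href(requests, max_delay):
    """Assign each request q to the current session S if its referrer was
    already requested in S, or if the referrer is undefined (None) and q
    follows the previous request within max_delay; otherwise open a new session."""
    sessions = []
    for t, url, referrer in requests:
        if sessions:
            current = sessions[-1]
            seen_urls = {u for _, u, _ in current}
            previous_time = current[-1][0]
            if referrer in seen_urls or (
                referrer is None and t - previous_time <= max_delay
            ):
                current.append((t, url, referrer))
                continue
        sessions.append([(t, url, referrer)])
    return sessions


# Illustrative log of one user (times in seconds).
log = [
    (0, "A", None),
    (5, "B", "A"),
    (9, "C", "B"),
    (12, "X", "http://elsewhere.example/"),  # referrer outside the session
    (14, "Y", None),                         # undefined referrer, short delay
    (100, "Z", None),                        # undefined referrer, long delay
]
constructed = sessionize_href(log, max_delay=10)
session_urls = [[url for _, url, _ in s] for s in constructed]
```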
    • Measures for Sessionization Accuracy
        • Categorical and Gradual Measures
            – categorical measures: based on the number of real sessions that are reconstructed by the heuristics
            – gradual measures: based on the degree to which the real sessions are reconstructed by the heuristics
    • Categorical Measures
        • Based on the notion of “complete reconstruction”
            – a real session is completely reconstructed if all its elements are contained in the same constructed session
            – the measure Mcr(h) is the ratio of the number of completely reconstructed real sessions in Ch to the total number of real sessions |R|
    • Categorical Measures (continued)
        • Derived categorical measures:
            – Mcrs considers only completely reconstructed real sessions whose first element is also the first element of a constructed session
            – Mcre considers only completely reconstructed real sessions whose last element is also the last element of a constructed session
            – Mcrse considers only completely reconstructed real sessions with correct starts and ends
                • in the absence of overlapping real sessions for individual users, this gives the number of constructed sessions that are identical to the corresponding real sessions
    • Gradual Measures
        • Allow for measuring partial overlaps between real and constructed sessions
            – the degree of overlap between a real session r and a constructed session c, dego(r,c), is the number of elements they have in common divided by the total number of elements in r
            – the degree of overlap for a real session r is the maximum dego(r,c) over all constructed sessions c
            – the measure Mo(h) is the average degree of overlap over all real sessions
            – if a real session is completely reconstructed, its overlap degree is 1
    • Gradual Measures (continued)
        • To take the size of the constructed session into account, we define the degree of similarity
            – degs(r,c) = | r ∩ c | / | r ∪ c |
            – Ms(h) is the average degree of similarity over all real sessions
            – if a real session is completely reconstructed, its similarity degree is 1
    • Which Measures?
        • The choice of the measures depends on the goals of usage analysis, for example:
            – “complete reconstruction” may be appropriate for clustering and association-based analyses (it correctly shows the set of pages accessed together)
                • it also preserves the sequential order of accesses, so it can be used for the analysis of users’ navigational behavior
            – Mcrs: useful for analyzing access to entry points
            – Mcre: useful for analyzing access to exit points
            – overlap-based measures can be useful for comparing the overall effectiveness of sessionization heuristics in grouping pages or objects
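The measures Mcr, Mo and Ms defined above can be computed as in the sketch below. Sessions are assumed to be lists of unique log-entry identifiers. For Ms, the slides do not spell out how the degree of similarity of a real session is aggregated over constructed sessions; taking the maximum, by analogy with the degree of overlap, is an assumption of this sketch.

```python
def m_cr(real, constructed):
    """Share of real sessions fully contained in a single constructed session."""
    constructed_sets = [set(c) for c in constructed]
    complete = sum(1 for r in real if any(set(r) <= c for c in constructed_sets))
    return complete / len(real)


def m_o(real, constructed):
    """Average over real sessions r of max_c |r & c| / |r| (degree of overlap)."""
    constructed_sets = [set(c) for c in constructed]
    degrees = [
        max(len(set(r) & c) / len(set(r)) for c in constructed_sets) for r in real
    ]
    return sum(degrees) / len(degrees)


def m_s(real, constructed):
    """Average over real sessions r of max_c |r & c| / |r | c| (assumed maximum)."""
    constructed_sets = [set(c) for c in constructed]
    degrees = [
        max(len(set(r) & c) / len(set(r) | c) for c in constructed_sets) for r in real
    ]
    return sum(degrees) / len(degrees)


# Illustrative data: entry ids 1..5, two real and two constructed sessions.
real_sessions = [[1, 2, 3], [4, 5]]
constructed_sessions = [[1, 2], [3, 4, 5]]
```

In this example the second real session is completely reconstructed (it is contained in the second constructed session) but its similarity degree is below 1, because the constructed session also contains entry 3.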
    • Which Sessionization Heuristics?
        • The choice of sessionization heuristic depends on the characteristics of the data
            – if individual users visit the site in short but temporally dense sessions, h2 may perform better than h1
            – in cases when timestamps are not reliable (e.g., using integrated data across many log files), href may be a better choice for sessionization
            – referrer-based heuristics tend to perform worse in highly dynamic, frame-based sites
    • Comparison of Sessionization Heuristics
        [Figure: bar chart comparing h1-30, h2-10 and h-ref on the measures M_o, M_crse, M_cr, M_crs, M_cre and M_s; vertical axis from 0.50 to 1.00]
        • cookies used to identify unique users
        • server generated session variable used to identify “real” sessions
        • site was frame-based and highly dynamic
        • thresholds of 30 and 10 minutes were used for h1 and h2, respectively
        • href performed poorly, due to propagated errors in misclassified frameset references
        • 30% of users had multiple IP addresses (coming from behind proxy servers)
    • Mechanisms for User Identification
        • IP Address + Agent
            – Description: assume each unique IP address/Agent pair is a unique user
            – Privacy concerns: low
            – Advantages: always available; no additional technology required
            – Disadvantages: not guaranteed to be unique; defeated by rotating IPs
        • Embedded Session Ids
            – Description: use dynamically generated pages to associate an ID with every hyperlink
            – Privacy concerns: low to medium
            – Advantages: always available; independent of IP addresses
            – Disadvantages: cannot capture repeat visitors; additional overhead for dynamic pages
        • Registration
            – Description: user explicitly logs in to the site
            – Privacy concerns: medium
            – Advantages: can track individuals, not just browsers
            – Disadvantages: many users won't register; not available before registration
        • Cookie
            – Description: save an ID on the client machine
            – Privacy concerns: medium to high
            – Advantages: can track repeat visits from the same browser
            – Disadvantages: can be turned off by users
        • Software Agents
            – Description: program loaded into the browser that sends back usage data
            – Privacy concerns: high
            – Advantages: accurate usage data for a single site
            – Disadvantages: likely to be rejected by users
    • Impact of User Identification Heuristics
        These experiments show the impact of using the IP+Agent heuristic for user identification on sessionization heuristics (as compared to cookies)
        [Figure: two bar charts over the sessionization accuracy measures, one comparing h1-30-real with h1-30-ipa and one comparing h-ref-real with h-ref-ipa; vertical axes from 0.30 to 1.00]
    • Inferring User Transactions from Sessions
        • Observation: reference lengths follow an exponential distribution
        • Page types correlate with page reference lengths
        • Page types: navigational, content, or hybrid
        • Can automatically classify pages as navigational or content using statistical modeling
        • A transaction can be defined as an intra-session path ending in a content page, or as a set of content pages in a session
        [Figure: histogram of reference lengths (secs), with regions for navigational pages and content pages]
    • Path Completion
        • Refers to the problem of inferring missing user references due to caching.
        • Effective path completion requires extensive knowledge of the link structure within the site.
        • Referrer information in server logs can also be used in disambiguating the inferred paths.
        • The problem gets much more complicated in frame-based sites.
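A minimal sketch of referrer-based path completion follows. It assumes that missing references are caused only by the browser's back button (cached pages) and that each log entry is a (url, referrer) pair; the function name and this simplification are assumptions, and the sketch does not use the site's link structure. The sample log is the one of the navigation A => B => D => E => D => B => C used as the example in this tutorial.

```python
def complete_path(log):
    """Insert the cached backward references that the server never saw.

    `trail` models the browser history; when a request names a referrer that
    is not the page on top of the trail, the user is assumed to have gone
    back page by page until that referrer was reached.
    """
    trail, path = [], []
    for url, referrer in log:
        if referrer is not None and referrer in trail:
            while trail[-1] != referrer:
                trail.pop()              # leave the current page via "back"
                path.append(trail[-1])   # the cached page shown after going back
        trail.append(url)
        path.append(url)
    return path


# Requests as they appear in the server log: the backtracking to D and B was
# served from the browser cache and is therefore missing.
logged = [("A", None), ("B", "A"), ("D", "B"), ("E", "D"), ("C", "B")]
completed = complete_path(logged)
```

Stopping at the most recent occurrence of the referrer in the trail corresponds to preferring the completion with the fewest inferred references.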
    • Path Completion - An Example
        A user’s navigation path: A => B => D => E => D => B => C
        [Figure: site graph with pages A to F]
        Logged requests (URL, Referrer):
            A, --
            B, A
            D, B
            E, D
            C, B
        • There may be multiple candidates for completing the path. For example, consider the two paths E => D => B => C and E => D => B => A => C.
        • In this case, the referrer field allows us to partially disambiguate. But what about E => D => B => A => B => C?
        • One heuristic: always take the path that requires the fewest
    • Integrating E-Commerce Events
        • Either product oriented or visit oriented
        • Not necessarily a one-to-one correspondence with user actions
        • Used to track and analyze conversion of browsers to buyers
        • Major difficulty for e-commerce events is defining and implementing the events for a site
            – however, in contrast to clickstream data, getting reliable preprocessed data is not a problem
        • Another major challenge is the successful integration with clickstream data
    • Product-Oriented Events
        • Product View
            – Occurs every time a product is displayed on a pageview
            – Typical types: Image, Link, Text
        • Product Click-through
            – Occurs every time a user “clicks” on a product to get more information
                • Category click-through
                • Product detail or extra detail (e.g. large image) click-through
                • Advertisement click-through
    • Product-Oriented Events (continued)
        • Shopping Cart Changes
            – Shopping Cart Add or Remove
            – Shopping Cart Change - quantity or other feature (e.g. size) is changed
        • Product Buy or Bid
            – A separate buy event occurs for each product in the shopping cart
            – Auction sites can track bid events in addition to the product purchases
    • Content and Structure Preprocessing
        • Processing the content and structure of the site is often essential for successful usage analysis
        • Two primary tasks:
            – determine what constitutes a unique page file (i.e., pageview)
            – represent content and structure of the pages in a quantifiable form
    • Content and Structure Preprocessing (continued)
        • Basic elements in content and structure processing
            – creation of a site map
                • captures linkage and frame structure of the site
                • also needs to identify script templates for dynamically generated pages
            – extracting important content elements in pages
                • meta-information, keywords, internal and external links, etc.
            – identifying and classifying pages based on their content and structural characteristics
    • Quantifying Content and Structure
        • Static Pages
            – All of the information is contained within the HTML files for a site
            – Each file can be parsed to get a list of links, frames, images, and text
            – Files can be obtained through the file system, or through HTTP requests from an automated agent (site spider)
    • Quantifying Content and Structure (continued)
        • Dynamic Pages
            – Pages do not exist until they are created due to a specific request
            – Relevant information can come from a variety of sources: templates, databases, scripts, HTML, etc.
            – Three methods of obtaining content and structure information:
                • Series of HTTP requests from a site mapping tool
                • Compile information from internal sources
                • Content server tools
    • Integrating content and structure I
        Domain knowledge: content
            - purpose: group pages by their content
            - method: analyze text, meta-tags, and/or URL (query string)
            - grouping by classification or clustering
        Concept hierarchies: example of a content-based concept hierarchy
            Entertainment
                Performing Arts
                Music
                    Artists
                    Genres
                        Blues
                        Jazz
                        New Age
                        ...
                    New Releases
                    ...
                ...
    • Integrating content and structure II
        Content profiles from feature clusters
        1. vector space model: each unique word in the corpus = one dimension; each page(view) is a vector with a non-zero weight for each word in that page(view), zero weight for other words
        2. feature - pageview matrix (note: "feature" = word, "pageview" because of frames)
                   music   jazz   artist   ...
            pv1    1.00    0.80   0.05
            pv2    1.00    0.00   0.70
            ...
        3. features as weighted vectors of pageviews
            jazz = [ <pv1,0.80>, <pv2,0.00>, ... ]
        4. group features -> feature clusters -> content profiles
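Steps 2 and 3 above amount to transposing the feature-pageview matrix. The sketch below uses the weights from the slide; the function names and the use of cosine similarity as the basis for grouping features (step 4) are assumptions for illustration, since the slides do not name a similarity function.

```python
import math

# Feature weights per pageview, as in the matrix on the slide.
PAGEVIEWS = {
    "pv1": {"music": 1.00, "jazz": 0.80, "artist": 0.05},
    "pv2": {"music": 1.00, "jazz": 0.00, "artist": 0.70},
}


def features_as_pageview_vectors(pageviews):
    """Turn pageview -> {feature: weight} into feature -> [(pageview, weight)]."""
    features = {}
    for pageview, weights in pageviews.items():
        for feature, weight in weights.items():
            features.setdefault(feature, []).append((pageview, weight))
    return features


def cosine(vector_a, vector_b):
    """Cosine similarity of two feature vectors given as (pageview, weight) lists."""
    a, b = dict(vector_a), dict(vector_b)
    dot = sum(weight * b.get(pageview, 0.0) for pageview, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


features = features_as_pageview_vectors(PAGEVIEWS)
```

A clustering algorithm applied to these feature vectors would then yield the feature clusters from which content profiles are derived.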
    • Integrating content and structure III
        Structure
            - purpose: group pages by their hyperlink structure
            - ex. page types in Pirolli et al. [54] [B24] and Cooley et al. [15] [B20]: head, navigation, content, look-up, personal
            - ex. path distance to a reference page: A.html, B.html (dA = 1), C.html (dA = 2)
            - structure as weighted vector of page(view)s
                S = [ <A.html,0>, <B.html,1>, <C.html,0>, ... ] (only B content page)
                S = [ <A.html,0>, <B.html,1>, <C.html,3>, ... ] (path distances)
            - grouping by classification or clustering
    • Relating content and structure to mined usage I: Content/structure mining as pre-/post-processing steps
        Ex. online catalog search (Berendt & Spiliopoulou [8, 6] [B18, B17]):
        1. service-based concept hierarchy: which query options?
            Info on schools
                indiv. school
                list of schools
                    1 parameter: Location, Name, ...
                    2 par.s: Location+Name, ...
                    3 parameters: ...
                ...
    • Relating content and structure to mined usage I (continued)
        2. discovering and comparing navigation patterns in classified pages
        part of a resulting WUM navigation pattern: [figure]
    • Relating content and structure to mined usage I (continued)
        Ex. WebSIFT Information Filter (from Cooley [14] [B19]):
        Mined knowledge / Domain knowledge source / Interesting belief example
        • general usage statistics / site structure / The head page is not the most common entry point
        • general usage statistics / site content / A page designed to provide content is being used as a navigation page
        • frequent itemsets / site structure / A set of pages is frequently accessed together, but not directly linked
        • usage clusters / site content / A usage cluster contains pages from multiple content categories
        => discover patterns at different levels of abstraction, discover deviations from intended usage
    • Relating content and structure to mined usage II: Usage, content, and structure mining as 3 ways of deriving a common kind of representation
        Mobasher, Dai, Luo, Sun, & Zhu [44] [B22]
        - a vector of tuples <pageview,weight>:
            usage: sessions / visits, or parts of them (past + current)
            content: features
            structure: pages and their characteristics
        - unordered or ordered collections
        => identify clusters that are similar, where similarity is by usage, content, or structure
    • Pattern Discovery for Personalization
        Myra Spiliopoulou, HHL
    • We identify the following aspects of the personalization services, which can be envisaged as the result of pattern discovery:
        • Visibility
            – personal recommendation
            – silent dynamic adjustment
            – static page/site adjustment
        • Service element
            – (link to) page
            – application object
        • Matching based on
            – user profiles
            – user ratings
            – user behaviour & content of objects
        • Acquisition [rest of heading illegible in the transcript]
            – all steps on-line
            – off-line pattern discovery & on-line matching
    • Pattern Discovery: Adaptive web sites
        The approach of Perkowitz & Etzioni [52, 53]
        The IndexFinder consists of three phases:
        1. Log processing: establishment of sessions as sets of page requests
        2. Cluster mining: grouping of co-occurring non-linked pages with help of the site graph
        3. Conceptual clustering:
            – The representative concept of each cluster is identified.
            – Cluster members not adhering to this concept are removed from the cluster.
            – Pages adhering to this concept and not appearing in the cluster are attached to the cluster.
    • For each cluster, the IndexFinder presents to the Web designer:
        • an index page with links to all pages of the cluster
        The Web designer decides:
            – whether the new page should be established
            – what its label should be
            – where it should be located in the site
        According to our categorization
        • Visibility: static page/site adjustment
        • Service element: page containing a single application object
        • Matching based on: user behaviour and page content
        • Off-line pattern discovery
    • Pattern Discovery for Recommendations: The collaborative filtering approach
        Main idea: The objects suggested to a user are those preferred by users similar to her.
        1. The user's transaction is matched against logged transactions.
        2. The matches are ranked.
        3. The best (set of) match(es) are selected.
        4. The objects that were shown in the selected transactions are ranked, excluding objects already seen.
        5. The objects with the highest rank are shown to the user.
        All steps on-line
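The collaborative filtering steps (match the active user's transaction against logged transactions, rank the matches, select the best ones, rank the objects they contain while excluding objects already seen, show the top objects) can be sketched as below. The slides do not specify a similarity function or a ranking score; Jaccard similarity and similarity-weighted votes are assumptions of this sketch, as are all names.

```python
from collections import Counter


def recommend_cf(active, logged, k=2, n=3):
    """Recommend up to n objects from the k logged transactions most similar
    to the active transaction (Jaccard similarity, an assumption)."""
    active = set(active)

    def jaccard(transaction):
        other = set(transaction)
        union = active | other
        return len(active & other) / len(union) if union else 0.0

    # Steps 1-3: match, rank, and keep the k best matches with non-zero similarity.
    ranked = sorted(logged, key=jaccard, reverse=True)
    best = [t for t in ranked[:k] if jaccard(t) > 0.0]

    # Step 4: rank the objects of the selected transactions, skipping seen ones.
    votes = Counter()
    for transaction in best:
        for obj in set(transaction) - active:
            votes[obj] += jaccard(transaction)

    # Step 5: the highest-ranked objects are shown to the user.
    return [obj for obj, _ in votes.most_common(n)]


logged_transactions = [["a", "b", "c"], ["a", "d"], ["e", "f"], ["a", "b", "c", "d"]]
suggestions = recommend_cf(["a", "b"], logged_transactions)
```

Because every step runs against the full log at request time, the cost grows with the number of logged transactions, which is the motivation for moving pattern discovery off-line.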
    • Pattern Discovery for Recommendations: The Data Mining approach
        Main idea: User similarity can be defined in terms of behaviour, interests, preferences that can be modelled off-line
        1. Pattern discovery over the log data.
        2. The contents of the user's transaction are matched against the discovered patterns.
        3. The matches are ranked.
        4. The objects associated with the best matches are ranked, excluding objects already seen.
        5. The objects with the highest rank are shown to the user.
        so that:
        (a) [remark on the voluminous log, illegible in the transcript]
        (b) [remark on on-line processing, illegible in the transcript]
    • Pattern Discovery: Recommendations on correlated items
        The approach of Vucetic and Obradovic [citation number illegible in the transcript]
        The recommendation problem is defined as: Given the ratings of the active user on a set of items, what will be her ratings on the remaining items?
        Main idea: The ratings of an item can be predicted from the ratings on correlated items.
        • Visibility: personal recommendation
        • Service element: application object
        • Matching based on: ratings of correlated items
        • Off-line discovery of predictors for the impact of item correlation on ratings
    • Methodology
        • The rating of an item given another item is approximated using a linear function (named expert).
        • The average correlation among pairs of items is approximated using random sampling over the user ratings.
        • A weighting scheme is proposed to deal with the fact that users with similar preferences may provide different ratings for the same set of items. In this scheme:
            – The linear experts for all pairs of items can be computed off-line.
            – The ratings for an active user are predicted from the set of pairs of items rather than the set of user ratings.
    • Pattern Discovery: Repeat-buying theory for personalization
        The approach of Geyer-Schulz et al [citation number illegible in the transcript]
        Main idea:
        (a) Recommendations are based on correlated products.
        (b) Correlations can be identified with Ehrenberg's repeat-buying theory,
        (c) after adjusting it to the particularities of anonymous user sessions.
        According to our categorization
        • Visibility: recommendation of information products
        • Service element: application object or URL
        • Matching based on: user preferences for application objects
        • Off-line discovery of correlated application objects
    • Ehrenberg's repeat-buying theory
        – predicts buyer behaviour from (a) penetration and (b) average purchase frequency of an item
        – by providing a reference model that characterizes repeated co-occurring purchases of items as random or not random
        where
        • penetration refers to the preference of a customer for a brand
        • average purchase frequency refers to repeat purchases of the item, ignoring characteristics of the item, amount of the item and size of the purchase as a whole.
    • Assumptions of Geyer-Schulz et al
            – The probability of r co-occurrences of two products in subsequent purchases follows a logarithmic series distribution.
            – Subsequent purchases of the same customer(s) can be observed as equivalent to a set of purchase sessions during the log period.
        Methodology
        • Computation of the frequency distributions of all co-occurrences of product pairs, counting one co-occurrence per session only
        • Elimination of distributions with a small number of observations
        • Elimination of a percentile of the repeat-buy pairs [the percentile is illegible in the transcript]
        • Computation of the co-occurrence predictor for each pair, so that outliers for a predictor can be observed as correlated items.
    • Pattern Discovery: Association mining for personalization
        Basic idea: match the left-hand side of rules with the active user session and recommend items in the rule’s consequent
        Essential to store patterns in efficient data structures
        • searching all rules in real time is computationally inefficient
        Ordering of accessed pages is not taken into account
        Good recommendation accuracy, but the main problem is “coverage”
        • high support thresholds lead to low coverage and may eliminate important, but infrequent items from consideration
        • low support thresholds result in very large model sizes and a computationally expensive pattern discovery phase
    • Association Mining - Basic Concepts
        We start with a set I of items and a set D of transactions. A transaction T is a set of items (a subset of I):
            I = { i1, i2, ..., im },  T ⊆ I
        An association rule is an implication on itemsets X and Y, denoted by X ==> Y, where
            X ⊆ I,  Y ⊆ I,  X ∩ Y = ∅
        The rule meets a minimum confidence of c, meaning that at least a fraction c of the transactions in D which contain X also contain Y. In addition, a minimum support of s must be satisfied:
            s ≤ |X ∪ Y| / |D|,  c ≤ |X ∪ Y| / |X|
        where |X| denotes the number of transactions in D that contain the itemset X, and |D| the number of transactions in D.
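Support and confidence as defined above can be computed directly from a list of transactions. This is only a small sketch with hypothetical function names and made-up data; it counts transactions by scanning D and is not meant as an efficient mining algorithm.

```python
def support(itemset, transactions):
    """Fraction of transactions in D that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def rule_holds(x, y, transactions, min_support, min_confidence):
    """Check X ==> Y against the minimum support s and minimum confidence c."""
    return (
        support(set(x) | set(y), transactions) >= min_support
        and confidence(x, y, transactions) >= min_confidence
    )


# Illustrative transactions over the items a, b, c.
D = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
```

For the rule {a} ==> {b} in this data, two of the four transactions contain both items and three contain a, so the support is 0.5 and the confidence is 2/3.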
    • Pattern Discovery: Associate/associated items and users
      The approach of Lin, Alvarez & Ruiz [3?]
      Main idea
      => Users are associated to each other in terms of how they rate items.
      => Items are associated to each other with respect to user preferences.
      Associations among items can be found off-line.
      Associations to the active user can be found on-line.
      According to our categorization
      – Visibility: personal recommendation
      – Service element: application object
      – Matching based on: associations among items and among users
      – On-line discovery of association rules with given RHS
    • Methodology
      • Recommendations are subject to minimum confidence and minimum number of rules constraints.
      • The miner discovers association rules iteratively, until the desired number of rules is extracted. The support cutoff is adjusted in each iteration.
      • Rules concern both items and users:
          User1: like ∧ User2: dislike => TargetUser: like
          Item1: like ∧ Item2: like => TargetItem: like
      • Candidate items are computed from associations involving users similar to the active user (on-line).
      • Scores of items are computed from associations reflecting user preferences (off-line).
      • The candidate items with the highest scores are suggested to the active user (on-line).
    • Pattern Discovery: Association mining for personalization
      The approach of Mobasher, et al., 2001 [45]
      Main idea: avoid offline generation of all association rules; generate recommendations directly from itemsets
      • discovered frequent itemsets are stored in an “itemset graph” (an extension of the lexicographic tree structure of Agrawal, et al., 1999 [2])
      • recommendation generation can be done in constant time by doing a directed search to a limited depth
      According to our categorization
      – Visibility: personal recommendations or silent dynamic adjustment
      – Service element: pageview
      – Matching based on: user behaviour
    • Methodology:
      • Construct the frequent itemset graph
        – each node at depth d in the graph corresponds to an itemset I of size d and is linked to the itemsets of size d+1 that contain I, at level d+1
        – the single root node at level 0 corresponds to the empty itemset
      • frequent itemsets are matched against a user’s active session S by performing a search of the graph to depth |S|
      • a recommendation r is an item at level |S|+1 whose recommendation score is the confidence of the rule S ==> r
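The idea of recommending directly from frequent itemsets can be sketched as follows. This is a simplified illustration, not the itemset graph of [45]: a plain dictionary keyed by itemset stands in for the graph, and the support counts are made-up values.

```python
def recommend(session, supports, min_conf=0.0):
    """Recommend items r such that S ∪ {r} is a frequent itemset one level
    below the active session S; the score is the confidence of S ==> r."""
    s = frozenset(session)
    if s not in supports:
        return {}  # the session itself is not a frequent itemset
    scores = {}
    for itemset, count in supports.items():
        # children of S: frequent itemsets of size |S|+1 that contain S
        if len(itemset) == len(s) + 1 and s < itemset:
            (item,) = itemset - s
            conf = count / supports[s]
            if conf >= min_conf:
                scores[item] = conf
    return scores


# Hypothetical frequent itemsets with their support counts.
supports = {
    frozenset("A"): 4, frozenset("B"): 4, frozenset("C"): 4,
    frozenset("AB"): 3, frozenset("AC"): 3, frozenset("BC"): 3,
    frozenset("ABC"): 2,
}

print(recommend(["A", "B"], supports))  # candidate items with their scores
```

A real itemset graph would follow child links from the node for S instead of scanning all itemsets, which is what keeps the lookup depth-bounded.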
    • Pattern Discovery: Sequence mining for personalization
      Main idea: take the ordering of accessed items into account
      Two basic approaches
      • use contiguous sequences (e.g., Web navigational patterns)
      • use general sequential patterns
      Contiguous sequential patterns are often modeled as Markov chains and used for prefetching (i.e., predicting the next user access based on previously accessed pages)
      In the context of recommendations, they can achieve higher accuracy than other methods, but it may be difficult to obtain reasonable coverage
    • Pattern Discovery: Sequence mining for personalization
      The Markov chain representation often leads to high space complexity due to model sizes
      Some solutions
      • selective Markov models (Deshpande, Karypis, 2000 [17]) use various pruning strategies to reduce the number of states (e.g., support or confidence pruning, error pruning)
      • longest repeating subsequences (Pitkow, Pirolli, 1999 []) are similar to support pruning and are used to focus only on significant navigational paths
      • increased coverage can be achieved by using all-Kth-order models (i.e., using all possible sizes for user histories)
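An all-Kth-order Markov predictor can be sketched in a few lines. This is an illustrative sketch under simple assumptions (raw frequency counts, no pruning); the function names and the toy sessions are hypothetical and not taken from the cited papers.

```python
from collections import Counter, defaultdict


def train(sessions, max_order):
    """Count next-page frequencies for every context of length 1..max_order."""
    model = defaultdict(Counter)
    for pages in sessions:
        for k in range(1, max_order + 1):
            for i in range(len(pages) - k):
                context = tuple(pages[i:i + k])
                model[context][pages[i + k]] += 1
    return model


def predict(model, history, max_order):
    """Try the longest usable context first, then fall back to shorter ones."""
    for k in range(min(max_order, len(history)), 0, -1):
        context = tuple(history[-k:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # no context of any order has been seen


# Hypothetical navigation sessions, for illustration only.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E", "B", "D"]]
model = train(sessions, max_order=2)

print(predict(model, ["A", "B"], 2))  # answered by the 2nd-order context (A, B)
print(predict(model, ["X", "A"], 2))  # (X, A) is unseen, falls back to (A,)
```

The fallback to shorter contexts is what raises coverage; the price is that the model stores counts for every order, which is the space problem the pruning strategies above address.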
    • Pattern Discovery: Sequence mining for personalization
      The approach of Gaul & Schmidt-Thieme [2?]
      Main idea
      => Recommendations are based on frequent patterns of past behaviour.
      => A recommender is a predictor for a class of events.
      => The constellation of the recommenders for all classes returns the best recommendations for a given user history.
      According to our categorization
      – Visibility: recommendation
      – Service element: URLs, site objects
      – Matching based on: navigation histories and URL proximity
      – Off-line training of classifiers as local recommender systems
    • Generic framework
      • with measures for the quality of a recommendation, taking the distance between candidate URLs into account
      • distinguishing between dynamic and static recommenders that do / do not take user histories into account
      • combining local recommender systems, each of which predicts a class of events, where a class can be defined on a user history, a group of histories or the whole dataset.
      Thereby, a navigation history is
      – a set of events
      – a sequence of events
      – a more complex structure of co-occurring events
    • Pattern Discovery: Usage profiles for personalization
      The approach of Mobasher et al. [43, 42]
      Two types of usage profiles
      – clusters of similar user transactions, enhanced by a weighting scheme that removes pages with support less than a minimum value, aggregating the members of a cluster into one representative profile
      – clusters of pages accessed together
      According to our categorization
      – Visibility: personal recommendation or silent dynamic adjustment
      – Service element: pageview
      – Matching based on: user behaviour, also page content
      – Off-line discovery of aggregate profiles
    • Aims
      => achieve similar performance to on-line collaborative filtering
      => using a minimal number of pageviews for the active user
      Methodology
      • Preprocessing phase
        – Assignment of weights to the pageviews
        – Significance testing, based on page stay time
        – Normalization of pageview weights
      • PACT: Profile Aggregation based on Clustering Techniques
        1. Clustering of usage data to establish the aggregate profiles
        2. Materialization of the profiles as vectors of (page, weight) pairs
        3. Scan of the user's history by means of a sliding window that allows only a set of page accesses to be considered in the profile
        4. Matching the user session with a profile
        5. Match ranking
    • A Framework for Personalization Based on Aggregate Profiles: Offline Phase
    • A Framework for Personalization Based on Aggregate Profiles: Online Phase
      Input from the batch process: usage profiles, content profiles
      • Match current user’s activity against the discovered profiles
      • Each recommended item is assigned a score based on
        – matching criteria and quality of aggregate profiles
        – “information value” of the item based on domain knowledge
    • Aggregate Profiles Based on Clustering Transactions (PACT) (Mobasher, et al., [42, 43])
      • Input
        – set of relevant pageviews in the preprocessed log: P = {p1, p2, ..., pn}
        – set of user transactions: T = {t1, t2, ..., tm}
        – each transaction is a pageview vector: t = <w(p1, t), w(p2, t), ..., w(pn, t)>
    • Aggregate Profiles Based on Clustering Transactions (PACT)
      • Transaction clusters
        – each cluster contains a set of transaction vectors
        – for each cluster c, compute the centroid as cluster representative: c = <u1^c, u2^c, ..., un^c>
      • Aggregate usage profiles
        – a set of pageview-weight pairs: for a transaction cluster c ∈ C, select each pageview pi such that ui^c (in the cluster centroid) is greater than a pre-specified threshold
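Deriving an aggregate profile from one transaction cluster can be sketched as below. This is only an illustration of the centroid-plus-threshold step; the clustering itself is assumed to have been done already, and the pageview names, weights and threshold are made up.

```python
def aggregate_profile(cluster, pageviews, threshold):
    """Average the transaction vectors of one cluster and keep the pageviews
    whose centroid weight exceeds the threshold."""
    n = len(cluster)
    centroid = [sum(t[i] for t in cluster) / n for i in range(len(pageviews))]
    return {p: u for p, u in zip(pageviews, centroid) if u > threshold}


# Hypothetical pageviews and one cluster of binary transaction vectors.
pageviews = ["A.html", "B.html", "C.html", "D.html"]
cluster = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]

profile = aggregate_profile(cluster, pageviews, threshold=0.5)
print(profile)  # pageview-weight pairs of the aggregate usage profile
```

With non-binary weights (for example normalized viewing time) the same code applies unchanged, since only the mean per pageview is used.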
    • Example Aggregate Profiles
      • Example profiles based on the PACT method
        – Based on data from the Association for Consumer Research site:

        Weight  Pageview ID
        1.00    Call for Papers
        0.67    ACR News Special Topics
        0.67    CFP: Journal of Psychology and Marketing I
        0.67    CFP: Journal of Psychology and Marketing II
        0.67    CFP: Journal of Consumer Psychology II
        0.67    CFP: Journal of Consumer Psychology I

        Weight  Pageview ID
        1.00    CFP: Winter 2000 SCP Conference
        1.00    Call for Papers
        0.36    CFP: ACR 1999 Asia-Pacific Conference
        0.30    ACR 1999 Annual Conference
        0.25    ACR News Updates
        0.24    Conference Update
    • Hypergraph-Based Clustering (Han, Karypis, Kumar, Mobasher, 1998 [26])
      • Construct a hypergraph from sets of related items
        – Each hyperedge represents a frequent itemset
        – Weight of each hyperedge can be based on the characteristics of frequent itemsets or association rules (e.g., support, confidence, interest, etc.)
    • Hypergraph-Based Clustering
      • Recursively partition the hypergraph so that each partition contains only highly connected items
        – Given a hypergraph, we find a k-way partitioning such that the weight of the hyperedges that are cut is minimized
        – The fitness of partitions is measured in terms of the ratio of the weights of cut edges to the weights of uncut edges within the partitions
        – The connectivity measures the percentage of edges within the partition with which the vertex is associated; it is used for filtering partitions
        – Vertices from partial edges can be added back to clusters based on a user-specified overlap factor
    • Profiles Based on Hypergraph Clusters (Mobasher, Cooley, Srivastava, 1999 [41])
      • Input
        – input for clustering is the set of large itemsets from the association rule module
        – each itemset is a hyperedge (weights are a function of the interest of the itemset):
            Interest(I) = support(I) / ∏_{i ∈ I} support(i)
        – In practice, the log of interest can be used to prevent a few highly frequent patterns from totally dominating
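The interest measure used as hyperedge weight can be computed directly from the definition. The sketch below assumes that supports are given as fractions in a dictionary keyed by itemset; the numbers are invented for illustration.

```python
import math


def interest(itemset, support):
    """Support of the itemset divided by the product of the supports of its
    single items; a value above 1 means the items co-occur more often than
    they would if they were independent."""
    itemset = frozenset(itemset)
    expected = math.prod(support[frozenset([i])] for i in itemset)
    return support[itemset] / expected


# Hypothetical support values (fractions of all transactions).
support = {
    frozenset(["A"]): 0.5,
    frozenset(["B"]): 0.4,
    frozenset(["A", "B"]): 0.3,
}

value = interest({"A", "B"}, support)
print(value)            # raw interest
print(math.log(value))  # log of interest, as suggested for dampening
```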
    • Profiles Based on Hypergraph Clusters
      • Aggregate profiles (item/pageview clusters)
        – the clustering program directly outputs a set of overlapping pageview clusters
        – the weight associated with pageview p in a cluster C is based on the connectivity value of p in the hypergraph partition:
            conn(p, C) = |{e | e ⊆ C, p ∈ e}| / |{e | e ⊆ C}|
    • Recommendation Engine for Using Aggregate Profiles
      • Match user’s activity against discovered profiles
        – a sliding window over the active session captures the current user’s “short-term” history depth
        – profiles and the active session are treated as vectors
        – the matching score is computed based on the similarity between vectors (e.g., normalized cosine similarity)
      • Recommendation scores are based on
        – the matching score to aggregate profiles
        – the “information value” of the recommended item (e.g., link distance of the recommendation to the active session)
        – recommendations are contributed by multiple profiles
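The connectivity weight conn(p, C) of a pageview within a hypergraph cluster can be sketched as a simple count over hyperedges. The hyperedges and the cluster below are hypothetical and hyperedge weights are ignored, so this is an unweighted reading of the definition.

```python
def connectivity(p, cluster, hyperedges):
    """Share of the hyperedges lying entirely inside the cluster that contain p."""
    c = frozenset(cluster)
    inside = [e for e in hyperedges if e <= c]
    if not inside:
        return 0.0  # no hyperedge falls completely within the cluster
    return sum(1 for e in inside if p in e) / len(inside)


# Hypothetical frequent itemsets acting as hyperedges.
hyperedges = [frozenset(e) for e in ("AB", "AC", "BC", "ABC", "CD")]
cluster = {"A", "B", "C"}

print(connectivity("A", cluster, hyperedges))  # weight of pageview A in the cluster
```

The edge {C, D} is not counted because D lies outside the cluster, which is why only four hyperedges enter the denominator here.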
    • Active Session Window
      • Example: session window of size 5
          active user session: A.html → B.html → C.html → D.html → E.html → D.html → F.html
      • Associating weights with items in the active session:
        – assigned by site owner based on perceived importance
        – based on recency (recent pages weighted higher) or time spent on pages
        – based on page types (e.g., content v. navigational)
    • Example: Recommendations Based on PACT
      Current user session U: A.html => B.html => C.html => E.html
      Example profiles:
        PROFILE 0: 1.00 D.html, 0.50 A.html, 0.50 C.html, 0.50 E.html
        PROFILE 1: 1.00 A.html, 0.50 B.html, 0.50 C.html, 0.50 D.html, 0.50 E.html, 0.50 F.html
        PROFILE 2: 0.75 B.html, 0.75 F.html, 0.50 A.html, 0.50 C.html, 0.25 D.html
      Assume a session window size of 3 and unit weights, using (cosine) similarity between the active session and each profile:
        Sim(U, P0) = (0.5+0.5) / SQRT(1.75*3) = 0.44
        Sim(U, P1) = (0.5+0.5+0.5) / SQRT(2.25*3) = 0.58
        Sim(U, P2) = (0.75+0.5) / SQRT(1.69*3) = 0.56
      Candidate recommendations:
        P0: D.html (SQRT(0.44*1.00) = 0.66), A.html (SQRT(0.44*0.50) = 0.47)
        P1: A.html (SQRT(0.58*1.00) = 0.76), D.html (SQRT(0.58*0.50) = 0.54), F.html (SQRT(0.58*0.50) = 0.54)
        P2: F.html (SQRT(0.56*0.75) = 0.65), A.html (SQRT(0.56*0.50) = 0.53), D.html (SQRT(0.56*0.25) = 0.37)
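The matching step of the example can be sketched in code for PROFILE 0. The sketch assumes unit weights for the pages in the session window, cosine similarity as matching score, and SQRT(similarity * profile weight) as recommendation score, as in the example; the function names are illustrative.

```python
import math


def cosine(session, profile):
    """Cosine similarity between two sparse pageview-weight vectors."""
    dot = sum(w * profile.get(p, 0.0) for p, w in session.items())
    norm = math.sqrt(sum(w * w for w in session.values())
                     * sum(w * w for w in profile.values()))
    return dot / norm if norm else 0.0


def recommend(session_pages, profile, window=3):
    """Score the profile's pages that are not already in the session window."""
    active = {p: 1.0 for p in session_pages[-window:]}  # unit weights
    sim = cosine(active, profile)
    return {p: math.sqrt(sim * w) for p, w in profile.items() if p not in active}


P0 = {"D.html": 1.00, "A.html": 0.50, "C.html": 0.50, "E.html": 0.50}
session = ["A.html", "B.html", "C.html", "E.html"]

window_vector = {p: 1.0 for p in session[-3:]}
sim_p0 = cosine(window_vector, P0)
scores = recommend(session, P0)
print(round(sim_p0, 2))
print({p: round(s, 2) for p, s in scores.items()})
```

A.html is recommended again although it was visited earlier, because with a window of 3 it has already dropped out of the user's short-term history.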
    • Integration of Content Profiles (Mobasher, et al., 2000 [44])
      • Cluster features over the n-dimensional space of pageviews
      • For each feature cluster, derive a content profile by collecting pageviews in which these features appear as significant (represented as overlapping collections of pageview-weight pairs)

        Weight | Pageview ID | Significant Features (stems)
        1.00 | CFP: One World One Market | world challeng busi co manag global
        0.63 | CFP: Intl Conf. on Marketing & Development | challeng co contact develop intern
        0.35 | CFP: Journal of Global Marketing | busi global
        0.32 | CFP: Journal of Consumer Psychology | busi manag global

        Weight | Pageview ID | Significant Features (stems)
        1.00 | CFP: Journal of Psych. & Marketing | psychologi consum special market
        1.00 | CFP: Journal of Consumer Psychology I | psychologi journal consum special market
        0.72 | CFP: Journal of Global Marketing | journal special market
        0.61 | CFP: Journal of Consumer Psychology II | psychologi journal consum special
        0.50 | CFP: Society for Consumer Psychology | psychologi consum special
        0.50 | CFP: Conf. on Gender, Market., Consumer Behavior | journal consum market
    • Integration of Content Profiles
      • Integration with the recommendation engine
        – Usage and content profiles have a similar representation, so they can be used by the recommendation engine in the same way
        – Item weights in profiles must be normalized, so that content and usage profiles can be compared on the same scale
        – One approach: match the active user session with all profiles (both content and usage); then use the maximal recommendation score for candidate recommendations
        – Another approach: use content profiles for generating recommendations only if no matching usage profile (with sufficient confidence) is found
    • Evaluating Personalization
    • Evaluating usability: goals / tasks?
      Recall the operational definition: A Web site’s usability is high if users
      - achieve their goals / perform their tasks in little time,
      - do so with a low error rate,
      - experience high subjective satisfaction.
      Depending on the site, relevant goals / tasks may be to:
      - stay in the site, return to the site, buy... => E-metrics
      - locate content (search),
      - learn,
      - ...
    • Evaluating usability: methodological caveats
      Questionnaire data: self-reports are often biased; observation of behavior in experiments is advisable
      Comparisons of sites with/without personalization, or before/after personalization was introduced, with respect to "normal user behavior" (server logs): usually a quasi-experiment
      - many uncontrolled variables (e.g., user intentions)
      - possibly several differences between sites/site versions
      => causal attribution of success to personalization becomes difficult
    • Evaluating usability: results I
      CyberBehavior Research Center 1999 survey
      - 81% of 694 respondents have visited a personalized site
      - 64% of those found it useful: helpful, time saving
      - perceived usefulness changes with product (books > music > information technology > news/articles > other)
      - main problems: privacy, ineffectiveness when behavior did not reflect the user "personally" (e.g., buying a gift)
      - concern that possible choices may be limited
      - little difference of opinion between personalization occurring in response to behavior or to solicited input
    • Evaluating usability: results II
      Belkin [3], reviewing studies of recommendations in IR systems carried out at Rutgers Univ. since 1995:
      - measures of performance and subjective satisfaction
      - relevance feedback worked well, but better with both increased knowledge of how it worked and increased control by the user of its suggestions:
        - relevance feedback + term suggestion performed better than, and was preferred to, pure relevance feedback
      - users preferred to save effort: they were willing to hand over the subsidiary task of term selection to a system they trusted
    • Evaluating usability: results III
      Nielsen Net Ratings 1999: registered visitors of portal sites, i.e., those who can customize,
      - spend > 3 times longer at their home portal than others
      - view 3-4 times more pages
    • Why are results scarce? Possible reasons
      "In essence, web design is a problem in user interface design. However, ... few web designers can afford to subject their web sites to formal usability testing in special labs."
        Perkowitz & Etzioni [52]: Adaptive web sites: an AI challenge.
      "Web personalization is much over-rated and mainly used as a poor excuse for not designing a navigable website."
        Nielsen [47]: Personalization is over-rated.
      "Personalization costs. ... You’re more likely to get a good return on your efforts ... by fixing other problems, such as difficulty in locating content."
        Lighthouse on the Web [36], quoting from Mainspring and User Interface Engineering
    • Can other results be transferred?
      Research on adaptive educational software since ~ 1970
      - usually, user control is helpful for learning; adaptive interfaces are particularly helpful for novices
      - interfaces changing over time: difficult to learn
      - adaptive presentation (more info depending on user knowledge) improves comprehension and reduces reading time
      - adaptive link annotation
        - can reduce the number of visited pages + learning time
        - encourages novices to navigate non-sequentially
        - enables users to rate the difficulty of a page better
    • Can other results be transferred? (contd.)
      - adaptive link ordering improves user performance in information search tasks
      - but an unstable order of options is confusing for novices, so hiding is better for novices
      - for novices, direct guidance is useful ("next" link is the most popular choice)
      - the more users agree with the system’s suggestions, the better their test results
      (surveys in [11,12])
    • Further factors affecting subjective satisfaction
      - user control (general guideline for software development)
      - must match user’s interests at the moment
      - users don’t want extra work: "paradox of the active user"
      - users don’t like to be recognized too soon
      - users want to be anonymous, at least at certain times
      - users want openness / disclosure
      - people don’t want relationships with corporations, but with other people
      - be specific without being exclusive
      - consider information structure on Web (non-monetary rewards better than differential pricing)
      Respect the user!
    • Pattern Evaluation from the Business Perspective
      Myra Spiliopoulou, HHL
    • User Satisfaction & Business Success
      A company operating a Web site should care to create value for its (prospective) customers
      => If there is no value for the users, they will not buy and they will not come again.
      => If the users/customers are not satisfied, they will not buy and/or they will not come again.
      => User/customer satisfaction is a prerequisite for winning them to the company.
      Winning means
      • Conversion: The user becomes a customer.
      • Retention: The customer stays loyal.
    • User Satisfaction: Modelling
      Indicators that require interaction with the user
      • Interactivity
      • Ease of use
      • Pleasing environment, entertaining environment
      • Multiple navigation metaphors
      • ...
      • Value creation, as perceived by the user
      Indicators that can be measured/approximated without user interaction
      • Pages per visitor
      • Duration of stay
      • Visitors per page [?0]
      • Response time [?0]
    • User Satisfaction: Computation
      • Identification of a set of satisfaction indicators
      • Design of an appropriate questionnaire
      • Presentation of the questionnaire to a representative user sample
      • Analysis of the responses
      • Conclusions on the impact of the correlations among the satisfaction indicators
    • User Satisfaction: An experiment
      The study of Eighmey [21]
      • Factors creating user satisfaction
        – Ease of use
        – Information utility of the presented content
        – Attractiveness of the presentation metaphor
        – ...
      • Experimental settings for the evaluation of a set of commercial sites
        – Mapping of the factors on a questionnaire
        – Establishment of a group of representative users
        – Experimentation on a local computer pool (in vitro)
      • Statistical analysis of the user responses
      • Ranking of the factors by importance
    • The findings of [21] are:
      • Quality of the presentation metaphor: Entertainment when accessing the site plays the most important role.
      • Information utility: The amount of information made available is the second most important factor.
      Further findings: The web sites tested did not make a strong and useful connection with the interests of the study participants and did not succeed in creating a context and a sense of community needed to build a continuing relationship with web site users.
    • M. Sp[…] captures five years of anti-customer-satisfaction reports in the question:
      Is Customer Satisfaction Irrelevant?
      Answer: Customer measurement systems should be revisited.
    • User Satisfaction & Business Success
      • User/customer satisfaction is a prerequisite for a web-site's success.
      • User/customer satisfaction does not imply a web-site's success.
      Because
      – The goal of a web-site is not to make users happy.
      – The goal of a web-site is to contribute to business success.
    • User Satisfaction & Business Success
      • Awareness
      • Contact
      • Conversion
        – Abandonment instead of conversion
      • Retention
        – Attrition instead of retention
      How can these concepts be translated into indicators computable upon customer data?
    • Business Success: From the viewpoint of the site
      • Site efficiency [20]
        – Number of page requests
        – Duration of site visits
      • Site quality [?0]
        – Response time
        – Supported navigation modes
        – Discoverability
        – Accessibility
        – Pages per visitor
        – Visitors per page
    • Business Success: From hits to loyal customers
      The model of Berthon et al. [?]
      [Figure: site users divided into short-time visitors, active investigators, customers and loyal customers]
    • The user types in [?]
      – Short-time visitors: stay in the site for a very short time
      – Active investigators: stay in the site longer and access many pages
      – Customers: perform a purchase
      – Loyal customers: customers that re-visit the site
      • The distinction between short-time visitors and active investigators is based on
        – duration of stay
        – number of page requests
      • The notion of customer is well-defined from the business perspective.
      • If re-visits can be traced, loyalty can be measured.
    • Modelling ambiguities
      • A user that performs only a few page requests can be
        – a short-time visitor
        – an experienced customer
      • A page request can target
        – a page
        – a frameset
        – Do all requests count?
      • Is a customer that returns but makes no purchase still loyal?
      Actionability of the model: If contact efficiency is 20% and conversion efficiency is 2%, what should be done?
    • Business Success: From hits to loyal customers
      Contact & conversion efficiency of pages
      Page: invocation of a static URL or script
      Target page: page whose invocation corresponds to the fulfillment of the site's goal
        – Sale
        – Product ordering
        – Retrieval of a single document from an archive
        – ...
      Action page: page whose invocation is a prerequisite for reaching a target page
        – Product inspection
        – Query towards an archive or on-line catalog
        – ...
    • Term refinement by Spiliopoulou & Pohle [?]
      Active investigator: user accessing an action page
      Customer: active investigator accessing a target page
      Then
      • Contact efficiency of a page:
          (sessions containing the page) / (all sessions)
      • Conversion efficiency of a page towards a target page over a group of connecting paths:
          (sessions containing a connecting path) / (sessions containing the action page)
    • The methodology of [?] to measure and improve contact and conversion efficiency of pages
      I. Specification of the action and target pages as abstract concepts in a service-based concept hierarchy
      II. Discovery of frequent navigation patterns involving action pages
      III. Discovery of frequent (and less frequent) patterns leading to target pages
      IV. Pattern visualization to identify the pages at which the confidence drops (the users abandon the path)
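Contact and conversion efficiency can be computed from sessionized log data roughly as follows. This sketch assumes the ratio-of-sessions reading of the two measures, and it simplifies "connecting path" to mean that the target page appears anywhere after the action page in the same session; the page names and sessions are invented.

```python
def contact_efficiency(page, sessions):
    """Share of all sessions that contain the page."""
    return sum(1 for s in sessions if page in s) / len(sessions)


def conversion_efficiency(action, target, sessions):
    """Among the sessions containing the action page, the share in which the
    target page is reached afterwards (any path counts as connecting)."""
    with_action = [s for s in sessions if action in s]
    if not with_action:
        return 0.0
    converted = sum(
        1 for s in with_action if target in s[s.index(action) + 1:]
    )
    return converted / len(with_action)


# Hypothetical sessions: "search" is the action page, "order" the target page.
sessions = [
    ["home", "search", "product", "order"],
    ["home", "search"],
    ["home", "product"],
    ["search", "product", "order"],
    ["home"],
]

print(contact_efficiency("search", sessions))
print(conversion_efficiency("search", "order", sessions))
```

Restricting the connecting paths (for example to paths of bounded length or through a given intermediate page) would only change the test applied to the part of the session after the action page.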
    • Business Success: From hits to customers
      Micro-conversion rates by Lee et al. [3?]
      Four steps until the purchase of a product
      1) Product impression: seeing the hyperlink leading to a product
      2) Click through: following the link to the product
      3) Basket placement: selecting the product for purchase
      4) Purchasing the product
      and metrics for them
      – look-to-click rate: from 1) product impression to 2) click through
      – click-to-basket rate: from 2) click through to 3) basket placement
      – basket-to-buy rate: from 3) basket placement to 4) purchase
      – look-to-buy rate: from 1) product impression to 4) purchase
    • The methodology of [3?] to monitor site effectiveness
      I. Identification of three aspects of a site for web merchandizing
        – Merchandizing cues: techniques for presenting and grouping products to motivate purchases
        – Shopping metaphors: means offered to the shoppers for finding products of interest
        – Web design features: site layout
      II. Problem decomposition
        1. Classifying hyperlinks by their merchandizing purposes
        2. Measuring and analysing traffic across those hyperlinks
        3. Attributing the effectiveness of a hyperlink to merchandizing cues, shopping metaphor or design features, using visualization techniques on starfield displays
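The four micro-conversion rates follow directly from the counts of the four steps. The sketch below is an illustration with invented counts for a single product; the function and key names are assumptions.

```python
def micro_conversion_rates(impressions, clickthroughs, basket_placements, purchases):
    """Rates between consecutive purchase steps, plus the overall look-to-buy rate."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "look_to_click": ratio(clickthroughs, impressions),
        "click_to_basket": ratio(basket_placements, clickthroughs),
        "basket_to_buy": ratio(purchases, basket_placements),
        "look_to_buy": ratio(purchases, impressions),
    }


# Hypothetical step counts for one product.
rates = micro_conversion_rates(
    impressions=1000, clickthroughs=100, basket_placements=20, purchases=5
)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

When no step count is zero, the look-to-buy rate equals the product of the three step rates, so a low overall rate can be traced back to the step where most users are lost.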
    • Business Success: Customer Loyalty
      Loyalty is more than a site re-visitation. It relates to new purchases and their
      • Recency
      • Frequency
      • Monetary value
      Loyalty contributes to the customer's lifetime value.
    • Business Success: Customer loyalty
      Customer involvement by J. Lee et al. [33]
      Factors affecting customer loyalty
      • Trust
      • Transaction costs
      which in turn are affected by
        – Comprehensive information that suffices for a purchase decision
        – Shared value in the form of common beliefs among customers
        – Communication among customers and store
        – Uncertainty on the product quality
        – Specificity of the store
        – Number of competitors
    • Hypotheses (tested with questionnaires)
      + Comprehensive information, shared value and communication affect trust positively.
      + Trust has a positive impact on customer loyalty.
      - Transaction costs have a negative impact on customer loyalty.
      - Trust reduces transaction costs.
      + Uncertainty and number of competitors increase transaction costs.
      - Specificity affects transaction costs negatively.
      and after the first set of experiments
      + Specificity has a positive impact on trust.
    • The distinction between low and high involvement groups showed that
      HIGH group
      • Specificity has no impact on trust.
      + Specificity affects transaction costs negatively.
      LOW group
      + Uncertainty has a positive impact on trust.
      - The number of competitors decreases transaction costs.
      + Shared value has a positive impact on trust.
      indicating that the factors affecting customer loyalty (site re-visits) differ among the two groups.
    • Business Success: From hits to loyal customers
      The e-metrics of NetGenesis [1?]
      – Factors: What should be disseminated by the measures?
      – Framework: What is the basis of the analysis?
      – Formulae: What should be measured and how?
      as a result of an interview-based study with 20 successful e-companies [1?]
    • Business Success: From hits to loyal customers
      E-Metrics: Factors [1?]
      When measuring site (and business) success, marketers consider
      • Awareness
      • Acquisition vs abandonment
      • Conversion vs attrition
      • Retention vs churn
    • Business Success: From hits to loyal customers
      E-Metrics: Framework [1?]
      There is no agreed-upon definition of most factors
        – Is the basis of the analysis a user, a session, a page request, a page impression or a hit?
        – What is a session?
        – When does a user become a customer?
        – When is a customer assumed to have attrited?
        – How is loyalty defined?
      Thus, a company-internal definition is necessary.
    • Business Success: From hits to loyal customers
      E-Metrics: Framework [1?]
      Example: The behaviour of a loyal customer in terms of
      • visit duration
      • number of visits during a period of time
      • pages visited each time
      is fundamentally different for
        – customers that make purchases in a retail store
        – customers that plan a major purchase, e.g. of one durable product (contract, car)
        – cooperation partners in a B2B setting
    • Business Success: From hits to loyal customers
      E-Metrics: Formulae [1?]
      A large set of metrics is proposed, including
      • stickiness
      • slipperiness
      • focus
      of parts of a site. The identification and monitoring of
      • optimal paths
      is further suggested.
    • Business Success: From hits to loyal customers
      E-Metrics: Formulae [1?]
      Implications
      • stickiness, slipperiness and focus for parts of a site
        => The notion of site-part must be properly defined and disseminated to the data analysis software.
      • Monitoring of optimal paths
        => The monitoring of optimal paths must be implemented somehow.
        => The impact of the site structure must be understood and made explicit.
    • Business success and the role of KDD
      1. Where should success metrics be applied upon?
      • The whole population: User populations are rarely uniform.
      • A user/customer: Scalability might be an issue, speed also.
      • A group of users/customers: It is essential to distinguish among user/customer groups, e.g. in terms of
        – experience
        – interests
        – demographics and lifecycle value
        – behaviour
    • Business success and the role of KDD
      2. How should the metrics be computed?
      • Mapping of statistical measures (accuracy, inter-cluster distance, confidence, support) on business measures
      • Incorporation of computation prerequisites into the mining core
      • Impact of the site structure
    • • A user contacts a site in a sequence of sessions.
        – The time inside a session plays a role.
        – The elapsed time among sessions plays a role.
        – The volatility of Web and population plays a role.
      • CRM observes both the individual sessions of a user and the whole lifecycle of the user.
        – Both must be supported in a seamless way.
        – The associated information must be integrated and exploited.
      • The web-site structure affects everything.
        – Ordering and repetition are important.
        – If optimal paths are specified, suboptimal ones must be quantified and monitored.
        – The behavioural patterns are affected by the site structure.
    • Personalization and Privacy
    • Personalization and privacy
      What is privacy?
      "The right to be let alone." Warren & Brandeis [65]
      includes
      - limits on the government’s power to interfere with personal decisions
      - physical privacy: limits on others’ ability to learn things about a person by accessing their property
      - information privacy: the "right to control information about ourselves"
    • Personalization and privacy
      Why is privacy a central concern for personalization?
      (1) Adapting to a person requires data on that person
      (2) The legal side: not all data may be collected/used
      (3) The commercial side: "The Internet industry is built on trust between businesses and their customers - and privacy is the number one ingredient in trust."
          TrustE: How does Online Privacy Impact Your Bottom Line? [62]
    • What are the dangers to privacy?
      basic: data are correctly and legally transferred and stored, but embody "knowledge about a person"
        -> exacerbated by user ignorance (cf. widespread confusion or ignorance about what a cookie is: Ackerman, Cranor, & Reagle [1])
      technical: data are corrupted during entry, transfer, or storage
      security: data are intercepted
      unethical practices: data are used for novel purposes, sold to third parties, ...
    • What data are transmitted during Web usage?
      transferred by the browser
      - IP address
      - domain name (-> organization)
      - referrer address
      - platform: browser type and version
      - query strings, form fill-ins
      other technologies
      - cookies
      - globally unique identifiers
      - web bugs
    • User concerns about privacy
      User concerns about privacy vary
      - with respect to their severity: e.g., 27% marginally concerned, 56% pragmatic majority, 17% privacy fundamentalists (Ackerman et al. [1])
      - with the kind of data, e.g., credit card no. -> ... -> name -> ... -> email address -> ... -> favorite TV show (Ackerman et al. [1])
      - depending on whether personal identity or profiling information is disclosed (Spiekermann, Grossklags, & Berendt [57])
    • How to protect privacy I: general
      - avoid the generation of data ("data parsimony")
      - try to protect generated data
      How to protect privacy II: agents and methods
      - state / law (German/European main stance)
      - parties to the transaction / market self-governance (US main stance)
      - users / technology
    • What to protect: Data in relation to persons (personally identifiable data)
      person-related data: "Jane Doe plays football."
      person-relatable data: "The person is a male American famous tennis player, and will soon marry a famous German tennis player."
      Note: IP addresses are at least person-relatable!
    • German / EU legal basics I (German laws, EU directive 95/46/EC)
      person-related data may only be collected with the informed consent (opt-in!) about
      - who: who collects the data
      - what for: for what purpose
      - how much: quality and amount necessary for the purpose
      usage that deviates from any of these 3 is illegal
      informed consent: anything that is not explicitly allowed is forbidden (the greater the risk, the more detail must be explained)
      rights against the state -> rights against other private parties
    • German / EU legal basics II
      analysis / research:
      - person-related data must be anonymized such that they cannot be related back to the person
      - aggregate into groups >= 10
      - if necessary, original data can be stored by a trustee
    • German / EU legal basics III: Implications for personalization
      Legend: [+] = legal, [-] = illegal, [?] = controversial
      [+] analyzing non-person-relatable web usage data
      [+] using results to personalize a web page based on the current user’s current session
      [-] using results to personalize based on past sessions
      [-] using results to send unsolicited snail/e-mail
      [?] cookies: web site must also function without cookies; problematic if user is unaware of cookie setting
      [?] P3P: is the delegation of my privacy preferences to a computer program still an expression of my human will?
      Privacy statements must be opt-in (cf. software licence agreements: "I agree")
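The "aggregate into groups >= 10" rule for analysis and research can be sketched as a small-group suppression step. This is only one possible reading of the slide, not a statement of what the law requires: the function name `aggregate_usage`, the record layout (user identifier, page), and the choice to count distinct users per page are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 10  # threshold named on the slide


def aggregate_usage(records, min_group_size=MIN_GROUP_SIZE):
    """Aggregate (user_id, page) records into per-page user counts.

    Only pages visited by at least `min_group_size` distinct users are
    reported; smaller groups are suppressed, and no user identifiers
    appear in the result.
    """
    users_per_page = defaultdict(set)
    for user_id, page in records:
        users_per_page[page].add(user_id)
    return {
        page: len(users)
        for page, users in users_per_page.items()
        if len(users) >= min_group_size
    }


if __name__ == "__main__":
    # Hypothetical data: 12 users visit /a, only 3 users visit /b.
    records = [(f"u{i}", "/a") for i in range(12)]
    records += [(f"u{i}", "/b") for i in range(3)]
    print(aggregate_usage(records))  # /b is suppressed
```

Suppressing small groups alone does not guarantee that results cannot be related back to a person; it only illustrates the grouping threshold.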
    • Further rights under EU directive 95/46/EC
      - individuals can inspect and correct their data, and they can disallow usage
      - no data transfer to countries with inadequate data protection
      - independent institutions oversee data protection in each member country
      EU - US: Safe Harbor Principles (July 2000)
      - American enterprises that collect and process data from the EU voluntarily subject themselves to principles that correspond to the EU standard
      - FTC control
    • US legal basics
      - 4th Amendment: limits government’s power to search people, their homes, and their papers; trespass laws, ...
      - government must not reveal medical histories etc.
      - government must not reveal certain information: Privacy Act, Driver’s Privacy Protection Act, ...
      - bars on third parties: video stores, lawyers, doctors, ..., "disclosure of private facts" tort
      -> these apply only to a narrow range of revelations
      Information privacy gets protection from the law of contract -> applies only to parties to a contract.
      Third parties: conflict between information privacy and freedom of speech? (Volokh [64])
    • Self-governance: privacy seals
      US privacy seals: TRUSTe, BBBOnline, CPA Web Trust (www.truste.org, www.bbb-online.org, www.cpawebtrust.org)
      "Unlike self-regulation, self-governance requires that industry not act alone; rather, it must work in concert with existing laws and develop best practices. Self-governance relies on an informed marketplace that demands disclosure of privacy practices and the opportunity to exercise choice about how information is used. Government must fulfill its role by enforcing existing laws and assuring that industry continue to work toward ubiquitous adoption of best practices. Media and advocacy groups act as a collective conscience by scrutinizing the development of self-governance to assure it remains true to its underlying principles and goals and meets the challenges of evolving technologies and business models." (TRUSTe Online Privacy Resource Book [63])
    • Self-governance: How does TRUSTe work?
      - contract signed between TRUSTe and the Web site
      - allows TRUSTe to address users’ privacy concerns regardless of their citizenship or the TRUSTe licensee
      - users can bring their complaints to TRUSTe Watchdog
      - Web site is required to respond quickly, TRUSTe can begin to mediate a resolution
        - change in company practice, or in posted policy
        - third party audit
        - refer case to government authorities, usually FTC
      "Companies acting outside the bounds of the TRUSTe license agreement may be in breach of contract and be subject to revocation of the TRUSTe seal. This may be the most powerful [TRUSTe] tool, because of ... public relations consequences ..."
      Note: not compliant with EU standards (opt-out possible)
    • Technology: anonymizing web usage
      - Proxy: user 1, user 2, ... -> GET x.html -> proxy server -> web server
        Ex.: www.anonymizer.com
        problem: proxy usually knows users’ identities
      - Mix networks; Crowds: user -> encryption -> encryption -> encryption -> web server
        Ex.: www.freedom.net, www.onion-router.net, anon.inf.tu-dresden.de, www.research.att.com/projects/crowds
      - Pseudonymity and identity management
    • P3P: The Platform for Privacy Preferences
      - P3P enables Web sites to express their privacy practices in a standard format that can be retrieved automatically and interpreted easily by user agents
      - an initiative of the World Wide Web Consortium (W3C) in conjunction with many industry partners including Microsoft
      - P3P allows the user agent to warn the user, or block communication altogether, if a selected Web site’s privacy policy does not comply with user preferences
    • P3P’s XML elements include (can be extended):
      - who: <RECIPIENT> ours, delivery, same, other-recipient, unrelated, public
      - what for: <PURPOSE> current, admin, develop, customization, tailoring, pseudo-analysis, pseudo-decision, individual-analysis, individual-decision, contact, telemarketing, history, other-purpose
      - how much: categories physical, online, uniqueid, purchase, financial, computer, navigation, interactive, demographic, content, state, political, health, preference, location, government, other-category [67]
    • Problem: "soft" interaction, communication flow
      In an experimental online store, agent Luci posed 56 questions in a sales dialogue.
      - 35-50% of questions were non-legitimate / irrelevant
      - still, 54% of participants answered at least 98% of the questions, although they had previously agreed to the sale and further usage of their data (Spiekermann et al. [57])
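How a user agent might compare a site's declared practices with user preferences can be sketched as follows. This is a simplified illustration, not the P3P matching procedure from the specification: the policy fragment omits the XML namespace and all other elements a real policy would carry, and the names `POLICY_XML`, `declared_values`, and `violations` are hypothetical. Only the `<PURPOSE>` and `<RECIPIENT>` elements and their values are taken from the slide.

```python
import xml.etree.ElementTree as ET

# Simplified, un-namespaced policy fragment (an assumption for illustration).
POLICY_XML = """
<STATEMENT>
  <PURPOSE><current/><telemarketing/></PURPOSE>
  <RECIPIENT><ours/><unrelated/></RECIPIENT>
</STATEMENT>
"""


def declared_values(statement, element_name):
    """Collect the child tag names of e.g. <PURPOSE> or <RECIPIENT>."""
    parent = statement.find(element_name)
    if parent is None:
        return set()
    return {child.tag for child in parent}


def violations(policy_xml, allowed_purposes, allowed_recipients):
    """Return (purposes, recipients) that the user has not allowed."""
    statement = ET.fromstring(policy_xml)
    bad_purposes = declared_values(statement, "PURPOSE") - allowed_purposes
    bad_recipients = declared_values(statement, "RECIPIENT") - allowed_recipients
    return bad_purposes, bad_recipients


if __name__ == "__main__":
    purposes, recipients = violations(
        POLICY_XML,
        allowed_purposes={"current", "admin"},
        allowed_recipients={"ours"},
    )
    if purposes or recipients:
        # A user agent could warn the user or block communication here.
        print("policy conflicts with preferences:", purposes, recipients)
```

A non-empty result corresponds to the case where the user agent would warn the user or block communication.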
    • Communication flow and "obedient" answering
      Example questions by question category:
      - peip: Do you consider yourself photogenic?
      - pepr: How important are trend models to you?
      - u: When do you usually take photos?
      - pd: What zoom do you want?
      [Figure: communication flow across the stages Q categories, top 10, product info, more product info, prod.inf./purch.opt., purchase] (Berendt [5])
    • Conclusions
      powerful methods and software for personalization are available, but many questions remain, including:
      - what are the relevant criteria of evaluation? how can they be combined?
      - privacy concerns:
        - recommendations welcomed by users
        - but: user reveals information, may not get a good return
          ... if there are not enough other users
          ... if that user is judged as an "uninteresting case"
        => often, more data are collected than put to good use
    • (Some) future directions
      - anonymity, pseudonymity, and personalization
      - "opt-in with incentives": permission marketing
      - more explicit user modeling
        - involve the user in diagnosis, provide for opt-out / opt-in
        - integration with other data easier (XML etc.)
      - changing roles of participants:
        - computers: knowledge organization and representation (-> personalization + information architecture design)
        - users interact more strongly with one another (Web communities)
        - service providers offer "real" personal assistants
    • References PKDD 2001 Tutorial: “KDD for Personalization” [R-1] Ê ÖÒ × ½ ÖÑ Ò¸ ź˺¸ Ö ÒÓÖ¸ ĺ º¸ Ò Âº Ê Ð º ÈÖ Ú Ý Ò ¹ ÓÑÑ Ö Ü Ñ Ò Ò Ù× Ö × Ò Ö Ó× Ò ÔÖ Ú Ý ÔÖ Ö Ò ×º ÁÒ ÈÖÓ Ò × Ó Ø Å ÓÒ Ö Ò ÓÒ Ð ØÖÓÒ ÓÑÑ Ö º × Ð×Ó ØØÔ »»ÛÛÛºÖ × Ö º Øغ ÓÑ»Ð Ö ÖÝ»ØÖ×»ÌÊ×» » º » ¾ ʺ ÖÛ Ð¸ º ÖÛ Ð¸ Ò Îº ÈÖ × º ØÖ ÔÖÓ Ø ÓÒ Ð ÓÖ Ø Ñ ÓÖ Ò Ö Ø ÓÒ Ó Ö ÕÙ ÒØ Ø Ñ× Ø׺ ÁÒ ÈÖÓ Ò × Ó Ø À È Ö ÓÖÑ Ò Ø ÅÒÒ ÏÓÖ × ÓÔ¸ ÈÙ ÖØÓ Ê Ó¸ ½ º ¿ Ð Ò¸ ƺº ´¾¼¼¼µº À ÐÔ Ò Ô ÓÔÐ Ò Û Ø Ø Ý ÓÒ³Ø ÒÓÛº ÓÑÑÙÒ Ø ÓÒ× Ó Ø Å¸ ¿ ´ µ¸ ½º Ð Ò¸ ƺº¸ ÓÓи º¸ À ¸ º¸ Â Ò ¸ º¸ à ÐÐݸ º¸ Ä Ò¸ ˺º¸ ÄÓ × ¸ ĺ¸ È Ö ¸ ˺ º¸ Ë Ú ¹ÃÒ Ô× Ð ¸ Ⱥ¸ Ò Ë ÓÖ ¸ º ´¾¼¼¼µº Ê Ð Ú Ò Ú Ö×Ù× ÐÓ Ð ÓÒØ ÜØ Ò ÐÝ× × × Ø ÖÑ ×Ù ×Ø ÓÒ Ú ×º ÁÒ ÈÖÓ Ò × Ó Ø Ø Ì ÜØ Ê ØÖ Ú Ð ÓÒ Ö Ò ÌÊ º Ï × Ò ØÓÒ¸ º º Ö Ò Ø¸ º ´¾¼¼½µº ÍÒ Ö×Ø Ò Ò Û Ù× Ø Ö ÒØ Ð Ú Ð× Ó ×ØÖ Ø ÓÒ Ó Ö× Ò Ò Ò Ú ×Ù Ð × Ò × ÕÙ Ò ×º ÁÒ ÏÓÖ Ò ÆÓØ × Ó Ø ÏÓÖ × ÓÔ Ï Ã ¾¼¼½ ÅÒÒ ÄÓ Ø ÖÓ×× ÐÐ Ù×ØÓÑ Ö ÌÓÙ ÔÓ ÒØ× ¸ Ø Å ËÁ à ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÃÒÓÛÐ × ÓÚ ÖÝ Ò Ø ÅÒÒ º Ë Ò Ö Ò × Ó¸ ¸ Ù Ù×غ Ö Ò Ø¸ º ´¾¼¼¼µº Ï Ù× Ñ Ò Ò ¸ × Ø × Ñ ÒØ ×¸ Ò Ø ×ÙÔÔÓÖØ Ó Ò Ú Ø ÓÒº ÁÒ ÏÓÖ Ò ÆÓØ × Ó Ø ÏÓÖ × ÓÔ Ï ÅÒÒ ÓÖ ¹ ÓÑÑ Ö ÐÐ Ò × Ò ÇÔÔÓÖØÙÒ Ø ×º Ø Å ËÁ à ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÃÒÓÛÐ × ÓÚ ÖÝ Ò Ø ÅÒÒ º ´ÔÔº ¿ ¿µº Ó×ØÓÒ¸ Å ¸ Ù Ù×غ º Ö Òظ º ÅÓ × Ö¸ ź ËÔ Ð ÓÔÓÙÐÓÙ¸ Ò Âº Ï ÐØ× Ö º Å ×ÙÖ Ò Ø ÙÖ Ý Ó × ×× ÓÒ Þ Ö× ÓÖ Ï Ù× Ò ÐÝ× ×º ÁÒ ÈÖÓ Ò × Ó Ø Ï Å Ò Ò ÏÓÖ × ÓÔ Ø Ø Ö×Ø ËÁ Å ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ Ø ÅÒÒ ¸ Ó¸ ¾¼¼½º Ö Ò Ø¸ º Ò ËÔ Ð ÓÔÓÙÐÓÙ¸ ź ´¾¼¼¼µº Ò ÐÝ× × Ó Ò Ú Ø ÓÒ Ú ÓÙÖ Ò Û × Ø × ÒØ Ö Ø Ò ÑÙÐØ ÔÐ Ò ÓÖÑ Ø ÓÒ ×Ý×Ø Ñ׺ Ì ÎÄ ÂÓÙÖÒ Ð¸ ¸ º Ⱥ ÖØ ÓÒ¸ ĺ º È Øظ Ò Êº ̺ Ï Ø×ÓÒº Ì ÛÓÖÐ Û Û × Ò Ú ÖØ × Ò Ñ ÙѺ ÂÓÙÖÒ Ð Ó Ú ÖØ × Ò Ê × Ö ¸ ¿ ´½µ ¿ ¸½ ºÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-2]
    • Ê ÖÒ × ½¼ ÖÙ× ÐÓÚ× Ý¸ Ⱥ ´½ µº Å Ø Ó × Ò Ø Ò ÕÙ × Ó ÔØ Ú ÝÔ ÖÑ º Í× Ö ÅÓ Ð Ò Ò Í× Ö¹ ÔØ ÁÒØ Ö Ø ÓÒ¸ ¸ ½¾ º ½½ ÖÙ× ÐÓÚ× Ý¸ Ⱥ ´½ µº ÒØ Ø Ò ÕÙ × ÓÖ ÔØ Ú ÝÔ ÖÑ º ÁÒ º Æ ÓÐ × Ò Âº Å Ý Ð ´ ׺µ¸ ÁÒØ ÐÐ ÒØ ÝÔ ÖØ ÜØ Ú Ò Ø Ò ÕÙ × ÓÖ Ø ÏÓÖÐ Ï Ï ¸ ÖÐ Ò ËÔÖ Ò Öº ½¾ ¿¼º ½¾ ÖÙ× ÐÓÚ× Ý¸ Ⱥ Ò ÐÙÒ ¸ º ´½ µº ×ØÙ Ý Ó Ù× Ö ÑÓ Ð × ÐÒ ÒÒÓØ Ø ÓÒ Ò Ù Ø ÓÒ Ð ÝÔ ÖÑ º ÂÓÙÖÒ Ð Ó ÍÒ Ú Ö× Ð ÓÑÔÙØ Ö Ë Ò ¸ ¸ ¾ º ½¿ ÖÖÓÐи ºź Ò ÊÓ××ÓÒ¸ ź º ´½ µº Ì Ô Ö ÓÜ Ó Ø Ø Ú Ù× Öº ÁÒ ÂºÅº ÖÖÓÐÐ ´ ºµ¸ ÁÒØ Ö Ò Ì ÓÙ Ø Ó Ò Ø Ú ×Ô Ø× Ó ÀÙÑ Ò¹ ÓÑÔÙØ Ö ÁÒØ Ö Ø ÓÒº Ñ Ö ¸ Å ÅÁÌ ÈÖ ×׺ ½ ÓÓРݸ ʺ ´¾¼¼¼µº Ï Í× Å Ò Ò × ÓÚ ÖÝ Ò ÔÔÐ Ø ÓÒ Ó ÁÒØ Ö ×Ø Ò È ØØ ÖÒ× ÖÓÑ Ï Ø º ÍÒ Ú Ö× ØÝ Ó Å ÒÒ ×ÓØ ¸ ÙÐØÝ Ó Ø Ö Ù Ø Ë ÓÓÐ È º º ×× ÖØ Ø ÓÒº ØØÔ »»ÛÛÛº ׺ÙÑÒº Ù»Ö × Ö »Û × Ø»Ô Ô Ö×»ÖÛ Ø × ×ºÔ× ½ ÊÓ ÖØ ÓÓРݸ Ñ× ÅÓ × Ö¸ Ò Â Ô ËÖ Ú ×Ø Ú º Ø ÔÖ Ô Ö Ø ÓÒ ÓÖ Ñ Ò Ò ÛÓÖÐ Û Û ÖÓÛ× Ò Ô ØØ ÖÒ׺ ÂÓÙÖÒ Ð Ó ÃÒÓÛÐ Ò ÁÒ ÓÖÑ Ø ÓÒ ËÝ×Ø Ñ׸ ½´½µ¸ ½ º ½ ź ÙØÐ Ö Ò Âº ËØ ÖÒ º ¹Ñ ØÖ × Ù× Ò ×× Ñ ØÖ × ÓÖ Ø Ò Û ÓÒÓÑݺ Ì Ò Ð Ö ÔÓÖظ Æ Ø Ò ×× ÓÖÔº¸ ØØÔ »»ÛÛÛºÒ Ø Òº ÓÑ» Ñ ØÖ ×¸ ¾¼¼¼º ×× Ø ÂÙÐÝ ¾¾¸ ¾¼¼½º ½ ź × Ô Ò Ò º à ÖÝÔ ×º Ë Ð Ø Ú Å Ö ÓÚ ÑÓ Ð× ÓÖ ÔÖ Ø Ò Ï ¹Ô ×× ×º Ì Ò Ð Ê ÔÓÖØ ¼¼¹¼ ¸ ÍÒ Ú Ö× ØÝ Ó Å Ò ××ÓØ ¸ ¾¼¼¼º ½ Ñ ØÖÓÚ ¸ κ¸ Ë Ð ¸ º¸ Ò ÖÒ ¸ Ⱥ ´¾¼¼¼µº ÁÒÚÓÐÚ Ò Ø Ð ÖÒ Ö Ò ÒÓ× × ÔÓØ ÒØ Ð× Ò ÔÖÓ Ð Ñ׺ ÁÒ Ï ÁÒ ÓÖÑ Ø ÓÒ Ì ÒÓÐÓ × Ê × Ö ¸ Ù Ø ÓÒ Ò ÓÑÑ Ö º ÅÓÒØÔ ÐÐ Ö¸ Ö Ò ¸ Šݺ ½ Ö ØÚ » » Ó Ø ÙÖÓÔ Ò È ÖÐ Ñ ÒØ Ò Ø ÓÙÒ Ð Ó ¾ Ç ØÓ Ö ½ ÓÒ Ø ÔÖÓØ Ø ÓÒ Ó Ò Ú Ù Ð× Û Ø Ö Ö ØÓ Ø ÔÖÓ ×× Ò Ó Ô Ö×ÓÒ Ð Ø Ò ÓÒ Ø Ö ÑÓÚ Ñ ÒØ Ó ×Ù Ø º ØØÔ »» ÙÖÓÔ º Ùº ÒØ» ÓÑÑ» ÒØ ÖÒ Ð Ñ Ö Ø» Ò»Ñ » Ø ÔÖÓػРۻ Ò Üº ØÑ ¾¼ º Ö Þ Ò º Ù ÖÝ Òº Ì ×Ø Ò Û ×Ø × Ò Ò ÔÖÓÑÓØ ÓÒ Ð ÓÒØ Òغ ÂÓÙÖÒ Ð Ó Ú ÖØ × Ò Ê × Ö ¸ ¿ ´¾µ ½¸ ½ ºÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-3] Ê ÖÒ × ¾½ º Ñ Ýº ÈÖÓ Ð Ò Ù× Ö Ö ×ÔÓÒ× × ØÓ ÓÑÑ Ö ÂÓÙÖÒ Ð Ó Ú ÖØ × Ò Ê × Ö ¸ ¿ ´¾µ ¸ ÐÛ × Ø ×º ŠݹÂÙÒ ½ º ¾¾ º Ù¸ º Ù Þ ¸ Ò Ãº º À ÑÑÓÒ º Å Ò Ò Ò Ú Ø ÓÒ ×ØÓÖÝ ÓÖ Ö ÓÑÑ Ò Ø ÓÒº ÁÒ ÈÖÓ º ¾¼¼¼ ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÁÒØ ÐÐ ÒØ Í× Ö ÁÒØ Ö ×¸ Æ Û ÇÖÐ Ò׸ ¾¼¼¼º ¾¿ Ö Ò Ð¸ ˺ ´¾¼¼¼µº Ø × Æ Ø ÓÒº Ì Ø Ó ÈÖ Ú Ý Ò Ø ¾½×Ø ÒØÙÖݺ Ë ×ØÓÔÓи Ç³Ê ÐÐݺ ¾ Ϻ ÙÐ Ò Äº Ë Ñ Ø¹Ì Ñ º Ê ÓÑÑ Ò Ö 
×Ý×Ø ×Ñ× × ÓÒ Ò Ú Ø ÓÒ Ô Ø ØÙÖ ×º ÁÒ ¾ ¸ Ë Ò Ö Ò× × Ó¸ ¸ Ù º ¾¼¼½º ź ¾ º Ý Ö¹Ë ÙÐÞ¸ ź À ×Ð Ö¸ Ò Åº  Һ Ù×ØÓÑ Ö ÔÙÖ × Ò Ò ÑÓ Ð ÔÔÐ ØÓ Ö ÓÑÑ Ò Ö ×Ý×Ø Ñ׺ ÁÒ ¾ ¸ Ë Ò Ö Ò× × Ó¸ ¸ Ù º ¾¼¼½º ź ¾ ¹Àº À Ò¸ º à ÖÝÔ ×¸ κ ÃÙÑ Ö Ò º ÅÓ × Öº ÀÝÔ Ö Ö Ô × ÐÙ×Ø Ö Ò ÒÀ ¹ Ñ Ò× ÓÒ Ð Ø Ë Ø× ËÙÑÑ ÖÝ Ó Ê ×ÙÐØ׺ Á ÙÐÐ Ø Ò Ó Ø Ì Ò Ð ÓÑÑ ØØ ÓÒ Ø Ò Ò Ö Ò ¸ ´¾½µ ½¸ ½ º ¾ ̺ ÂÓ Ñ׸ º Ö Ø ¸ Ò Ìº Å Ø Ðк Ï Û Ø Ö ÌÓÙÖ Ù ÓÖ Ø ÏÓÖÐ Ï Ï º ÁÒ ÈÖÓ Ò × Ó Ø ½ Ø ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÖØ Ð ÁÒØ ÐÐ Ò ¸ Æ ÓÝ ¸ Â Ô Ò¸ ½ º ¾ ÃÓ × ¸ º¸ º ÃÓ Ò Ñ ÒÒ Ò Ïº ÈÓ Ð ´¾¼¼½µº È Ö×ÓÒ Ð Þ ÝÔ ÖÑ ÔÖ × ÒØ Ø ÓÒ Ø Ò ÕÙ × ÓÖ ÑÔÖÓÚ Ò ÓÒÐ Ò Ù×ØÓÑ Ö Ö Ð Ø ÓÒ× Ô׺ ÌÓ ÔÔ Ö Ò Ì ÃÒÓÛÐ Ò Ò Ö Ò Ê Ú Ûº ØØÔ »»ÛÛÛº ׺٠º Ù» Ó × »Ô Ô Ö×»¾¼¼½¹Ã ʹ Ó × ºÔ ¾ ʺ ÃÓ Ú ¸ º Å × Ò ¸ ź ËÔ Ð ÓÔÓÙÐÓÙ¸ Ò Âº ËÖ Ú ×Ø Ú ¸ ØÓÖ׺ à ³¾¼¼½ ÏÓÖ × ÓÔ Ï Ã ³¾¼¼½¸ Ë Ò Ö Ò× × Ó¸ ¸ Ù º ¾¼¼¼º ź ¿¼ ʺ ÃÓ Ú ¸ ź ËÔ Ð ÓÔÓÙÐÓÙ¸ Ò Âº ËÖ Ú ×Ø Ú ¸ ØÓÖ׺ à ³¾¼¼¼ ÏÓÖ × ÓÔ Ï Ã ³¾¼¼¼ ÓÒ Ï Å Ò Ò ÓÖ ¹ ÓÑÑ Ö ÐÐ Ò × Ò ÇÔÔÓÖØÙÒ Ø ×¸ Ó×ØÓÒ¸ Å ¸ Ù º ¾¼¼¼º ź ¿½ ÃÓØÛ ¸ ú ´½ µº ËÙÖÚ Ý Ï × Ø È Ö×ÓÒ Ð Þ Ø ÓÒº Ý Ö Ú ÓÖ Ê × Ö ÒØ Öº ØØÔ »»ÛÛÛº Óº ÓÑ» ÓÖÙÑ×» Ú ÓÖ» Ø»×ÙÖÚ Ý º ØÑк ¿¾ ʺ ÃÙ Ð Òº ÁÒ ÓÖÑ Ø ÓÒ×Ñ Ö Ø Ò Ò ÙÒ Ê × Ò Ö ÃÓÑÑ ÖÞ Ð × ÖÙÒ ÚÓÒ Ï ×× Òº ¾ Ø ÓÒ¸ ½ ºÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-4]
    • Ê ÖÒ × ¿¿ º Ä ¸ º à Ѹ Ò Âº º ÅÓÓÒº Ï Ø Ñ × ÒØ ÖÒ Ø Ù× Ö× Ú × Ø Ý Ö ×ØÓÖ × Ò Ý × Ò ØÓÖ× ÓÖ Ù×ØÓÑ Ö ÐÓÝ ÐØݺ ÁÒ ÈÖÓ º ÀÁ³¾¼¼¼¸ Ô × ¿¼ ¿½¾¸ Ì À Ù ¸ Æĸ ¾¼¼¼º ź ¿ º Ä ¸ ź ÈÓ Ð × ¸ º Ë ÓÒ Ö ¸ ʺ ÀÓ ¸ Ò Ëº ÓÑÓÖݺ Ò ÐÝ× × Ò Ú ×Ù Ð Þ Ø ÓÒ Ó Ñ ØÖ × ÓÖ ÓÒÐ Ò Ñ Ö Ò Þ Ò º ÁÒ ¿ ¸ Ô × ½¾¿ ½¿ º ¾¼¼¼º ¿ Àº Ä ÖÑ Òº Ä Ø Þ Ò ÒØ Ø Ø ×× ×Ø× Ï ÖÓÛ× Ò º ÁÒ ÈÖÓ Ò ×Ó Ø ½ ÁÒØ ÖÒ Ø ÓÒ Ð ÂÓ ÒØ ÓÒ Ö Ò ÓÒ ÖØ Ð ÁÒØ ÐÐ Ò ¸ ÅÓÒØÖ Ð¸ Ò ¸½ º ¿ Ä Ø ÓÙ× ÓÒ Ø Ï º ´¾¼¼¼µº È Ö×ÓÒ Ð Þ Ø ÓÒ Ó × ÓÒ ¹ÓÒ¹ÓÒ Û Ø Ö Ð Øݺ ØØÔ »»ÛÛÛº× ÓÖ Û Ð Öº ÓÑ» ÝÔ » ÝÔ ¼º ØÑк ¿ º ʺ Ϻ Ä Ò¸ ˺ º ÐÚ Ö Þ¸ Ò º ÊÙ Þº ÓÐÐ ÓÖ Ø Ú Ö ÓÑÑ Ò Ø ÓÒ Ú ÔØ Ú ××Ó Ø ÓÒ ÖÙÐ Ñ Ò Ò º ÁÒ ¿¼ ¸ ¾¼¼¼º ¿ º Ä Ù¸ Ϻ À×Ù¸ Ò ºÅ º ××Ó Ø ÓÒ ÖÙÐ × Û Ø ÑÙÐØ ÔÐ Ñ Ò ÑÙÑ ×ÙÔÔÓÖØ׺ ÁÒ ÈÖÓ Ò ×Ó Ø Å ËÁ à ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÃÒÓÛÐ × ÓÚ ÖÝ ² Ø Å Ò Ò ´Ã ¹ ¸ ÔÓ×Ø Öµ¸ Ë Ò Ó¸ ¸½ º ¿ ºÅ × Ò Ò Åº ËÔ Ð ÓÔÓÙÐÓÙ¸ ØÓÖ׺ Ú Ò × Ò Ï Í× ÅÒÒ Ò Í× Ö ÈÖÓ Ð Ò ÈÖÓ Ò ×Ó Ø Ï Ã ³ ÏÓÖ × ÓÔ¸ ÄÆ Á ½ ¿ º ËÔÖ Ò Ö Î ÖÐ ¸ ÂÙÐÝ ¾¼¼¼º ¼ ÅÓ × Ö¸ º¸ ÓÓРݸ ʺ¸ Ò ËÖ Ú ×Ø Ú ¸ º ´¾¼¼¼µº ÙØÓÑ Ø Ô Ö×ÓÒ Ð Þ Ø ÓÒ × ÓÒ Û Ù× ÑÒÒ º ÓÑÑÙÒ Ø ÓÒ× Ó Ø Å¸ ¿´ µ¸ ½ ¾ ½ ½º ½ º ÅÓ × Ö¸ ʺ ÓÓРݸ Ò Âº ËÖ Ú ×Ø Ú º Ö Ø Ò ÔØ Ú Û × Ø × Ø ÖÓÙ Ù× ¹ × ÐÙ×Ø Ö Ò Ó ÍÊÄ׺ ÁÒ Á ÃÒÓÛÐ Ò Ø Ò Ò Ö Ò ÏÓÖ × ÓÔ ´Ã ³ µ¸ ½ º ¾ º ÅÓ × Ö¸ Àº ¸ ̺ ÄÙÓ¸ Ò Åº Æ Û º ÁÑÔÖÓÚ Ò Ø Ø Ú Ò ×× Ó ÓÐÐ ÓÖ Ø Ú ÐØ Ö Ò ÓÒ ÒÓÒÝÑÓÙ× Û Ù× Ø º ¾¼¼½º ¿ º ÅÓ × Ö¸ Àº ¸ ̺ ÄÙÓ¸ ź Æ Û ¸ º ËÙÒ¸ Ò Âº Ï ÐØ× Ö º × ÓÚ ÖÝ Ó Ö Ø Ù× ÔÖÓ Ð × ÓÖ Û Ô Ö×ÓÒ Ð Þ Ø ÓÒº ÁÒ ¿¼ ¸ ¾¼¼¼ºÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-5] Ê ÖÒ × º ÅÓ × Ö¸ Àº ¸ ̺ ÄÙÓ¸ º ËÙ¸ Ò Âº Ùº ÁÒØ Ö Ø Ò Û Ù× Ò ÓÒØ ÒØ Ñ Ò Ò ÓÖ ÑÓÖ Ø Ú Ô Ö×ÓÒ Ð Þ Ø ÓÒº ÁÒ ¹ ÓÑÑ Ö Ò Ï Ì ÒÓÐÓ ×¸ ÚÓÐÙÑ ½ Ó ÄÆ Ëº ËÔÖ Ò Ö Î ÖÐ ¸ Ë Ôغ ¾¼¼¼º º ÅÓ × Ö¸ Àº ¸ ̺ ÄÙÓ¸ ź Æ Û º Ø Ú Ô Ö×ÓÒ Ð Þ Ø ÓÒ × ÓÒ ××Ó Ø ÓÒ ÖÙÐ × ÓÚ ÖÝ ÖÓÑ Ï Ù× Ø º Ì Ò Ð Ê ÔÓÖØ ¼½¹¼½¼¸ ÔÖØÑ ÒØ Ó ÓÑÔÙØ Ö Ë Ò ¸ È ÙÐ ÍÒ Ú Ö× Øݺ º º Æ × º ÃÒÓÛ Ø Ý Ù×ØÓÑ Ö ÖÓÑ Ù×ØÓÑ Ö ÒÓÛÐ ØÓ Ù×ØÓÑ Ö Ò× Øº Ï Ø Ô Ô Ö¸ ÒØÙÖ ¸ ÒØÙÖ ÊÅ ÈÓÖØ Ð ØØÔ »»ÛÛÛº ÖÑÔÖÓ Øº ÓѸ ×× Ø ÂÙÐÝ ¾¾¸ ¾¼¼½º Æ Ð× Ò¸ º ´½ µº È Ö×ÓÒ Ð Þ Ø ÓÒ × ÇÚ Ö¹Ê Ø º Ð ÖØ ÓÜ ÓÖ Ç ØÓ Ö ¸ ½ º ØØÔ »»ÛÛÛºÙ× Øº ÓÑ» Ð ÖØ ÓÜ» ½¼¼ º ØÑÐ Æ Ð× Ò¸ º 
´¾¼¼½µº Í× Ð ØÝ Å ØÖ ×º Ð ÖØ Óܸ  ÒÙ ÖÝ ¾½¸ ¾¼¼½º ØØÔ »»ÛÛÛºÙ× Øº ÓÑ» Ð ÖØ ÓÜ»¾¼¼½¼½¾½º ØÑÐ Æ Ð× Ò¸ º ´¾¼¼¼µº × ÒÒ Ï Í× Ð ØÝ Ì ÈÖ Ø Ó Ë ÑÔÐ Øݺ Æ ÛÊ Ö× ÈÙ Ð × Ò º ¼ º Ç Ö ÓÚ Ò Ëº ÎÙ Ø º Ö Ö ×× ÓÒ¹ × ÔÔÖÓ ÓÖ × Ð Ò ¹ÙÔ Ô Ö×ÓÒ Ð Þ Ö ÓÑÑ Ò Ö ×Ý×Ø Ñ× Ò ¹ ÓÑÑ Ö º ÁÒ ¿¼ ¸ ¾¼¼¼º ½ È Ö Òظ ˺¸ ÅÓ × Ö¸ º¸ Ò ÄÝØ Ò Ò¸ ˺ ´¾¼¼½µº Ò ÔØ Ú ÒØ ÓÖ Û ÜÔÐÓÖ Ø ÓÒ × ÓÒ ÓÒ ÔØ Ö Ö ×º ÁÒ ÈÖÓ Ò × Ó Ø Ø ÁÒØ ÖÒ Ø ÓÒ Ð ÓÒ Ö Ò ÓÒ ÀÙÑ Ò ÓÑÔÙØ Ö ÁÒØ Ö Ø ÓÒºÆ Û ÇÖÐ Ò׸ Ä ¸ Ù Ù×غ ¾ ź È Ö ÓÛ ØÞ Ò Çº ØÞ ÓÒ º ÔØ Ú Û ×Ø × ÙØÓÑ Ø ÐÐÝ ×ÝÒØ × Þ Ò Û Ô ×º ÁÒ ÈÖÓ º Ó Á»Á Á³ ¸ Ô × ¾ ¿¾¸ ½ º ¿ ź È Ö ÓÛ ØÞ Ò Çº ØÞ ÓÒ º ÔØ Ú Û × Ø ×º ËÔ Ð Ë Ø ÓÒ Ó Ø ÓÑÑÙÒ Ø ÓÒ× Ó Å ÓÒ È Ö×ÓÒ Ð Þ Ø ÓÒ Ì ÒÓÐÓ × ÛØ Ø ÅÒÒ ¸ ¿´ µ ½ ¾ ½ ¸ Ù ¾¼¼¼º È ÖÓÐÐ ¸ Ⱥ¸ È Ø ÓÛ¸ º¸ Ò Ê Ó¸ ʺ Ë Ð ÖÓÑ ×ÓÛ³× Ö ÜØÖ Ø Ò Ù× Ð ×ØÖÙ ØÙÖ × ÖÓÑ Ø Û º ÁÒ ÀÁ¹ ¸ Î Ò ÓÙÚ ÖºÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-6]
    • Ê ÖÒ × Ë Ò ÖÑ Ò¸ º ´½ µº × ÒÒ Ø Í× Ö ÁÒØ Ö ºÊ Ò ¸Å ×ÓÒ¹Ï ×Рݺ ź ËÔ Ò ÓÐ Ò º Ù×ØÓÑ Ö Ñ ×ÙÖ Ñ ÒØ ×Ý×Ø Ñ× ÓÔÔÓÖØÙÒ Ø × ÓÖ ÑÔÖÓÚ Ñ Òغ Ï Ø Ô Ô Ö¸ ÅÂË ××Ó Ø ×¸ ÒØÙÖ ÊÅ ÈÓÖØ Ð ØØÔ »»ÛÛÛº ÖÑÔÖÓ Øº ÓѸ ×× Ø ÂÙÐÝ ¾¾¸ ¾¼¼½º ËÔ ÖÑ ÒÒ¸ ˺¸ ÖÓ×× Ð ×¸ º¸ Ò Ö Ò Ø¸ º ´¾¼¼½µº ËØ Ø ÔÖ Ú Ý ÔÖ Ö Ò × Ú Ö×Ù× ØÙ Ð Ú ÓÙÖ Ò ÒÚ ÖÓÒÑ ÒØ× Ö Ð ØÝ º ÁÒ ÈÖÓ Ò × Ö º ÁÒØ ÖÒ Ø ÓÒ Ð Ò Ì ÙÒ Ï ÖØ× Ø× Ò ÓÖÑ Ø ¾¼¼½º Ù × ÙÖ ¸ ÖÑ Òݸ Ë ÔØ Ñ Öº ź ËÔ Ð ÓÔÓÙÐÓÙ Ò º ÈÓ Ð º Ø Ñ Ò Ò ÓÖ Ñ ×ÙÖ Ò Ò ÑÔÖÓÚ Ò Ø ×Ù ×× Ó Û × Ø ×º ÁÒ Êº ÃÓ Ú Ò º ÈÖÓÚÓ×ظ ØÓÖ׸ ÂÓÙÖÒ Ð Ó Ø Å Ò Ò Ò ÃÒÓÛÐ × ÓÚ Öݸ ËÔ Ð Á××Ù ÓÒ ¹ ÓÑÑ Ö ¸ ÚÓÐÙÑ ¸ Ô × ½½ º ÃÐÙÛ Ö Ñ ÈÙ Ð × Ö׸  Һ¹ ÔÖº ¾¼¼½º ËØ ÖÒ ¸ º ´½ µº Ó ÝÓÙ ÒÓÛ Ñ º Ï Å ×Ø Ö Å Þ Ò ¸ ÔÖ Ð¸ ½ º ØØÔ »»ÛÛÛº Óº ÓÑ» Ö Ú »Û Ù× Ò ××»¼ ¼½ Ù×ØÓÑ Öº ØÑÐ ¼ ̺ ËÙÐÐ Ú Òº Ê Ò Ö ÖÖ Ø ÓÒ ÔÖÓÔÓ× Ð ÓÖ Ò Ö ÒØ Ð Ò ÐÝ× × Ó Û × ÖÚ Ö ÐÓ Ð ×º ÁÒ ÈÖÓ º Ó Ø Ï ÓÒ Ö Ò ³ ¸ ½ º ½ Ì ÓÑÔ×ÓÒ¸ ź ´½ µº Ê ×Ø Ö Î × ØÓÖ× Ö ÔÓÖØ Ð³× ×Ø Ö Ò º Ì ÁÒ Ù×ØÖÝ ËØ Ò Ö ¸ ÂÙÒ ¸½ º ØØÔ »»ÛÛÛºØ ×Ø Ò Ö º ÓѺ Ù»Ñ ØÖ ×» ×ÔÐ Ý»¼¸½¾ ¿¸ ¼½¸¼¼º ØÑÐ ¾ ÌÖÙ×Ø º ´ÒÓ Ø µº ÀÓÛ Ó × ÈÖ Ú Ý ÁÑÔ Ø ÓÙÖ ÓØØÓÑ Ä Ò ØØÔ »»ÛÛÛºØÖÙ×Ø ºÓÖ » Ù×»ÔÙ ÓØØÓѺ ØÑÐ ¿ ÌÖÙ×Ø º ´¾¼¼¼µº ÌÖÙ×Ø ÇÒÐ Ò ÈÖ Ú Ý Ê ×ÓÙÖ ÓÓ º ØØÔ »»ÛÛÛºØÖÙ×Ø ºÓÖ » ÓÙØ»ÓÔÖ º Ó ÎÓÐÓ ¸ º ´¾¼¼¼µº È Ö×ÓÒ Ð Þ Ø ÓÒ Ò ÔÖ Ú Ýº ÓÑÑÙÒ Ø ÓÒ× Ó Ø Å¸ ¿´ µ¸ º Ï ÖÖ Ò¸ ˺ Ò Ö Ò ×¸ ĺ Ì Ö Ø Ó ÔÖ Ú Ýº À ÖÚ Ö Ä Û Ê Ú Û¸ ¸ ½ ¿º Ï Ö¸ º ´½ µº Ô ×Ó Ð ÖÒ Ö ÑÓ Ð Ò º Ó ÒØ Ú Ë Ò ¸ ¾¼¸ ½ ¾¿ º Ï¿ º Ì ÈÐ Ø ÓÖÑ ÓÖ ÈÖ Ú Ý ÈÖ Ö Ò × ½º¼ ´È¿È½º¼µ ËÔ Ø ÓÒº ØØÔ »»ÛÛÛºÛ¿ºÓÖ »ÌÊ»¾¼¼¼» ʹȿȹ¾¼¼¼½¾½ Ò ØØÔ »»ÛÛÛºÛ¿ºÓÖ »ÌʻȿȺÈà ¾¼¼½ ÌÙØÓÖ Ð Ã ÓÖ È Ö×ÓÒ Ð Þ Ø ÓÒ [R-7]